[Gluster-users] afr-self-heald.c:479:afr_shd_index_sweep
Paolo Margara
paolo.margara at polito.it
Thu Jun 29 07:38:03 UTC 2017
Hi all,
for the upgrade I followed this procedure:
* put node in maintenance mode (ensure no clients are active)
* yum versionlock delete glusterfs*
* service glusterd stop
* yum update
* systemctl daemon-reload
* service glusterd start
* yum versionlock add glusterfs*
* gluster volume heal vm-images-repo full
* gluster volume heal vm-images-repo info
On each server, after the update, I ran 'gluster --version' to confirm the
new version; at the end I ran 'gluster volume set all cluster.op-version 30800'.
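To double-check the resulting op-version on each node, something like the
following should work (assuming the default glusterd working directory):
  grep operating-version /var/lib/glusterd/glusterd.info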
Today I tried to manually kill a brick process on a non-critical
volume; after that I see the following in the log:
[2017-06-29 07:03:50.074388] I [MSGID: 100030] [glusterfsd.c:2454:main]
0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version
3.8.12 (args: /usr/sbin/glusterfsd -s virtnode-0-1-gluster --volfile-id
iso-images-repo.virtnode-0-1-gluster.data-glusterfs-brick1b-iso-images-repo
-p
/var/lib/glusterd/vols/iso-images-repo/run/virtnode-0-1-gluster-data-glusterfs-brick1b-iso-images-repo.pid
-S /var/run/gluster/c779852c21e2a91eaabbdda3b9127262.socket --brick-name
/data/glusterfs/brick1b/iso-images-repo -l
/var/log/glusterfs/bricks/data-glusterfs-brick1b-iso-images-repo.log
--xlator-option
*-posix.glusterd-uuid=e93ebee7-5d95-4100-a9df-4a3e60134b73 --brick-port
49163 --xlator-option iso-images-repo-server.listen-port=49163)
I've checked after the restart and indeed the 'entry-changes' directory
is now created, but why did stopping the glusterd service not also stop
the brick processes?
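For reference, the bricks run as separate glusterfsd daemons, so they can
still be listed after 'service glusterd stop' with something like:
  pgrep -af glusterfsd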
Now, how can I recover from this issue? Is restarting all brick
processes enough?
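For what it's worth, the per-node sequence I have in mind is only a sketch,
using vm-images-repo as the example volume:
* gluster volume status vm-images-repo   (note the PID of this node's brick)
* kill <brick PID>
* gluster volume start vm-images-repo force   (respawns the killed brick)
* gluster volume heal vm-images-repo info   (wait until no entries are
  pending before moving on to the next node)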
Greetings,
Paolo Margara
On 28/06/2017 18:41, Pranith Kumar Karampuri wrote:
>
>
> On Wed, Jun 28, 2017 at 9:45 PM, Ravishankar N
> <ravishankar at redhat.com> wrote:
>
> On 06/28/2017 06:52 PM, Paolo Margara wrote:
>
> Hi list,
>
> yesterday I noticed the following lines in the glustershd.log file:
>
> [2017-06-28 11:53:05.000890] W [MSGID: 108034]
> [afr-self-heald.c:479:afr_shd_index_sweep]
> 0-iso-images-repo-replicate-0: unable to get index-dir on
> iso-images-repo-client-0
> [2017-06-28 11:53:05.001146] W [MSGID: 108034]
> [afr-self-heald.c:479:afr_shd_index_sweep]
> 0-vm-images-repo-replicate-0:
> unable to get index-dir on vm-images-repo-client-0
> [2017-06-28 11:53:06.001141] W [MSGID: 108034]
> [afr-self-heald.c:479:afr_shd_index_sweep]
> 0-hosted-engine-replicate-0:
> unable to get index-dir on hosted-engine-client-0
> [2017-06-28 11:53:08.001094] W [MSGID: 108034]
> [afr-self-heald.c:479:afr_shd_index_sweep]
> 0-vm-images-repo-replicate-2:
> unable to get index-dir on vm-images-repo-client-6
> [2017-06-28 11:53:08.001170] W [MSGID: 108034]
> [afr-self-heald.c:479:afr_shd_index_sweep]
> 0-vm-images-repo-replicate-1:
> unable to get index-dir on vm-images-repo-client-3
>
> Digging into the mailing list archive I've found another user with a
> similar issue (the thread was '[Gluster-users] glustershd: unable to
> get index-dir on myvolume-client-0'); the suggested solution was to
> verify whether the /<path-to-backend-brick>/.glusterfs/indices
> directory contains all of these sub-directories: 'dirty',
> 'entry-changes' and 'xattrop', and if some of them do not exist,
> simply create them with mkdir.
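> For example, for the brick shown below, presumably something like the
> following would be enough (path taken from that brick, only a sketch):
>   mkdir /data/glusterfs/brick1a/hosted-engine/.glusterfs/indices/entry-changes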
>
> In my case the 'entry-changes' directory is missing from all the
> bricks on all the servers:
>
> /data/glusterfs/brick1a/hosted-engine/.glusterfs/indices/:
> total 0
> drw------- 2 root root 55 Jun 28 15:02 dirty
> drw------- 2 root root 57 Jun 28 15:02 xattrop
>
> /data/glusterfs/brick1b/iso-images-repo/.glusterfs/indices/:
> total 0
> drw------- 2 root root 55 May 29 14:04 dirty
> drw------- 2 root root 57 May 29 14:04 xattrop
>
> /data/glusterfs/brick2/vm-images-repo/.glusterfs/indices/:
> total 0
> drw------- 2 root root 112 Jun 28 15:02 dirty
> drw------- 2 root root 66 Jun 28 15:02 xattrop
>
> /data/glusterfs/brick3/vm-images-repo/.glusterfs/indices/:
> total 0
> drw------- 2 root root 64 Jun 28 15:02 dirty
> drw------- 2 root root 66 Jun 28 15:02 xattrop
>
> /data/glusterfs/brick4/vm-images-repo/.glusterfs/indices/:
> total 0
> drw------- 2 root root 112 Jun 28 15:02 dirty
> drw------- 2 root root 66 Jun 28 15:02 xattrop
>
> I've recently upgraded gluster from 3.7.16 to 3.8.12 with the rolling
> upgrade procedure and I hadn't noticed this issue prior to the update;
> on another system upgraded with the same procedure I haven't
> encountered this problem.
>
> Currently all VM images appear to be OK, but prior to creating the
> 'entry-changes' directories I would like to ask if this is still the
> correct procedure to fix this issue
>
>
> Did you restart the bricks after the upgrade? That should have
> created the entry-changes directory. Can you kill the brick and
> restart it and see if the dir is created? Double-check from the
> brick logs that you're indeed running 3.8.12: "Started running
> /usr/local/sbin/glusterfsd version 3.8.12" should appear when the
> brick starts.
>
>
> Please note that if you are going the route of killing and restarting,
> you need to do it in the same way you did the rolling upgrade: you need
> to wait for the heal to complete before you kill the bricks on the
> other nodes. But before you do this, it is better to look at the logs
> or confirm the steps you used for the upgrade.
>
>
>
> -Ravi
>
>
> and if this problem could have affected the heal operations that
> occurred meanwhile.
>
> Thanks.
>
>
> Greetings,
>
> Paolo Margara
>
> --
> Pranith
--
LABINF - HPC at POLITO
DAUIN - Politecnico di Torino
Corso Castelfidardo, 34D - 10129 Torino (TO)
phone: +39 011 090 7051
site: http://www.labinf.polito.it/
site: http://hpc.polito.it/