[Gluster-users] afr-self-heald.c:479:afr_shd_index_sweep
Paolo Margara
paolo.margara at polito.it
Thu Jun 29 14:42:58 UTC 2017
On 29/06/2017 16:27, Pranith Kumar Karampuri wrote:
>
>
> On Thu, Jun 29, 2017 at 7:48 PM, Paolo Margara
> <paolo.margara at polito.it <mailto:paolo.margara at polito.it>> wrote:
>
> Hi Pranith,
>
> I'm using this guide
> https://github.com/nixpanic/glusterdocs/blob/f6d48dc17f2cb6ee4680e372520ec3358641b2bc/Upgrade-Guide/upgrade_to_3.8.md
>
>     Definitely my fault, but I think it would be better to specify
>     somewhere that restarting the service is not enough, simply because
>     with many other services restarting is sufficient.
>
> The steps include the following command before installing 3.8 as per
> the page
> (https://github.com/nixpanic/glusterdocs/blob/f6d48dc17f2cb6ee4680e372520ec3358641b2bc/Upgrade-Guide/upgrade_to_3.8.md#online-upgrade-procedure-for-servers)
> So I guess we have it covered?
As I said it's my fault ;-)
>
> * Stop all gluster services using the below command or through your
> favorite way to stop them.
> * killall glusterfs glusterfsd glusterd
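Before running the package upgrade it is worth confirming that the killall actually worked; a minimal sketch (the helper function is mine, not from the upgrade guide):

```shell
#!/bin/sh
# After `killall glusterfs glusterfsd glusterd`, verify that no gluster
# processes survive before proceeding with the package upgrade.
gluster_stopped() {
    # pgrep -x matches the exact process name; the function succeeds
    # only when none of the three daemons is found.
    ! pgrep -x glusterfsd >/dev/null 2>&1 \
        && ! pgrep -x glusterfs >/dev/null 2>&1 \
        && ! pgrep -x glusterd >/dev/null 2>&1
}

if gluster_stopped; then
    echo "all gluster processes stopped; safe to upgrade"
else
    echo "gluster processes still running; do not upgrade yet" >&2
fi
```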
>
>
>
>     Now I'm restarting every brick process (and waiting for the heal
>     to complete); this is fixing my problem.
>
> Many thanks for the help.
>
>
> Greetings,
>
> Paolo
>
>
> On 29/06/2017 13:03, Pranith Kumar Karampuri wrote:
>> Paolo,
>> Which document did you follow for the upgrade? We can fix
>> the documentation if there are any issues.
>>
>> On Thu, Jun 29, 2017 at 2:07 PM, Ravishankar N
>> <ravishankar at redhat.com <mailto:ravishankar at redhat.com>> wrote:
>>
>> On 06/29/2017 01:08 PM, Paolo Margara wrote:
>>>
>>> Hi all,
>>>
>>> for the upgrade I followed this procedure:
>>>
>>> * put the node in maintenance mode (ensure no clients are active)
>>> * yum versionlock delete glusterfs*
>>> * service glusterd stop
>>> * yum update
>>> * systemctl daemon-reload
>>> * service glusterd start
>>> * yum versionlock add glusterfs*
>>> * gluster volume heal vm-images-repo full
>>> * gluster volume heal vm-images-repo info
>>>
>>> on each server; each time I ran 'gluster --version' to
>>> confirm the upgrade, and at the end I ran 'gluster volume set
>>> all cluster.op-version 30800'.
>>>
>>> Today I've tried to manually kill a brick process on a non
>>> critical volume, after that into the log I see:
>>>
>>> [2017-06-29 07:03:50.074388] I [MSGID: 100030]
>>> [glusterfsd.c:2454:main] 0-/usr/sbin/glusterfsd: Started
>>> running /usr/sbin/glusterfsd version 3.8.12 (args:
>>> /usr/sbin/glusterfsd -s virtnode-0-1-gluster --volfile-id
>>> iso-images-repo.virtnode-0-1-gluster.data-glusterfs-brick1b-iso-images-repo
>>> -p
>>> /var/lib/glusterd/vols/iso-images-repo/run/virtnode-0-1-gluster-data-glusterfs-brick1b-iso-images-repo.pid
>>> -S /var/run/gluster/c779852c21e2a91eaabbdda3b9127262.socket
>>> --brick-name /data/glusterfs/brick1b/iso-images-repo -l
>>> /var/log/glusterfs/bricks/data-glusterfs-brick1b-iso-images-repo.log
>>> --xlator-option
>>> *-posix.glusterd-uuid=e93ebee7-5d95-4100-a9df-4a3e60134b73
>>> --brick-port 49163 --xlator-option
>>> iso-images-repo-server.listen-port=49163)
>>>
>>> I've checked after the restart and indeed now the directory
>>> 'entry-changes' is created, but why has stopping the glusterd
>>> service not also stopped the brick processes?
>>>
>>
>> Just stopping, upgrading and restarting glusterd does not
>> restart the brick processes; you would need to kill all
>> gluster processes on the node before upgrading. After
>> upgrading, when you restart glusterd, it will automatically
>> spawn the rest of the gluster processes on that node.
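For completeness, when only some bricks are down after an upgrade, `gluster volume start <vol> force` (re)starts the missing brick processes without disturbing healthy ones; a rough sketch looping over all volumes (the wrapper function is an illustration, not from this thread):

```shell
#!/bin/sh
# Restart any dead brick processes, volume by volume.
# `gluster volume start <vol> force` only starts bricks that are not
# currently running; bricks that are already up are left alone.
restart_bricks() {
    for vol in $(gluster volume list); do
        gluster volume start "$vol" force
    done
}

# Example usage (run on the node with the dead bricks):
# restart_bricks
```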
>>
>>>
>>> Now how can I recover from this issue? Is restarting all the
>>> brick processes enough?
>>>
>> Yes, but ensure there are no pending heals like Pranith
>> mentioned.
>> https://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.7/
>> lists the steps for upgrading to 3.7, but the steps mentioned
>> there are similar for any rolling upgrade.
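The "no pending heals" check between nodes can be scripted; a sketch, assuming the usual `gluster volume heal <vol> info` output format ("Number of entries: N" per brick) and an arbitrary 10-second polling interval:

```shell
#!/bin/sh
# Block until `gluster volume heal <vol> info` reports zero pending
# entries on every brick of the volume.
wait_for_heal() {
    vol=$1
    # Each brick section of the output ends with "Number of entries: N";
    # keep polling while any N is non-zero.
    while gluster volume heal "$vol" info | grep -q 'Number of entries: [1-9]'; do
        echo "heal still in progress on $vol; waiting..."
        sleep 10
    done
    echo "no pending heals on $vol"
}

# Example usage before moving on to the next node:
# wait_for_heal vm-images-repo
```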
>>
>> -Ravi
>>
>>>
>>> Greetings,
>>>
>>> Paolo Margara
>>>
>>>
>>> On 28/06/2017 18:41, Pranith Kumar Karampuri wrote:
>>>>
>>>>
>>>> On Wed, Jun 28, 2017 at 9:45 PM, Ravishankar N
>>>> <ravishankar at redhat.com <mailto:ravishankar at redhat.com>> wrote:
>>>>
>>>> On 06/28/2017 06:52 PM, Paolo Margara wrote:
>>>>
>>>> Hi list,
>>>>
>>>> yesterday I noted the following lines into the
>>>> glustershd.log log file:
>>>>
>>>> [2017-06-28 11:53:05.000890] W [MSGID: 108034]
>>>> [afr-self-heald.c:479:afr_shd_index_sweep]
>>>> 0-iso-images-repo-replicate-0: unable to get
>>>> index-dir on
>>>> iso-images-repo-client-0
>>>> [2017-06-28 11:53:05.001146] W [MSGID: 108034]
>>>> [afr-self-heald.c:479:afr_shd_index_sweep]
>>>> 0-vm-images-repo-replicate-0:
>>>> unable to get index-dir on vm-images-repo-client-0
>>>> [2017-06-28 11:53:06.001141] W [MSGID: 108034]
>>>> [afr-self-heald.c:479:afr_shd_index_sweep]
>>>> 0-hosted-engine-replicate-0:
>>>> unable to get index-dir on hosted-engine-client-0
>>>> [2017-06-28 11:53:08.001094] W [MSGID: 108034]
>>>> [afr-self-heald.c:479:afr_shd_index_sweep]
>>>> 0-vm-images-repo-replicate-2:
>>>> unable to get index-dir on vm-images-repo-client-6
>>>> [2017-06-28 11:53:08.001170] W [MSGID: 108034]
>>>> [afr-self-heald.c:479:afr_shd_index_sweep]
>>>> 0-vm-images-repo-replicate-1:
>>>> unable to get index-dir on vm-images-repo-client-3
>>>>
>>>>         Digging into the mailing list archive I've found
>>>>         another user with a similar issue (the thread was
>>>>         '[Gluster-users] glustershd: unable to get
>>>>         index-dir on myvolume-client-0'); the suggested
>>>>         solution was to verify that the
>>>>         /<path-to-backend-brick>/.glusterfs/indices
>>>>         directory contains all these sub-directories:
>>>>         'dirty', 'entry-changes' and 'xattrop', and if
>>>>         some of them do not exist, simply create them
>>>>         with mkdir.
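The mkdir fix from that thread can be sketched as a small helper run against each backend brick root (the function name and example path are illustrative only):

```shell
#!/bin/sh
# Ensure the three indices sub-directories exist under a brick's
# .glusterfs directory. They are normally created by the brick process
# itself; this only fills in ones that are missing (e.g. 'entry-changes'
# after an upgrade).
ensure_indices() {
    brick=$1
    for d in dirty entry-changes xattrop; do
        if [ ! -d "$brick/.glusterfs/indices/$d" ]; then
            mkdir -p "$brick/.glusterfs/indices/$d"
            echo "created $brick/.glusterfs/indices/$d"
        fi
    done
}

# Example usage against one of the brick roots mentioned in this thread:
# ensure_indices /data/glusterfs/brick1a/hosted-engine
```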
>>>>
>>>>         In my case the 'entry-changes' directory is
>>>>         missing on all the bricks of all the servers:
>>>>
>>>> /data/glusterfs/brick1a/hosted-engine/.glusterfs/indices/:
>>>> total 0
>>>> drw------- 2 root root 55 Jun 28 15:02 dirty
>>>> drw------- 2 root root 57 Jun 28 15:02 xattrop
>>>>
>>>> /data/glusterfs/brick1b/iso-images-repo/.glusterfs/indices/:
>>>> total 0
>>>> drw------- 2 root root 55 May 29 14:04 dirty
>>>> drw------- 2 root root 57 May 29 14:04 xattrop
>>>>
>>>> /data/glusterfs/brick2/vm-images-repo/.glusterfs/indices/:
>>>> total 0
>>>> drw------- 2 root root 112 Jun 28 15:02 dirty
>>>> drw------- 2 root root 66 Jun 28 15:02 xattrop
>>>>
>>>> /data/glusterfs/brick3/vm-images-repo/.glusterfs/indices/:
>>>> total 0
>>>> drw------- 2 root root 64 Jun 28 15:02 dirty
>>>> drw------- 2 root root 66 Jun 28 15:02 xattrop
>>>>
>>>> /data/glusterfs/brick4/vm-images-repo/.glusterfs/indices/:
>>>> total 0
>>>> drw------- 2 root root 112 Jun 28 15:02 dirty
>>>> drw------- 2 root root 66 Jun 28 15:02 xattrop
>>>>
>>>>         I've recently upgraded gluster from 3.7.16 to
>>>>         3.8.12 with the rolling upgrade procedure; I
>>>>         hadn't noticed this issue prior to the update,
>>>>         and on another system upgraded with the same
>>>>         procedure I haven't encountered this problem.
>>>>
>>>>         Currently all VM images appear to be OK, but
>>>>         prior to creating the 'entry-changes'
>>>>         directories I would like to ask if this is
>>>>         still the correct procedure to fix this issue
>>>>
>>>>
>>>>     Did you restart the bricks after the upgrade? That
>>>>     should have created the entry-changes directory. Can
>>>>     you kill the brick and restart it and see if the dir is
>>>>     created? Double-check from the brick logs that you're
>>>>     indeed running 3.8.12: "Started running
>>>>     /usr/local/sbin/glusterfsd version 3.8.12" should
>>>>     appear when the brick starts.
>>>>
>>>>
>>>> Please note that if you are going the route of killing
>>>> and restarting, you need to do it in the same way you
>>>> did the rolling upgrade: wait for the heal to complete
>>>> before you kill the bricks on the other nodes. But
>>>> before you do this, it is better if you look at the
>>>> logs or confirm the steps you used for the upgrade.
>>>>
>>>>
>>>>
>>>> -Ravi
>>>>
>>>>
>>>>         and whether this problem could have affected
>>>>         the heal operations that occurred in the meantime.
>>>>
>>>> Thanks.
>>>>
>>>>
>>>> Greetings,
>>>>
>>>> Paolo Margara
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Pranith
>>
>> --
>> Pranith
>
>
>
>
> --
> Pranith