[Gluster-users] afr-self-heald.c:479:afr_shd_index_sweep

Pranith Kumar Karampuri pkarampu at redhat.com
Thu Jun 29 14:54:17 UTC 2017


On Thu, Jun 29, 2017 at 8:12 PM, Paolo Margara <paolo.margara at polito.it>
wrote:

> On 29/06/2017 16:27, Pranith Kumar Karampuri wrote:
>
>
>
> On Thu, Jun 29, 2017 at 7:48 PM, Paolo Margara <paolo.margara at polito.it>
> wrote:
>
>> Hi Pranith,
>>
>> I'm using this guide:
>> https://github.com/nixpanic/glusterdocs/blob/f6d48dc17f2cb6ee4680e372520ec3358641b2bc/Upgrade-Guide/upgrade_to_3.8.md
>>
>> Definitely my fault, but I think it would be better to state somewhere
>> that restarting the service is not enough, simply because with many other
>> services that is sufficient.
>>
> The steps include the following command before installing 3.8 as per the page
> (https://github.com/nixpanic/glusterdocs/blob/f6d48dc17f2cb6ee4680e372520ec3358641b2bc/Upgrade-Guide/upgrade_to_3.8.md#online-upgrade-procedure-for-servers).
> So I guess we have it covered?
>
> As I said it's my fault ;-)
>

Ah! sorry. Thanks for your mail!


>
>
>    - Stop all gluster services using the below command or through your
>    favorite way to stop them.
>    - killall glusterfs glusterfsd glusterd
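>
>    A quick sanity check after the killall (not part of the guide, just a
>    convenience) is to confirm that no gluster process is left running
>    before the packages are upgraded:
>
>    # should print nothing once all gluster processes are gone
>    ps aux | grep gluster | grep -v grep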
>
>
>
>> Now I'm restarting every brick process (and waiting for the heal to
>> complete); this is fixing my problem.
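>>
>> For reference, what I'm running on each node is roughly the following
>> (the volume name is just an example, and I adapt the PID for each brick):
>>
>> gluster volume status vm-images-repo        # note the PID of the old brick
>> kill <brick-pid>                            # stop the old brick process
>> gluster volume start vm-images-repo force   # respawn it with the new binary
>> gluster volume heal vm-images-repo info     # wait until no entries are pending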
>>
>> Many thanks for the help.
>>
>>
>> Greetings,
>>
>>     Paolo
>>
>> On 29/06/2017 13:03, Pranith Kumar Karampuri wrote:
>>
>> Paolo,
>>       Which document did you follow for the upgrade? We can fix the
>> documentation if there are any issues.
>>
>> On Thu, Jun 29, 2017 at 2:07 PM, Ravishankar N <ravishankar at redhat.com>
>> wrote:
>>
>>> On 06/29/2017 01:08 PM, Paolo Margara wrote:
>>>
>>> Hi all,
>>>
>>> for the upgrade I followed this procedure:
>>>
>>>    - put node in maintenance mode (ensure no client are active)
>>>    - yum versionlock delete glusterfs*
>>>    - service glusterd stop
>>>    - yum update
>>>    - systemctl daemon-reload
>>>    - service glusterd start
>>>    - yum versionlock add glusterfs*
>>>    - gluster volume heal vm-images-repo full
>>>    - gluster volume heal vm-images-repo info
>>>
>>> On each server I ran 'gluster --version' every time to confirm the
>>> upgrade; at the end I ran 'gluster volume set all cluster.op-version 30800'.
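>>>
>>> (In case it is useful, one way to double-check the op-version on each node
>>> afterwards is to look at glusterd's state file, e.g.:
>>>
>>> grep operating-version /var/lib/glusterd/glusterd.info
>>>
>>> which should show operating-version=30800 after the bump.)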
>>>
>>> Today I tried to manually kill a brick process on a non-critical volume;
>>> after that, in the log I see:
>>>
>>> [2017-06-29 07:03:50.074388] I [MSGID: 100030] [glusterfsd.c:2454:main]
>>> 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.8.12
>>> (args: /usr/sbin/glusterfsd -s virtnode-0-1-gluster --volfile-id
>>> iso-images-repo.virtnode-0-1-gluster.data-glusterfs-brick1b-iso-images-repo
>>> -p /var/lib/glusterd/vols/iso-images-repo/run/virtnode-0-1-glus
>>> ter-data-glusterfs-brick1b-iso-images-repo.pid -S
>>> /var/run/gluster/c779852c21e2a91eaabbdda3b9127262.socket --brick-name
>>> /data/glusterfs/brick1b/iso-images-repo -l
>>> /var/log/glusterfs/bricks/data-glusterfs-brick1b-iso-images-repo.log
>>> --xlator-option *-posix.glusterd-uuid=e93ebee7-5d95-4100-a9df-4a3e60134b73
>>> --brick-port 49163 --xlator-option iso-images-repo-server.listen-
>>> port=49163)
>>>
>>> I've checked after the restart and indeed the 'entry-changes' directory is
>>> now created, but why did stopping the glusterd service not also stop the
>>> brick processes?
>>>
>>>
>>> Just stopping, upgrading and restarting glusterd does not restart the
>>> brick processes; you would need to kill all gluster processes on the node
>>> before upgrading. After upgrading, when you restart glusterd, it will
>>> automatically spawn the rest of the gluster processes on that node.
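>>>
>>> So the per-node sequence is essentially (a rough sketch based on the
>>> above; adapt the package step to your setup):
>>>
>>> killall glusterfs glusterfsd glusterd   # stop ALL gluster processes, not just glusterd
>>> yum update                              # upgrade the packages
>>> service glusterd start                  # bricks and self-heal daemon respawn with the new version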
>>>
>>>
>>> Now how can I recover from this issue? Is restarting all brick processes
>>> enough?
>>>
>>> Yes, but ensure there are no pending heals, as Pranith mentioned.
>>> https://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.7/
>>> lists the steps for upgrading to 3.7, but the steps mentioned there are
>>> similar for any rolling upgrade.
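>>>
>>> A simple way to confirm nothing is pending before you move on to the next
>>> node (just a convenience loop, not from the upgrade guide):
>>>
>>> for vol in $(gluster volume list); do
>>>     echo "== $vol =="
>>>     gluster volume heal "$vol" info | grep 'Number of entries'
>>> done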
>>>
>>> -Ravi
>>>
>>>
>>> Greetings,
>>>
>>>     Paolo Margara
>>>
>>> On 28/06/2017 18:41, Pranith Kumar Karampuri wrote:
>>>
>>>
>>>
>>> On Wed, Jun 28, 2017 at 9:45 PM, Ravishankar N <ravishankar at redhat.com>
>>> wrote:
>>>
>>>> On 06/28/2017 06:52 PM, Paolo Margara wrote:
>>>>
>>>>> Hi list,
>>>>>
>>>>> yesterday I noticed the following lines in the glustershd.log log file:
>>>>>
>>>>> [2017-06-28 11:53:05.000890] W [MSGID: 108034]
>>>>> [afr-self-heald.c:479:afr_shd_index_sweep]
>>>>> 0-iso-images-repo-replicate-0: unable to get index-dir on
>>>>> iso-images-repo-client-0
>>>>> [2017-06-28 11:53:05.001146] W [MSGID: 108034]
>>>>> [afr-self-heald.c:479:afr_shd_index_sweep]
>>>>> 0-vm-images-repo-replicate-0:
>>>>> unable to get index-dir on vm-images-repo-client-0
>>>>> [2017-06-28 11:53:06.001141] W [MSGID: 108034]
>>>>> [afr-self-heald.c:479:afr_shd_index_sweep]
>>>>> 0-hosted-engine-replicate-0:
>>>>> unable to get index-dir on hosted-engine-client-0
>>>>> [2017-06-28 11:53:08.001094] W [MSGID: 108034]
>>>>> [afr-self-heald.c:479:afr_shd_index_sweep]
>>>>> 0-vm-images-repo-replicate-2:
>>>>> unable to get index-dir on vm-images-repo-client-6
>>>>> [2017-06-28 11:53:08.001170] W [MSGID: 108034]
>>>>> [afr-self-heald.c:479:afr_shd_index_sweep]
>>>>> 0-vm-images-repo-replicate-1:
>>>>> unable to get index-dir on vm-images-repo-client-3
>>>>>
>>>>> Digging into the mailing list archive I found another user with a
>>>>> similar issue (the thread was '[Gluster-users] glustershd: unable to get
>>>>> index-dir on myvolume-client-0'); the suggested solution was to verify
>>>>> that the /<path-to-backend-brick>/.glusterfs/indices directory contains
>>>>> all of these subdirectories: 'dirty', 'entry-changes' and 'xattrop', and
>>>>> if some of them do not exist, to simply create them with mkdir.
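>>>>>
>>>>> (As far as I understand it, the fix suggested there boils down to
>>>>> something like the following on each affected brick:
>>>>>
>>>>> mkdir /<path-to-backend-brick>/.glusterfs/indices/entry-changes
>>>>>
>>>>> but I'd like to confirm that before touching anything.)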
>>>>>
>>>>> In my case the 'entry-changes' directory is missing on all the bricks
>>>>> of all the servers:
>>>>>
>>>>> /data/glusterfs/brick1a/hosted-engine/.glusterfs/indices/:
>>>>> total 0
>>>>> drw------- 2 root root 55 Jun 28 15:02 dirty
>>>>> drw------- 2 root root 57 Jun 28 15:02 xattrop
>>>>>
>>>>> /data/glusterfs/brick1b/iso-images-repo/.glusterfs/indices/:
>>>>> total 0
>>>>> drw------- 2 root root 55 May 29 14:04 dirty
>>>>> drw------- 2 root root 57 May 29 14:04 xattrop
>>>>>
>>>>> /data/glusterfs/brick2/vm-images-repo/.glusterfs/indices/:
>>>>> total 0
>>>>> drw------- 2 root root 112 Jun 28 15:02 dirty
>>>>> drw------- 2 root root  66 Jun 28 15:02 xattrop
>>>>>
>>>>> /data/glusterfs/brick3/vm-images-repo/.glusterfs/indices/:
>>>>> total 0
>>>>> drw------- 2 root root 64 Jun 28 15:02 dirty
>>>>> drw------- 2 root root 66 Jun 28 15:02 xattrop
>>>>>
>>>>> /data/glusterfs/brick4/vm-images-repo/.glusterfs/indices/:
>>>>> total 0
>>>>> drw------- 2 root root 112 Jun 28 15:02 dirty
>>>>> drw------- 2 root root  66 Jun 28 15:02 xattrop
>>>>>
>>>>> I recently upgraded gluster from 3.7.16 to 3.8.12 with the rolling
>>>>> upgrade procedure and I had not noticed this issue prior to the update;
>>>>> on another system upgraded with the same procedure I have not
>>>>> encountered this problem.
>>>>>
>>>>> Currently all VM images appear to be OK, but prior to creating the
>>>>> 'entry-changes' directories I would like to ask if this is still the
>>>>> correct procedure to fix this issue
>>>>>
>>>>
>>>> Did you restart the bricks after the upgrade? That should have created
>>>> the entry-changes directory. Can you kill the brick and restart it and see
>>>> if the dir is created? Double-check from the brick logs that you're indeed
>>>> running 3.8.12: "Started running /usr/local/sbin/glusterfsd version 3.8.12"
>>>> should appear when the brick starts.
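>>>>
>>>> For example (the brick logs normally live under /var/log/glusterfs/bricks/,
>>>> named after the brick path), something like this should show it:
>>>>
>>>> grep 'Started running' /var/log/glusterfs/bricks/*.log | tail -5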
>>>>
>>>
>>> Please note that if you are going the route of killing and restarting,
>>> you need to do it in the same way you did the rolling upgrade: wait for
>>> heal to complete before you kill the bricks on the other nodes. But before
>>> you do this, it is better to look at the logs or confirm the steps you
>>> used for the upgrade.
>>>
>>>
>>>>
>>>> -Ravi
>>>>
>>>>
>>>>> and if this problem could have affected the heal operations that
>>>>> occurred in the meantime.
>>>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>>> Greetings,
>>>>>
>>>>>      Paolo Margara
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>
>>>
>>>
>>>
>>> --
>>> Pranith
>>>
>>> --
>> Pranith
>>
>>
>
>
> --
> Pranith
>
>


-- 
Pranith