[Bugs] [Bug 1743988] Setting cluster.heal-timeout requires volume restart

Wed Aug 21 09:56:54 UTC 2019

https://bugzilla.redhat.com/show_bug.cgi?id=1743988

Ravishankar N <ravishankar at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |Triaged
             Status|NEW                         |ASSIGNED
                 CC|                            |ravishankar at redhat.com
           Assignee|bugs at gluster.org            |ravishankar at redhat.com
              Flags|                            |needinfo?(glenk1973 at hotmail
                   |                            |.com)

--- Comment #2 from Ravishankar N <ravishankar at redhat.com> ---
Okay, so after some investigation, I don't think this is an issue. When you
change the heal-timeout, the new value does get propagated to the self-heal
daemon. But since the default value is 600 seconds, the threads that do the
heal only wake up after that time has elapsed. Once they wake up, subsequent
runs do seem to honour the new heal-timeout value.

On a glusterfs 6.5 setup:
#gluster v create testvol replica 2 127.0.0.2:/home/ravi/bricks/brick{1..2} force
#gluster v set testvol client-log-level DEBUG
#gluster v start testvol
#gluster v set testvol heal-timeout 5
#tail -f /var/log/glusterfs/glustershd.log|grep finished
You don't see anything in the log yet about the crawls.
But once you manually launch heal, the threads are woken up and further crawls
happen every 5 seconds.
#gluster v heal testvol

Now in glustershd.log:
[2019-08-21 09:55:02.024160] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-0.
[2019-08-21 09:55:02.024271] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-1.
[2019-08-21 09:55:08.023252] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-1.
[2019-08-21 09:55:08.023358] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-0.
[2019-08-21 09:55:14.024438] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-1.
[2019-08-21 09:55:14.024546] D [MSGID: 0] [afr-self-heald.c:843:afr_shd_index_healer] 0-testvol-replicate-0: finished index sweep on subvol testvol-client-0.
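
To check the crawl interval without reading timestamps by eye, something like
the following could be used. It is only a sketch: the function name
sweep_intervals is made up for illustration, and it assumes one log entry per
line (as in the log file itself) with all entries falling on the same day.

```shell
# Hypothetical helper, not part of gluster: print the gap in whole seconds
# between successive "finished index sweep" entries for each subvolume.
# Reads glustershd.log lines on stdin. Fractions of a second are truncated
# before subtracting, and entries are assumed not to cross midnight.
sweep_intervals() {
    awk '
        /finished index sweep on subvol/ {
            # $2 looks like "09:55:02.024160]"; split it into h, m, s.
            split($2, t, ":")
            now = t[1] * 3600 + t[2] * 60 + int(t[3])
            subvol = $NF
            sub(/\.$/, "", subvol)      # drop the trailing full stop
            if (subvol in last)
                print subvol, now - last[subvol]
            last[subvol] = now
        }
    '
}
```

For example: `grep finished /var/log/glusterfs/glustershd.log | sweep_intervals`.
For the six entries quoted above, every gap it reports is 6 seconds, far below
the 600 second default.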

Glen, could you check whether that works for you? That is, after setting the
heal-timeout, manually launch heal via `gluster v heal testvol`.
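
If it helps, the two steps can be wrapped in one small shell function. This is
just a convenience sketch: set_heal_timeout is a made-up name, and it uses only
the two gluster commands already shown in this comment.

```shell
# Hypothetical helper, not part of gluster: apply a new heal-timeout and then
# launch heal manually, so the healer threads are woken up instead of sleeping
# out the previous timeout.
set_heal_timeout() {
    vol=$1
    secs=$2
    if [ -z "$vol" ] || [ -z "$secs" ]; then
        echo "usage: set_heal_timeout VOLNAME SECONDS" >&2
        return 2
    fi
    # Only launch heal if setting the option succeeded.
    gluster v set "$vol" heal-timeout "$secs" &&
        gluster v heal "$vol"
}
```

Usage: `set_heal_timeout testvol 5`.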
