[Gluster-users] gluster volume heal info hang on 3.10.6 & 3.12.11
Ning Adam
frostyplanet at gmail.com
Fri Sep 14 15:44:21 UTC 2018
I have made some progress:
Using gluster-3.12.1, "glfsheal vol0" is fine. It tells me some files
need to be healed:
Brick storage0-cluster0:/data/brk02/src
<gfid:88dfb19a-92dd-4abd-bb15-4931e6e47cd9>
<gfid:88dfb19a-92dd-4abd-bb15-4931e6e47cd9>/0001
Status: Connected
Number of entries: 2
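For context: glfsheal is the helper binary that "gluster volume heal <vol> info"
runs under the hood, so I invoke it directly to time it and to read its own log.
A sketch of what I run (the log path is as on my systems and may differ per distro):

glfsheal vol0
less /var/log/glusterfs/glfsheal-vol0.log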
The hanging problem was introduced in
commit 94faf8c4a1f5a020593ca65f44d68e1ba2f632eb
Author: karthik-us <ksubrahm at redhat.com>
Date: Wed Sep 20 18:30:21 2017 +0530
cluster/afr: Sending subvol up/down events when subvol comes
up or goes down
> BUG: 1493539
(cherry picked from commit 3bbb4fe4b33dc3a3ed068ed2284077f2a4d8265a)
Change-Id: I6580351b245d5f868e9ddc6a4eb4dd6afa3bb6ec
BUG: 1492066
Signed-off-by: karthik-us <ksubrahm at redhat.com>
But when I use the HEAD of the release-3.12 branch and do "git revert
94faf8c4", "glfsheal vol0" still hangs,
so this is not a simple problem.
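In case anyone wants to reproduce the revert test, it was essentially the
following; the build/install commands are just the standard in-tree ones and
may need adjusting for your environment:

git checkout release-3.12
git revert --no-edit 94faf8c4    # may need manual conflict resolution
./autogen.sh && ./configure && make && sudo make install
glfsheal vol0                    # still hangs with the revert applied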
I found the bug is related to "client.event-threads": previously I had
"client.event-threads: 6" set, and the only ping-latency entries in the log
were for vol0-client-0 through vol0-client-5, e.g.

[rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-5: Ping latency is 29998ms

There was no ping latency logged for the other bricks.
After I ran "gluster volume set vol0 client.event-threads 24" and installed
3.12.12 (since I am debugging on an 8x3 cluster), "glfsheal vol0" finished in
about 30 seconds, which is still acceptable.
And the log showed:

[2018-09-14 21:10:17.817764] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-23: Ping latency is 29876ms
So this leads to the conclusion that connections to bricks might be holding a
lock in the client event-threads.
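For reference, the workaround boils down to the following (the grep at the end
is just to confirm the per-brick ping latencies in the glfsheal log):

gluster volume set vol0 client.event-threads 24
time glfsheal vol0
grep 'Ping latency' /var/log/glusterfs/glfsheal-vol0.log | tail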
My gluster volume info:
Options Reconfigured:
cluster.locking-scheme: full
performance.open-behind: on
cluster.lookup-unhashed: auto
transport.address-family: inet
cluster.lookup-optimize: on
diagnostics.brick-log-level: TRACE
performance.readdir-ahead: True
performance.md-cache-timeout: 0
cluster.server-quorum-type: none
performance.client-io-threads: True
performance.read-ahead: False
features.cache-invalidation: False
network.ping-timeout: 120
cluster.quorum-type: auto
nfs.disable: True
performance.io-cache: False
server.event-threads: 4
cluster.rebal-throttle: lazy
transport.keepalive: True
performance.write-behind: True
cluster.self-heal-daemon: enable
client.event-threads: 6
cluster.eager-lock: on
diagnostics.client-log-level: INFO
features.trash: off
features.selinux: off
cc'ing ksubrahm at redhat.com, in case you have any clue about this problem.
Ning Adam <frostyplanet at gmail.com> wrote on Fri, Sep 14, 2018 at 3:45 PM:
>
> (My last mail seemed to be dropped by the list, reposting in plain-text format)
>
> Hello:
>
> I have recently done a rolling upgrade on some of my 3.7.6 gluster
> (AFR + DHT) deployments to 3.10.6, and have two problems afterwards:
> 1) occasional IO errors
> 2) "gluster volume heal vol0 info" hangs after a decent amount of
> files have been written to the cluster (varies from hours to days)
>
> All 7 of my upgraded 3.10 clusters reproduced these problems, and
> then I tried upgrading them to 3.12.11, but nothing improved.
> I have submitted a bug at
> https://bugzilla.redhat.com/show_bug.cgi?id=1618932 , but so far no one
> seems to have looked into it.
>
> The heal-info hang is the most annoying problem because it makes it
> impossible for me to monitor my clusters. So I tried to look into the code
> and analyze the DEBUG & TRACE logs, and found the following.
>
> 1) I have "network.ping-timeout: 120" in the volume options. If I change
> network.ping-timeout to 0,
> "gluster volume heal vol0 info" will eventually finish after 10~20
> minutes (8x3 brick setup), and it says there are no files to heal.
> It takes even longer when the cluster is larger (36x3 brick setup).
> It does not matter whether I stop all my IO or restart all glusterd &
> glusterfsd processes; heal info still hangs.
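> For reference, that experiment boils down to something like this (the time
> wrapper is only there to measure how long heal info takes):
>
> gluster volume set vol0 network.ping-timeout 0
> time gluster volume heal vol0 info
> gluster volume set vol0 network.ping-timeout 120   # restore my usual value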
>
> "gluster volume status" is all normal. All my brick servers are in
> low latency LAN enviroment.
>
> 2) As one of the clusters having this problem is a testing setup,
> I can currently do some debugging there.
> I looked into heal/src/glfs-heal.c, compiled the code and added some logging;
> the initialization call "ret = glfs_init (fs)" hangs.
> I traced it to "ret = glfs_chdir (fs, "/");" and then "dht_lookup(
> path='/' )"; it never gets to "dht_readdirp_cbk()".
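> A quick way to see where it is stuck without rebuilding is to attach a
> debugger to the hung glfsheal and dump all thread backtraces (this assumes
> gdb and debug symbols are available on the box):
>
> gdb -batch -ex 'thread apply all bt' -p $(pgrep -f 'glfsheal vol0')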
>
> I stopped all client IO operations on my gluster volume and then ran
> "glfsheal vol0".
> Looking into /var/log/glusterfs/glfsheal-vol0.log, there is massive
> looping of reconnect attempts on all bricks, e.g.:
>
> [2018-09-14 05:56:09.395869] T [rpc-clnt.c:407:rpc_clnt_reconnect] 0-vol0-client-12: attempting reconnect
> [2018-09-14 05:56:09.396160] T [socket.c:3125:socket_connect] (--> /lib/libglusterfs.so.0(_gf_log_callingfn+0x1ee)[0x7f10d4e79778] (--> /usr/lib/glusterfs//rpc-transport/socket.so(+0xd442)[0x7f10c647c442] (--> /lib/libgfrpc.so.0(rpc_transport_connect+0x93)[0x7f10d437de43] (--> /lib/libgfrpc.so.0(rpc_clnt_reconnect+0x136)[0x7f10d438121b] (--> /lib/libglusterfs.so.0(+0x3e4e9)[0x7f10d4e894e9] ))))) 0-vol0-client-12: connect () called on transport already connected
> [2018-09-14 05:56:09.396194] T [rpc-clnt.c:407:rpc_clnt_reconnect] 0-vol0-client-10: attempting reconnect
> [2018-09-14 05:56:09.396424] T [socket.c:3125:socket_connect] (--> /lib/libglusterfs.so.0(_gf_log_callingfn+0x1ee)[0x7f10d4e79778] (--> /usr/lib/glusterfs//rpc-transport/socket.so(+0xd442)[0x7f10c647c442] (--> /lib/libgfrpc.so.0(rpc_transport_connect+0x93)[0x7f10d437de43] (--> /lib/libgfrpc.so.0(rpc_clnt_reconnect+0x136)[0x7f10d438121b] (--> /lib/libglusterfs.so.0(+0x3e4e9)[0x7f10d4e894e9] ))))) 0-vol0-client-10: connect () called on transport already connected
>
> It's also weird that 0-vol0-client-5 was showing a huge RPC ping latency
> (29 seconds should not be possible, because a simple ICMP ping to that server
> takes less than 0.1 seconds).
>
> root at storage0-cluster0:~# grep ping /var/log/glusterfs/glfsheal-vol0.log | grep laten
> [2018-09-14 05:13:16.008866] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-0: Ping latency is 0ms
> [2018-09-14 05:13:16.016669] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-1: Ping latency is 0ms
> [2018-09-14 05:13:16.021769] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-2: Ping latency is 0ms
> [2018-09-14 05:13:16.033761] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-3: Ping latency is 4ms
> [2018-09-14 05:13:16.101962] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-4: Ping latency is 62ms
> [2018-09-14 05:13:46.105502] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-5: Ping latency is 29996ms
> [2018-09-14 05:20:13.895285] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-0: Ping latency is 0ms
> [2018-09-14 05:20:13.901476] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-1: Ping latency is 0ms
> [2018-09-14 05:20:13.908373] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-2: Ping latency is 0ms
> [2018-09-14 05:20:13.918782] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-3: Ping latency is 4ms
> [2018-09-14 05:20:13.987162] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-4: Ping latency is 62ms
> [2018-09-14 05:20:43.992528] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-5: Ping latency is 29998ms
> [2018-09-14 05:25:11.983259] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-0: Ping latency is 0ms
> [2018-09-14 05:25:11.991199] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-1: Ping latency is 0ms
> [2018-09-14 05:25:11.996933] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-2: Ping latency is 0ms
> [2018-09-14 05:25:12.007867] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-3: Ping latency is 4ms
> [2018-09-14 05:25:12.076057] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-vol0-client-4: Ping latency is 62ms
>
> I checked vol0-client-5; nothing special compared to the other bricks. CPU
> usage is not high and the disk has no problems.
> There are no warning or error logs in /var/log/gluster/brick/data-brk01-src.log.
> After I turned on TRACE-level logging on the brick,
> the output was too much to analyze, and showed no apparent difference
> from the other bricks.
>
> I hope someone has some idea about this; even a pointer to the right
> direction to look into would help.
>
> Thanks very much