[Gluster-users] gnfs split brain when 1 server in 3x1 down (high load) - help request

Erik Jacobson erik.jacobson at hpe.com
Wed Apr 1 03:27:42 UTC 2020


THANK YOU for the hints. Very happy to have the help.

I'll reply to a couple things then dig in:

On Tue, Mar 31, 2020 at 03:27:59PM +0530, Ravishankar N wrote:
> From your reply in the other thread, I'm assuming that the file/gfid in
> question is not in genuine split-brain or needing heal. i.e. for example

Right, they were not tagged as split-brain either, just as needing heal,
which is expected for those 76 files.

> with that 1 brick down and 2 bricks up test case, if you tried to read the
> file from say a temporary fuse mount (which is also now connected to only to
> 2 bricks since the 3rd one is down) it works fine and there is no EIO
> error...

Looking at the heal info, all of the listed files are ones I expected to
have write changes, and I think they are outside the scope of this issue.
To close the loop, I ran 'strings' on the top of one of the files from a
fuse mount to confirm, and had no trouble.
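
For the record, this is roughly how I checked (the volume name and file
path below are placeholders, not the real ones):

gluster volume heal cm_vol info    # only the 76 expected files, none marked split-brain
mkdir -p /mnt/tmpfuse
mount -t glusterfs localhost:/cm_vol /mnt/tmpfuse
strings /mnt/tmpfuse/path/to/one/changed/file | head    # reads fine, no EIO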

> ...which means that what you have observed is true, i.e.
> afr_read_txn_refresh_done() is called with err=EIO. You can add logs to see
> at what point it is EIO set. The call graph is like this: afr_inode_refresh_done()-->afr_txn_refresh_done()-->afr_read_txn_refresh_done().
> 
> Maybe https://github.com/gluster/glusterfs/blob/v7.4/xlators/cluster/afr/src/afr-common.c#L1188
> in afr_txn_refresh_done() is causing it either due to ret being -EIO or
> event_generation being zero.
> 
> If you are comfortable with gdb, you an put a conditional break point in
> afr_read_txn_refresh_done() at https://github.com/gluster/glusterfs/blob/v7.4/xlators/cluster/afr/src/afr-read-txn.c#L283
> when err=EIO and then check the backtrace for who is setting err to EIO.

Ok so the main event! :)

I'm not a gdb expert, but I think I figured it out well enough to paste
some backtraces. However, I'm having trouble interpreting them exactly.
It looks to me like the event_generation-being-zero case.
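
To make sure I'm reading that hint right, here is a stripped-down sketch of
the condition as I understand it. This is my own paraphrase with made-up
names, not the actual glusterfs code in afr_txn_refresh_done():

/* Sketch only: my paraphrase of the afr-common.c check Ravi pointed at
 * (ret from afr_inode_get_readable() being -EIO, or event_generation
 * being zero), not a verbatim copy of the glusterfs source. */
#include <errno.h>

static int
txn_refresh_outcome(int ret_from_get_readable, int event_generation)
{
    /* Either the readable check reported -EIO, or the cached event
     * generation is zero; either way err becomes EIO, which matches the
     * err=5 the breakpoint below catches in afr_read_txn_refresh_done(). */
    if (ret_from_get_readable == -EIO || event_generation == 0)
        return EIO;
    return 0;
}

If that is roughly right, the interesting question is which half of the
condition fired; that is what I plan to print the next time I hit the
breakpoint (see the gdb commands after the steps below).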

(I got permission to use this MFG system at night for a couple more
nights, avoiding the 24-hour-reserved larger internal system we have.)

Here is what I did; feel free to suggest something better.

- I am using an RPM build, so I changed the spec file to create debuginfo
  packages. I'm on RHEL 8.1.
- I installed the updated packages and the debuginfo packages.
- When glusterd started the gNFS glusterfs process, I killed it.
- I ran this:
gdb -d /root/rpmbuild/BUILD/glusterfs-7.2 -d /root/rpmbuild/BUILD/glusterfs-7.2/xlators/cluster/afr/src/ /usr/sbin/glusterfs

- Then from GDB, I ran this:
(gdb) run -s localhost --volfile-id gluster/nfs -p /var/run/gluster/nfs/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/gluster/9ddb5561058ff543.socket -N

- I hit Ctrl-C, then set the breakpoint:
(gdb) break xlators/cluster/afr/src/afr-read-txn.c:280 if err == 5
- I have some debugging statements in my build, but in stock gluster 7.2,
  line 280 is the following (I think gdb moved the breakpoint to line 281
  internally):
    if (err) {
        if (!priv->thin_arbiter_count) {

- continue

- Then I ran the test case.
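
If it helps, my plan for the next run is to go up one frame when the
breakpoint hits and print the values that decide that branch. Frame 1
should be afr_txn_refresh_done() per the backtraces below; the variable
names are the ones from the hint above and may well be optimized out:

(gdb) frame 1
(gdb) info locals
(gdb) print ret
(gdb) print event_generation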


Here are some backtraces. They make my head hurt. Maybe you can suggest
something else to try next? In the morning I'll also try to unwind this
myself in the source code, but I suspect it will be tough for me.


(gdb) break xlators/cluster/afr/src/afr-read-txn.c:280 if err == 5
Breakpoint 1 at 0x7fff688e057b: file afr-read-txn.c, line 281.
(gdb) continue
Continuing.
[Switching to Thread 0x7ffecffff700 (LWP 50175)]

Thread 15 "glfs_epoll007" hit Breakpoint 1, afr_read_txn_refresh_done (
    frame=0x7fff48325d78, this=0x7fff640137b0, err=5) at afr-read-txn.c:281
281	    if (err) {
(gdb) bt
#0  afr_read_txn_refresh_done (frame=0x7fff48325d78, this=0x7fff640137b0, 
    err=5) at afr-read-txn.c:281
#1  0x00007fff68901fdb in afr_txn_refresh_done (
    frame=frame@entry=0x7fff48325d78, this=this@entry=0x7fff640137b0, err=5, 
    err@entry=0) at afr-common.c:1223
#2  0x00007fff689022b3 in afr_inode_refresh_done (
    frame=frame@entry=0x7fff48325d78, this=this@entry=0x7fff640137b0, error=0)
    at afr-common.c:1295
#3  0x00007fff6890f3fb in afr_inode_refresh_subvol_cbk (frame=0x7fff48325d78, 
    cookie=<optimized out>, this=0x7fff640137b0, op_ret=<optimized out>, 
    op_errno=<optimized out>, buf=buf@entry=0x7ffecfffdaa0, 
    xdata=0x7ffeb806ef08, par=0x7ffecfffdb40) at afr-common.c:1333
#4  0x00007fff6890f42a in afr_inode_refresh_subvol_with_lookup_cbk (
    frame=<optimized out>, cookie=<optimized out>, this=<optimized out>, 
    op_ret=<optimized out>, op_errno=<optimized out>, inode=<optimized out>, 
    buf=0x7ffecfffdaa0, xdata=0x7ffeb806ef08, par=0x7ffecfffdb40)
    at afr-common.c:1344
#5  0x00007fff68b8e96f in client4_0_lookup_cbk (req=<optimized out>, 
    iov=<optimized out>, count=<optimized out>, myframe=0x7fff483147b8)
    at client-rpc-fops_v2.c:2640
#6  0x00007fffed293115 in rpc_clnt_handle_reply (
    clnt=clnt@entry=0x7fff640671b0, pollin=pollin@entry=0x7ffeb81aa110)
    at rpc-clnt.c:764
#7  0x00007fffed2934b3 in rpc_clnt_notify (trans=0x7fff64067540, 
    mydata=0x7fff640671e0, event=<optimized out>, data=0x7ffeb81aa110)
    at rpc-clnt.c:931
#8  0x00007fffed29007b in rpc_transport_notify (
    this=this@entry=0x7fff64067540, 
    event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, 
    data=data@entry=0x7ffeb81aa110) at rpc-transport.c:545
#9  0x00007fff7b6298dc in socket_event_poll_in_async (xl=<optimized out>, 
    async=0x7ffeb81aa238) at socket.c:2592
#10 0x00007fff7b63134c in gf_async (
    cbk=0x7fff7b6298b0 <socket_event_poll_in_async>, xl=<optimized out>, 
    async=0x7ffeb81aa238) at ../../../../libglusterfs/src/glusterfs/async.h:189
#11 socket_event_poll_in (notify_handled=true, this=0x7fff64067540)
    at socket.c:2633
#12 socket_event_handler (fd=fd@entry=18, idx=idx@entry=9, gen=gen@entry=1, 
    data=data@entry=0x7fff64067540, poll_in=<optimized out>, 
    poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000')
    at socket.c:3031
#13 0x00007fffed54da8b in event_dispatch_epoll_handler (event=0x7ffecfffe014, 
    event_pool=0x5555557cb750) at event-epoll.c:650
#14 event_dispatch_epoll_worker (data=0x7fff64038a80) at event-epoll.c:763
#15 0x00007fffec0702de in start_thread () from /lib64/libpthread.so.0
#16 0x00007fffeb8c8a63 in clone () from /lib64/libc.so.6


(gdb) continue
Continuing.
[Switching to Thread 0x7fff2ffff700 (LWP 50171)]

Thread 11 "glfs_epoll003" hit Breakpoint 1, afr_read_txn_refresh_done (
    frame=0x7ffc64544158, this=0x7fff640137b0, err=5) at afr-read-txn.c:281
281	    if (err) {
(gdb) bt
#0  afr_read_txn_refresh_done (frame=0x7ffc64544158, this=0x7fff640137b0, 
    err=5) at afr-read-txn.c:281
#1  0x00007fff68901fdb in afr_txn_refresh_done (
    frame=frame@entry=0x7ffc64544158, this=this@entry=0x7fff640137b0, err=5, 
    err@entry=0) at afr-common.c:1223
#2  0x00007fff689022b3 in afr_inode_refresh_done (
    frame=frame@entry=0x7ffc64544158, this=this@entry=0x7fff640137b0, error=0)
    at afr-common.c:1295
#3  0x00007fff6890f3fb in afr_inode_refresh_subvol_cbk (frame=0x7ffc64544158, 
    cookie=<optimized out>, this=0x7fff640137b0, op_ret=<optimized out>, 
    op_errno=<optimized out>, buf=buf@entry=0x7fff2fffdaa0, 
    xdata=0x7fff18523298, par=0x7fff2fffdb40) at afr-common.c:1333
#4  0x00007fff6890f42a in afr_inode_refresh_subvol_with_lookup_cbk (
    frame=<optimized out>, cookie=<optimized out>, this=<optimized out>, 
    op_ret=<optimized out>, op_errno=<optimized out>, inode=<optimized out>, 
    buf=0x7fff2fffdaa0, xdata=0x7fff18523298, par=0x7fff2fffdb40)
    at afr-common.c:1344
#5  0x00007fff68b8e96f in client4_0_lookup_cbk (req=<optimized out>, 
    iov=<optimized out>, count=<optimized out>, myframe=0x7ffc64319a08)
    at client-rpc-fops_v2.c:2640
#6  0x00007fffed293115 in rpc_clnt_handle_reply (
    clnt=clnt@entry=0x7fff640671b0, pollin=pollin@entry=0x7fff18685b40)
    at rpc-clnt.c:764
#7  0x00007fffed2934b3 in rpc_clnt_notify (trans=0x7fff64067540, 
    mydata=0x7fff640671e0, event=<optimized out>, data=0x7fff18685b40)
    at rpc-clnt.c:931
#8  0x00007fffed29007b in rpc_transport_notify (
    this=this@entry=0x7fff64067540, 
    event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, 
    data=data@entry=0x7fff18685b40) at rpc-transport.c:545
#9  0x00007fff7b6298dc in socket_event_poll_in_async (xl=<optimized out>, 
    async=0x7fff18685c68) at socket.c:2592
#10 0x00007fff7b63134c in gf_async (
    cbk=0x7fff7b6298b0 <socket_event_poll_in_async>, xl=<optimized out>, 
    async=0x7fff18685c68) at ../../../../libglusterfs/src/glusterfs/async.h:189
#11 socket_event_poll_in (notify_handled=true, this=0x7fff64067540)
    at socket.c:2633
#12 socket_event_handler (fd=fd@entry=18, idx=idx@entry=9, gen=gen@entry=1, 
    data=data@entry=0x7fff64067540, poll_in=<optimized out>, 
    poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000')
    at socket.c:3031
#13 0x00007fffed54da8b in event_dispatch_epoll_handler (event=0x7fff2fffe014, 
    event_pool=0x5555557cb750) at event-epoll.c:650
#14 event_dispatch_epoll_worker (data=0x7fff64038340) at event-epoll.c:763
#15 0x00007fffec0702de in start_thread () from /lib64/libpthread.so.0
#16 0x00007fffeb8c8a63 in clone () from /lib64/libc.so.6


Continuing.
[Switching to Thread 0x7ffd53fff700 (LWP 50191)]

Thread 31 "glfs_epoll016" hit Breakpoint 1, afr_read_txn_refresh_done (
    frame=0x7ffc949cf7c8, this=0x7fff640137b0, err=5) at afr-read-txn.c:281
281	    if (err) {
(gdb) bt
#0  afr_read_txn_refresh_done (frame=0x7ffc949cf7c8, this=0x7fff640137b0, 
    err=5) at afr-read-txn.c:281
#1  0x00007fff68901fdb in afr_txn_refresh_done (
    frame=frame@entry=0x7ffc949cf7c8, this=this@entry=0x7fff640137b0, err=5, 
    err@entry=0) at afr-common.c:1223
#2  0x00007fff689022b3 in afr_inode_refresh_done (
    frame=frame@entry=0x7ffc949cf7c8, this=this@entry=0x7fff640137b0, error=0)
    at afr-common.c:1295
#3  0x00007fff6890f3fb in afr_inode_refresh_subvol_cbk (frame=0x7ffc949cf7c8, 
    cookie=<optimized out>, this=0x7fff640137b0, op_ret=<optimized out>, 
    op_errno=<optimized out>, buf=buf@entry=0x7ffd53ffdaa0, 
    xdata=0x7ffd3c6764f8, par=0x7ffd53ffdb40) at afr-common.c:1333
#4  0x00007fff6890f42a in afr_inode_refresh_subvol_with_lookup_cbk (
    frame=<optimized out>, cookie=<optimized out>, this=<optimized out>, 
    op_ret=<optimized out>, op_errno=<optimized out>, inode=<optimized out>, 
    buf=0x7ffd53ffdaa0, xdata=0x7ffd3c6764f8, par=0x7ffd53ffdb40)
    at afr-common.c:1344
#5  0x00007fff68b8e96f in client4_0_lookup_cbk (req=<optimized out>, 
    iov=<optimized out>, count=<optimized out>, myframe=0x7ffc9498ce38)
    at client-rpc-fops_v2.c:2640
#6  0x00007fffed293115 in rpc_clnt_handle_reply (
    clnt=clnt@entry=0x7fff6406b4a0, pollin=pollin@entry=0x7ffd3c1685d0)
    at rpc-clnt.c:764
#7  0x00007fffed2934b3 in rpc_clnt_notify (trans=0x7fff6406b7b0, 
    mydata=0x7fff6406b4d0, event=<optimized out>, data=0x7ffd3c1685d0)
    at rpc-clnt.c:931
#8  0x00007fffed29007b in rpc_transport_notify (
    this=this@entry=0x7fff6406b7b0, 
    event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, 
    data=data@entry=0x7ffd3c1685d0) at rpc-transport.c:545
#9  0x00007fff7b6298dc in socket_event_poll_in_async (xl=<optimized out>, 
    async=0x7ffd3c1686f8) at socket.c:2592
#10 0x00007fff7b63134c in gf_async (
    cbk=0x7fff7b6298b0 <socket_event_poll_in_async>, xl=<optimized out>, 
    async=0x7ffd3c1686f8) at ../../../../libglusterfs/src/glusterfs/async.h:189
#11 socket_event_poll_in (notify_handled=true, this=0x7fff6406b7b0)
    at socket.c:2633
#12 socket_event_handler (fd=fd@entry=14, idx=idx@entry=6, gen=gen@entry=4, 
    data=data@entry=0x7fff6406b7b0, poll_in=<optimized out>, 
    poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000')
    at socket.c:3031
#13 0x00007fffed54da8b in event_dispatch_epoll_handler (event=0x7ffd53ffe014, 
    event_pool=0x5555557cb750) at event-epoll.c:650
#14 event_dispatch_epoll_worker (data=0x7fff64065ee0) at event-epoll.c:763
#15 0x00007fffec0702de in start_thread () from /lib64/libpthread.so.0
#16 0x00007fffeb8c8a63 in clone () from /lib64/libc.so.6
(gdb) continue
Continuing.
[Switching to Thread 0x7fff60dc6700 (LWP 50169)]

Thread 9 "glfs_epoll001" hit Breakpoint 1, afr_read_txn_refresh_done (
    frame=0x7ffe589b0848, this=0x7fff640137b0, err=5) at afr-read-txn.c:281
281	    if (err) {
(gdb) bt
#0  afr_read_txn_refresh_done (frame=0x7ffe589b0848, this=0x7fff640137b0, 
    err=5) at afr-read-txn.c:281
#1  0x00007fff68901fdb in afr_txn_refresh_done (
    frame=frame@entry=0x7ffe589b0848, this=this@entry=0x7fff640137b0, err=5, 
    err@entry=0) at afr-common.c:1223
#2  0x00007fff689022b3 in afr_inode_refresh_done (
    frame=frame@entry=0x7ffe589b0848, this=this@entry=0x7fff640137b0, error=0)
    at afr-common.c:1295
#3  0x00007fff6890f3fb in afr_inode_refresh_subvol_cbk (frame=0x7ffe589b0848, 
    cookie=<optimized out>, this=0x7fff640137b0, op_ret=<optimized out>, 
    op_errno=<optimized out>, buf=buf@entry=0x7fff60dc4aa0, 
    xdata=0x7fff48a70bc8, par=0x7fff60dc4b40) at afr-common.c:1333
#4  0x00007fff6890f42a in afr_inode_refresh_subvol_with_lookup_cbk (
    frame=<optimized out>, cookie=<optimized out>, this=<optimized out>, 
    op_ret=<optimized out>, op_errno=<optimized out>, inode=<optimized out>, 
    buf=0x7fff60dc4aa0, xdata=0x7fff48a70bc8, par=0x7fff60dc4b40)
    at afr-common.c:1344
#5  0x00007fff68b8e96f in client4_0_lookup_cbk (req=<optimized out>, 
    iov=<optimized out>, count=<optimized out>, myframe=0x7ffe584cfeb8)
    at client-rpc-fops_v2.c:2640
#6  0x00007fffed293115 in rpc_clnt_handle_reply (
    clnt=clnt@entry=0x7fff640671b0, pollin=pollin@entry=0x7fff484cc110)
    at rpc-clnt.c:764
#7  0x00007fffed2934b3 in rpc_clnt_notify (trans=0x7fff64067540, 
    mydata=0x7fff640671e0, event=<optimized out>, data=0x7fff484cc110)
    at rpc-clnt.c:931
#8  0x00007fffed29007b in rpc_transport_notify (
    this=this@entry=0x7fff64067540, 
    event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, 
    data=data@entry=0x7fff484cc110) at rpc-transport.c:545
#9  0x00007fff7b6298dc in socket_event_poll_in_async (xl=<optimized out>, 
    async=0x7fff484cc238) at socket.c:2592
#10 0x00007fff7b63134c in gf_async (
    cbk=0x7fff7b6298b0 <socket_event_poll_in_async>, xl=<optimized out>, 
    async=0x7fff484cc238) at ../../../../libglusterfs/src/glusterfs/async.h:189
#11 socket_event_poll_in (notify_handled=true, this=0x7fff64067540)
    at socket.c:2633
#12 socket_event_handler (fd=fd@entry=18, idx=idx@entry=9, gen=gen@entry=1, 
    data=data@entry=0x7fff64067540, poll_in=<optimized out>, 
    poll_out=<optimized out>, poll_err=0, event_thread_died=0 '\000')
    at socket.c:3031
#13 0x00007fffed54da8b in event_dispatch_epoll_handler (event=0x7fff60dc5014, 
    event_pool=0x5555557cb750) at event-epoll.c:650
#14 event_dispatch_epoll_worker (data=0x555555825900) at event-epoll.c:763
#15 0x00007fffec0702de in start_thread () from /lib64/libpthread.so.0
#16 0x00007fffeb8c8a63 in clone () from /lib64/libc.so.6
(gdb) continue


