[Gluster-users] Help with degraded volume recovery

Jon sphinx_man_60 at yahoo.com
Mon Jul 27 18:28:45 UTC 2015


Hello all, I'm having a problem with one of my Gluster volumes and would appreciate some help. My setup is an 8-node cluster set up as 4x2 replication, with 20 TB per node for 88 TB total. The OS is CentOS 7.1, and there is one 20 TB brick per node on its own XFS partition, separate from the OS.

A few days ago one node in the cluster had a problem: the XFS mount went AWOL, and as a result Gluster wasn't able to communicate with it. The mount was completely frozen; we couldn't even unmount it. We shut down Gluster as politely as possible with "systemctl stop glusterd", then proceeded to kill all the child processes that were left behind. We rebooted the server, restarted Gluster, and everything appeared OK. The node then proceeded to start a fix-layout, which we've become accustomed to after a node has a problem. "gluster volume status" and the heal counts looked OK, but after some digging I found the brick process on the partner node had crashed. I did a "systemctl restart glusterd", which restarted the brick process, and the performance problems got better.

The next day stability continued to degrade, and after much digging I found that node too was suffering from a hung storage mount. We rebooted the server and things improved for a bit; however, as Gluster attempted to heal itself, performance degraded to the point of being unusable. After scanning the log files on the host that serves our Samba access to this volume, I found them full of "All subvolumes are down. Going offline..." and "gfid or missing entry self heal failed" messages. I suspect there are split-brain files, but "info split-brain" does not show any.
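For reference, these are roughly the checks I was running (volume name is gv-05; output omitted, and the exact invocations may have varied slightly):

    # brick/process status and pending heal counts
    gluster volume status gv-05
    gluster volume heal gv-05 info

    # split-brain check, which comes back empty despite the errors in the log below
    gluster volume heal gv-05 info split-brain

    # restarting the management daemon also brought the crashed brick process back
    systemctl restart glusterd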
For now I've had to take the second node off-line in order to kill the self-heal and make the cluster usable again, as this is a customer-facing system. My plan is to rsync files from the on-line node to the off-line one so that we have file redundancy again. Given the possibility of split-brain, I'll be specifying the --update flag (the full command will be "rsync -X -a -uW -v --min-size=1"). My question is: when Gluster restarts on the node that is down, will it be able to reconcile and heal the files I've moved into the brick directly? Alternatively, is there a way I can bring the node back on-line but stop the self-heal? I've seen references to "gluster volume set <volname> cluster.ensure-durability off" but have never used it before. If I set that off, will new files written to the cluster still get replicated between the nodes?
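To be concrete, here is a rough sketch of what I'm planning; the brick paths and the "offline-node" hostname below are placeholders, not our real layout:

    # copy from the brick on the healthy node into the brick on the off-line node,
    # preserving xattrs (-X) and skipping files that are newer on the destination (-u)
    rsync -X -a -uW -v --min-size=1 /bricks/gv-05/brick/ offline-node:/bricks/gv-05/brick/

    # the option I've seen referenced but never used
    gluster volume set gv-05 cluster.ensure-durability off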
We do plan on investigating why a local XFS filesystem can become unresponsive like this, but so far we have been busy trying to stabilize and recover this volume. If anyone has come across this before, please let us know. We are worried it will happen again before we get the volume fully healed and leave us in even worse shape.
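When we do dig into the XFS side, the plan is to start with the kernel logs on the affected nodes, along these lines:

    # look for XFS errors and hung-task warnings in the kernel log (CentOS 7)
    dmesg | grep -i xfs
    journalctl -k | grep -iE 'xfs|hung|blocked'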

Below is the volume log from the Samba host node. The name of the volume is gv-05. This is from server gl-023, another node in the cluster that is also the Samba host; Samba is running via the VFS plugin and managed via CTDB for redundancy (a rough sketch of that wiring is included after the log excerpt).
[2015-07-23 20:08:33.949602] E [rpc-clnt.c:369:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x48) [0x7f21f8cae168] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb8) [0x7f21f8cac218] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f21f8cac13e]))) 0-gv-05-client-3: forced unwinding frame type(GlusterFS 3.3) op(RELEASEDIR(42)) called at 2015-07-23 20:08:33.949041 (xid=0x54)
[2015-07-23 20:08:33.949623] E [rpc-clnt.c:369:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x48) [0x7f21f8cae168] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb8) [0x7f21f8cac218] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f21f8cac13e]))) 0-gv-05-client-3: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2015-07-23 20:08:33.949054 (xid=0x55)
[2015-07-23 20:08:33.949636] E [afr-common.c:4168:afr_notify] 0-gv-05-replicate-1: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-07-23 20:08:33.949807] E [afr-common.c:4168:afr_notify] 0-gv-05-replicate-1: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-07-23 20:08:33.951295] E [afr-common.c:4168:afr_notify] 0-gv-05-replicate-2: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-07-23 20:08:33.952550] E [afr-common.c:4168:afr_notify] 0-gv-05-replicate-3: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-07-23 20:08:33.953270] E [afr-common.c:4168:afr_notify] 0-gv-05-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-07-23 20:08:33.953323] E [afr-common.c:4168:afr_notify] 0-gv-05-replicate-2: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-07-23 20:08:33.954639] E [rpc-clnt.c:369:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x48) [0x7f21f8cae168] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb8) [0x7f21f8cac218] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f21f8cac13e]))) 0-gv-05-client-3: forced unwinding frame type(GlusterFS 3.3) op(FSTAT(25)) called at 2015-07-23 20:08:33.954573 (xid=0x710)

Also in the log file:

[2015-07-20 14:40:46.192052] E [afr-open.c:269:afr_openfd_fix_open_cbk] 0-gv-05-replicate-3: Failed to open /scan_process_005j/release_2598/156966B54819A27E/1569FD186C415498.nbx on subvolume gv-05-client-7
[2015-07-20 14:40:46.192484] E [afr-open.c:269:afr_openfd_fix_open_cbk] 0-gv-05-replicate-3: Failed to open /scan_process_005j/release_2598/156966B54819A27E/1569FD186C415498.nbx on subvolume gv-05-client-7
[2015-07-20 14:40:46.193464] E [afr-open.c:269:afr_openfd_fix_open_cbk] 0-gv-05-replicate-3: Failed to open /scan_process_005j/release_2598/156966B54819A27E/1569FD186C415498.nbx on subvolume gv-05-client-7
[2015-07-20 14:40:46.195432] E [afr-open.c:269:afr_openfd_fix_open_cbk] 0-gv-05-replicate-3: Failed to open /scan_process_005j/release_2598/156966B54819A27E/1569FD186C415498.nbx on subvolume gv-05-client-7
[2015-07-20 14:40:46.195844] E [afr-open.c:269:afr_openfd_fix_open_cbk] 0-gv-05-replicate-3: Failed to open /scan_process_005j/release_2598/156966B54819A27E/1569FD186C415498.nbx on subvolume gv-05-client-7
[2015-07-20 14:40:46.336722] E [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status] 0-gv-05-replicate-3: gfid or missing entry self heal failed, on /scan_process_005j/release_2598/156966B54819A27E/1569FD186C415498.nbx
[2015-07-20 14:40:52.083727] E [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status] 0-gv-05-replicate-3: gfid or missing entry self heal failed, on /scan_process_005j/release_2598/156966B54819A27E/1569FD1872EA5510.nbx
[2015-07-20 14:40:53.793992] E [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status] 0-gv-05-replicate-3: gfid or missing entry self heal failed, on /scan_process_005j/release_2598/156966B54819A27E/1569FD1872EA5510.nbx
[2015-07-20 14:40:53.804607] E [afr-open.c:269:afr_openfd_fix_open_cbk] 0-gv-05-replicate-3: Failed to open /scan_process_005j/release_2598/156966B54819A27E/1569FD1872EA5510.nbx on subvolume gv-05-client-7
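For completeness, here is a rough sketch of how the Samba side is wired on gl-023; the parameter names come from the vfs_glusterfs module, and this is not our literal smb.conf (paths and values are placeholders):

    [gv-05]
        path = /
        read only = no
        vfs objects = glusterfs
        glusterfs:volume = gv-05
        glusterfs:logfile = /var/log/samba/glusterfs-gv-05.log
    # plus "clustering = yes" in [global] so CTDB manages the smbd instances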

