[Gluster-users] Split brain after rebooting half of a two-node cluster

Peter Becker Peter.Becker at vitagroup.com.au
Tue Aug 4 23:19:31 UTC 2015


Hello,

We are trying to run a pair of ActiveMQ nodes on top of glusterfs, using the approach described in http://activemq.apache.org/shared-file-system-master-slave.html

This seemed to work at first, but as soon as I reboot one of the machines while the cluster is under load, I quickly run into this problem:

  [2015-08-05 08:54:40.475351] I [afr-self-heal-common.c:705:afr_mark_sources] 0-gv0-replicate-0: split-brain possible, no source detected
  [2015-08-05 08:54:40.475373] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 61819: LOOKUP() /kahadb/db.data => -1 (Input/output error)

(from /var/log/glusterfs/srv-amq.log; more of the log below)

Afterwards the whole cluster ceases to function, since the affected file is crucial to ActiveMQ's storage backend.

I have run into this situation three times now, recovering each time by rebuilding the glusterfs configuration from scratch (stop the volume, delete it, empty the bricks, create, start). The trigger is always a "sudo reboot" on one of the nodes.
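
For reference, the rebuild roughly follows this sequence (a sketch of what I mean by "from scratch", not the exact commands I ran each time; the brick paths match the setup shown further down):
-----
# on one node: tear the volume down
sudo gluster volume stop gv0
sudo gluster volume delete gv0

# on both nodes: empty the brick directory so it can be reused
sudo rm -rf /data/brick1/gv0
sudo mkdir /data/brick1/gv0

# on one node: recreate and start the volume
sudo gluster volume create gv0 replica 2 srvamqpy01:/data/brick1/gv0 srvamqpy02:/data/brick1/gv0
sudo gluster volume start gv0
-----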

Am I wrong to expect this to work, or is this an issue with my configuration or with glusterfs itself?

Cheers,
    Peter


More detail:
-----
qmaster at srvamqpy01:~$ cat /etc/issue
Ubuntu 12.04.5 LTS \n \l

qmaster at srvamqpy01:~$ uname -a
Linux srvamqpy01 3.13.0-61-generic #100~precise1-Ubuntu SMP Wed Jul 29 12:06:40 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
qmaster at srvamqpy01:~$ gluster --version
glusterfs 3.2.5 built on Jan 31 2012 07:39:59
[...]
qmaster at srvamqpy01:~$ cat /etc/fstab
[...]
/dev/sdb1       /data/brick1    ext4    acl,user_xattr  0       2
srvamqpy01:/gv0 /srv/amq        glusterfs       defaults,nobootwait,_netdev,direct-io-mode=disable 0 0
-----

Command used to create the volume:
-----
gluster volume create gv0 replica 2 srvamqpy01:/data/brick1/gv0 srvamqpy02:/data/brick1/gv0
-----
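
After creating it I start the volume and check that both peers and bricks show up; roughly along these lines (not verbatim output):
-----
sudo gluster volume start gv0
sudo gluster volume info gv0
sudo gluster peer status
-----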

And more of the log:
-----
[2015-08-05 08:51:54.50969] I [rpc-clnt.c:1536:rpc_clnt_reconfig] 0-gv0-client-0: changing port to 24011 (from 0)
[2015-08-05 08:51:54.51313] I [rpc-clnt.c:1536:rpc_clnt_reconfig] 0-gv0-client-1: changing port to 24011 (from 0)
[2015-08-05 08:51:58.32060] I [client-handshake.c:1090:select_server_supported_programs] 0-gv0-client-0: Using Program GlusterFS 3.2.5, Num (1298437), Version (310)
[2015-08-05 08:51:58.32239] I [client-handshake.c:913:client_setvolume_cbk] 0-gv0-client-0: Connected to 10.254.2.137:24011, attached to remote volume '/data/brick1/gv0'.
[2015-08-05 08:51:58.32257] I [afr-common.c:3141:afr_notify] 0-gv0-replicate-0: Subvolume 'gv0-client-0' came back up; going online.
[2015-08-05 08:51:58.32359] I [client-handshake.c:1090:select_server_supported_programs] 0-gv0-client-1: Using Program GlusterFS 3.2.5, Num (1298437), Version (310)
[2015-08-05 08:51:58.33070] I [client-handshake.c:913:client_setvolume_cbk] 0-gv0-client-1: Connected to 10.254.2.164:24011, attached to remote volume '/data/brick1/gv0'.
[2015-08-05 08:51:58.35521] I [fuse-bridge.c:3339:fuse_graph_setup] 0-fuse: switched to graph 0
[2015-08-05 08:51:58.35642] I [fuse-bridge.c:2927:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.22
[2015-08-05 08:51:58.36851] I [afr-common.c:1520:afr_set_root_inode_on_first_lookup] 0-gv0-replicate-0: added root inode
[2015-08-05 08:52:06.24620] I [afr-common.c:1038:afr_launch_self_heal] 0-gv0-replicate-0: background  meta-data data self-heal triggered. path: /kahadb/lock
[2015-08-05 08:52:06.28557] I [afr-self-heal-common.c:2077:afr_self_heal_completion_cbk] 0-gv0-replicate-0: background  meta-data data self-heal completed on /kahadb/lock
[2015-08-05 08:52:16.64428] I [afr-common.c:1038:afr_launch_self_heal] 0-gv0-replicate-0: background  meta-data self-heal triggered. path: /kahadb/lock
[2015-08-05 08:52:16.65701] I [afr-self-heal-common.c:2077:afr_self_heal_completion_cbk] 0-gv0-replicate-0: background  meta-data self-heal completed on /kahadb/lock
[2015-08-05 08:52:21.692657] W [socket.c:1494:__socket_proto_state_machine] 0-gv0-client-1: reading from socket failed. Error (Transport endpoint is not connected), peer (10.254.2.164:24011)
[2015-08-05 08:52:21.693353] I [client.c:1883:client_rpc_notify] 0-gv0-client-1: disconnected
[2015-08-05 08:52:26.71942] W [client3_1-fops.c:4699:client3_1_lk] 0-gv0-client-1: (-1909467425): failed to get fd ctx. EBADFD
[2015-08-05 08:52:26.71988] W [client3_1-fops.c:4751:client3_1_lk] 0-gv0-client-1: failed to send the fop: File descriptor in bad state
[2015-08-05 08:52:32.35552] E [socket.c:1685:socket_connect_finish] 0-gv0-client-1: connection to 10.254.2.164:24011 failed (Connection refused)
[2015-08-05 08:52:35.36179] I [client-handshake.c:1090:select_server_supported_programs] 0-gv0-client-1: Using Program GlusterFS 3.2.5, Num (1298437), Version (310)
[2015-08-05 08:52:35.37641] I [client-handshake.c:913:client_setvolume_cbk] 0-gv0-client-1: Connected to 10.254.2.164:24011, attached to remote volume '/data/brick1/gv0'.
[2015-08-05 08:52:36.538807] I [afr-open.c:432:afr_openfd_sh] 0-gv0-replicate-0:  data missing-entry gfid self-heal triggered. path: /kahadb/db-4.log, reason: Replicate up down flush, data lock is held
[2015-08-05 08:52:36.539349] I [afr-self-heal-common.c:1203:sh_missing_entries_create] 0-gv0-replicate-0: no missing files - /kahadb/db-4.log. proceeding to metadata check
[2015-08-05 08:52:36.540105] W [dict.c:418:dict_unref] (-->/usr/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x7fea25a93ec5] (-->/usr/lib/glusterfs/3.2.5/xlator/protocol/client.so(client3_1_fstat_cbk+0x312) [0x7fea228f8902] (-->/usr/lib/glusterfs/3.2.5/xlator/cluster/replicate.so(afr_sh_data_fstat_cbk+0x1d5) [0x7fea226a0405]))) 0-dict: dict is NULL
[2015-08-05 08:52:36.772749] I [afr-self-heal-algorithm.c:520:sh_diff_loop_driver_done] 0-gv0-replicate-0: diff self-heal on /kahadb/db-4.log: completed. (1 blocks of 252 were different (0.40%))
[2015-08-05 08:52:36.775638] I [afr-self-heal-common.c:2077:afr_self_heal_completion_cbk] 0-gv0-replicate-0: background  data missing-entry gfid self-heal completed on /kahadb/db-4.log
[2015-08-05 08:52:36.785113] I [afr-open.c:432:afr_openfd_sh] 0-gv0-replicate-0:  data missing-entry gfid self-heal triggered. path: /kahadb/db.redo, reason: Replicate up down flush, data lock is held
[2015-08-05 08:52:36.785214] I [afr-open.c:432:afr_openfd_sh] 0-gv0-replicate-0:  data missing-entry gfid self-heal triggered. path: /kahadb/db.data, reason: Replicate up down flush, data lock is held
[2015-08-05 08:52:36.785458] I [afr-self-heal-common.c:1858:afr_sh_post_nb_entrylk_conflicting_sh_cbk] 0-gv0-replicate-0: Non blocking entrylks failed.
[2015-08-05 08:52:36.785480] I [afr-self-heal-common.c:963:afr_sh_missing_entries_done] 0-gv0-replicate-0: split brain found, aborting selfheal of /kahadb/db.data
[2015-08-05 08:52:36.785496] E [afr-self-heal-common.c:2074:afr_self_heal_completion_cbk] 0-gv0-replicate-0: background  data missing-entry gfid self-heal failed on /kahadb/db.data
[2015-08-05 08:52:36.786139] I [afr-self-heal-common.c:1203:sh_missing_entries_create] 0-gv0-replicate-0: no missing files - /kahadb/db.redo. proceeding to metadata check
[2015-08-05 08:52:36.787147] I [afr-self-heal-common.c:2077:afr_self_heal_completion_cbk] 0-gv0-replicate-0: background  data missing-entry gfid self-heal completed on /kahadb/db.redo
[2015-08-05 08:52:56.948495] I [afr-common.c:1038:afr_launch_self_heal] 0-gv0-replicate-0: background  entry self-heal triggered. path: /kahadb
[2015-08-05 08:52:56.949790] I [afr-self-heal-entry.c:644:afr_sh_entry_expunge_entry_cbk] 0-gv0-replicate-0: missing entry /kahadb/db.free on gv0-client-0
[2015-08-05 08:52:56.952400] E [afr-self-heal-common.c:1054:afr_sh_common_lookup_resp_handler] 0-gv0-replicate-0: path /kahadb/lock on subvolume gv0-client-1 => -1 (No such file or directory)
[2015-08-05 08:52:56.953281] I [afr-self-heal-common.c:2077:afr_self_heal_completion_cbk] 0-gv0-replicate-0: background  entry self-heal completed on /kahadb
[2015-08-05 08:53:37.196481] I [client3_1-fops.c:1025:client3_1_removexattr_cbk] 0-gv0-client-0: remote operation failed: No data available
[2015-08-05 08:53:37.196735] I [client3_1-fops.c:1025:client3_1_removexattr_cbk] 0-gv0-client-1: remote operation failed: No data available
[2015-08-05 08:53:37.196917] W [fuse-bridge.c:850:fuse_err_cbk] 0-glusterfs-fuse: 54284: REMOVEXATTR() /kahadb/db-4.log => -1 (No data available)
[2015-08-05 08:53:37.200487] I [client3_1-fops.c:1025:client3_1_removexattr_cbk] 0-gv0-client-0: remote operation failed: No data available
[2015-08-05 08:53:37.200746] I [client3_1-fops.c:1025:client3_1_removexattr_cbk] 0-gv0-client-1: remote operation failed: No data available
[2015-08-05 08:53:37.200936] W [fuse-bridge.c:850:fuse_err_cbk] 0-glusterfs-fuse: 54291: REMOVEXATTR() /kahadb/db-5.log => -1 (No data available)
[2015-08-05 08:53:48.674314] W [client3_1-fops.c:3655:client3_1_flush] 0-gv0-client-1: (-2161116166): failed to get fd ctx. EBADFD
[2015-08-05 08:53:48.674350] W [client3_1-fops.c:3692:client3_1_flush] 0-gv0-client-1: failed to send the fop: File descriptor in bad state
[2015-08-05 08:53:48.676375] W [client3_1-fops.c:3655:client3_1_flush] 0-gv0-client-1: (-1443019630): failed to get fd ctx. EBADFD
[2015-08-05 08:53:48.676396] W [client3_1-fops.c:3692:client3_1_flush] 0-gv0-client-1: failed to send the fop: File descriptor in bad state
[2015-08-05 08:53:48.762598] W [client3_1-fops.c:4699:client3_1_lk] 0-gv0-client-1: (-1909467425): failed to get fd ctx. EBADFD
[2015-08-05 08:53:48.762662] W [client3_1-fops.c:4751:client3_1_lk] 0-gv0-client-1: failed to send the fop: File descriptor in bad state
[2015-08-05 08:53:48.764122] W [client3_1-fops.c:3655:client3_1_flush] 0-gv0-client-1: (-1909467425): failed to get fd ctx. EBADFD
[2015-08-05 08:53:48.764142] W [client3_1-fops.c:3692:client3_1_flush] 0-gv0-client-1: failed to send the fop: File descriptor in bad state
[2015-08-05 08:54:40.467613] I [afr-self-heal-common.c:705:afr_mark_sources] 0-gv0-replicate-0: split-brain possible, no source detected
[2015-08-05 08:54:40.467839] I [afr-self-heal-common.c:705:afr_mark_sources] 0-gv0-replicate-0: split-brain possible, no source detected
[2015-08-05 08:54:40.467861] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 61809: LOOKUP() /kahadb/db.data => -1 (Input/output error)
[2015-08-05 08:54:40.468151] I [afr-self-heal-common.c:705:afr_mark_sources] 0-gv0-replicate-0: split-brain possible, no source detected
[2015-08-05 08:54:40.468171] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 61811: LOOKUP() /kahadb/db.data => -1 (Input/output error)
[2015-08-05 08:54:40.473764] I [afr-self-heal-common.c:705:afr_mark_sources] 0-gv0-replicate-0: split-brain possible, no source detected
[2015-08-05 08:54:40.473797] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 61812: LOOKUP() /kahadb/db.data => -1 (Input/output error)
[2015-08-05 08:54:40.475351] I [afr-self-heal-common.c:705:afr_mark_sources] 0-gv0-replicate-0: split-brain possible, no source detected
[2015-08-05 08:54:40.475373] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 61819: LOOKUP() /kahadb/db.data => -1 (Input/output error)
-----
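
I can dig out more state if that helps. My understanding is that the split-brain verdict comes from the AFR changelog xattrs on the two bricks disagreeing, so I could compare them on both nodes with something like this (hypothetical invocation, I have not captured this output yet):
-----
# run on each node against its own brick copy of the affected file
sudo getfattr -d -m trusted.afr -e hex /data/brick1/gv0/kahadb/db.data
-----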

