[Gluster-users] Failed Volume

Fri May 26 13:22:41 UTC 2017

Recently, I had some problems with the OS hard drives in my glusterd servers and took one of my systems down for maintenance. The first step was to remove one of the bricks (brick1) hosted on that server (fs001). The data migration completed successfully last night. I then tried to commit the change, and the commit failed. Since then, glusterd has not started on one of my servers (fs003). When I check the glusterd log on fs003, I see the following errors every time glusterd starts:

[2017-05-26 04:37:21.358932] I [MSGID: 100030] [glusterfsd.c:2318:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6 (args: /usr/sbin/glusterd --pid-file=/var/run/glusterd.pid)
[2017-05-26 04:37:21.382630] I [MSGID: 106478] [glusterd.c:1350:init] 0-management: Maximum allowed open file descriptors set to 65536
[2017-05-26 04:37:21.382712] I [MSGID: 106479] [glusterd.c:1399:init] 0-management: Using /var/lib/glusterd as working directory
[2017-05-26 04:37:21.422858] I [MSGID: 106228] [glusterd.c:433:glusterd_check_gsync_present] 0-glusterd: geo-replication module not installed in the system [No such file or directory]
[2017-05-26 04:37:21.450123] I [MSGID: 106513] [glusterd-store.c:2047:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30706
[2017-05-26 04:37:21.463812] E [MSGID: 101032] [store.c:434:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/vols/hpcscratch/bricks/cri16fs001-ib:-data-brick1-scratch. [No such file or directory]
[2017-05-26 04:37:21.463866] E [MSGID: 106201] [glusterd-store.c:3042:glusterd_store_retrieve_volumes] 0-management: Unable to restore volume: hpcscratch
[2017-05-26 04:37:21.463919] E [MSGID: 101019] [xlator.c:428:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2017-05-26 04:37:21.463943] E [graph.c:322:glusterfs_graph_init] 0-management: initializing translator failed
[2017-05-26 04:37:21.463970] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
[2017-05-26 04:37:21.466703] W [glusterfsd.c:1236:cleanup_and_exit] (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xda) [0x405cba] -->/usr/sbin/glusterd(glusterfs_process_volfp+0x116) [0x405b96] -->/usr/sbin/glusterd(cleanup_and_exit+0x65) [0x4059d5] ) 0-: received signum (0), shutting down

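The volume is distribute-only. It looks to me as though glusterd on fs003 still expects brick1 on fs001 to be part of the volume: the brick file it fails to find under /var/lib/glusterd/vols/hpcscratch/bricks/ is the one for the brick I removed. Is there any way to recover from this? Is there any other information I can provide?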
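To make that suspicion checkable, here is a minimal, read-only sketch that compares the bricks listed in a volume's stored definition with the brick files actually on disk. It rests on assumptions that are not confirmed anywhere in this post: that the volume directory contains an `info` file with `brick-N=<name>` lines, and that each `<name>` matches a file name under `bricks/`. The function name `brick_store_mismatches` is made up for illustration.

```python
import os
import re


def brick_store_mismatches(vol_dir):
    """Return (missing, extra) brick names for one glusterd volume directory.

    missing: listed in <vol_dir>/info but with no file under <vol_dir>/bricks
    extra:   file under <vol_dir>/bricks that is not listed in <vol_dir>/info

    Assumption: info holds lines of the form "brick-N=<name>".
    """
    listed = set()
    with open(os.path.join(vol_dir, "info")) as fh:
        for line in fh:
            match = re.match(r"brick-\d+=(.+)$", line.strip())
            if match:
                listed.add(match.group(1))

    bricks_dir = os.path.join(vol_dir, "bricks")
    present = set(os.listdir(bricks_dir)) if os.path.isdir(bricks_dir) else set()

    # Nothing is modified; the function only reports the differences.
    return sorted(listed - present), sorted(present - listed)
```

On fs003 this would be called as `brick_store_mismatches("/var/lib/glusterd/vols/hpcscratch")`. If my reading of the log is right, the first list should contain `cri16fs001-ib:-data-brick1-scratch`. Running the same check on a healthy peer would show whether the stored volume definitions differ between servers.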

--
Mike Jarsulic