[Gluster-users] Bricks are going offline unable to recover with heal/start force commands

Mon Jan 21 16:33:24 UTC 2019

Hi,

Bricks are offline, and we are unable to recover them with the following commands:

gluster volume heal <vol-name>

gluster volume start <vol-name> force

The bricks are still offline after running both commands:

sh-4.2# gluster volume status vol_3442e86b6d994a14de73f1b8c82cf0b8
Status of volume: vol_3442e86b6d994a14de73f1b8c82cf0b8
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.3.6:/var/lib/heketi/mounts/vg
_ca57f326195c243be2380ce4e42a4191/brick_952
d75fd193c7209c9a81acbc23a3747/brick         49166     0          Y 269
Brick 192.168.3.5:/var/lib/heketi/mounts/vg
_d5f17487744584e3652d3ca943b0b91b/brick_e15
c12cceae12c8ab7782dd57cf5b6c1/brick         N/A       N/A        N N/A
Brick 192.168.3.15:/var/lib/heketi/mounts/v
g_462ea199185376b03e4b0317363bb88c/brick_17
36459d19e8aaa1dcb5a87f48747d04/brick        49173     0          Y 225
Self-heal Daemon on localhost               N/A       N/A        Y 45826
Self-heal Daemon on 192.168.3.6             N/A       N/A        Y 65196
Self-heal Daemon on 192.168.3.15            N/A       N/A        Y 52915

Task Status of Volume vol_3442e86b6d994a14de73f1b8c82cf0b8
------------------------------------------------------------------------------
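
To pick out the offline bricks from output like the above without reading the wrapped rows by eye, a small parser can help. This is only a sketch under stated assumptions: the function name `offline_bricks`, the regular expression, and the idea that a brick row ends with the four columns TCP Port, RDMA Port, Online, and Pid are inferred from the output shown here, not taken from any Gluster tooling. The sample text is copied from the status output above.

```python
import re

# Assumed layout of the last line of a brick row, inferred from the output above:
# <end of brick path>  <TCP port|N/A>  <RDMA port|N/A>  <Y|N>  <pid|N/A>
ROW_END = re.compile(
    r"^(?P<tail>\S*)\s+(?P<tcp>\d+|N/A)\s+(?P<rdma>\d+|N/A)"
    r"\s+(?P<online>[YN])\s+(?P<pid>\d+|N/A)\s*$"
)


def offline_bricks(status_text):
    """Return the brick names (host:/path) whose Online column is 'N'."""
    offline = []
    pending = None  # brick name collected so far; None when not inside a brick row
    for raw in status_text.splitlines():
        line = raw.rstrip()
        if line.startswith("Brick "):
            pending = ""
            line = line[len("Brick "):]
        if pending is None:
            continue  # header, separator, or daemon line
        match = ROW_END.match(line)
        if match:
            name = pending + match.group("tail")
            if match.group("online") == "N":
                offline.append(name)
            pending = None
        else:
            # Long brick paths are wrapped; keep collecting the pieces.
            pending += line.strip()
    return offline


# Two rows copied from the status output in this message.
SAMPLE = """\
Brick 192.168.3.6:/var/lib/heketi/mounts/vg
_ca57f326195c243be2380ce4e42a4191/brick_952
d75fd193c7209c9a81acbc23a3747/brick         49166     0          Y 269
Brick 192.168.3.5:/var/lib/heketi/mounts/vg
_d5f17487744584e3652d3ca943b0b91b/brick_e15
c12cceae12c8ab7782dd57cf5b6c1/brick         N/A       N/A        N N/A
Self-heal Daemon on localhost               N/A       N/A        Y 45826
"""

if __name__ == "__main__":
    for brick in offline_bricks(SAMPLE):
        print(brick)
```

On the sample it reports only the brick on 192.168.3.5, which matches the single row showing N in the Online column.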

We see the following log events when we force-start the volume:

/mgmt/glusterd.so(+0xe2b3a) [0x7fca9e139b3a] 
-->/usr/lib64/glusterfs/4.1.5/xlator/mgmt/glusterd.so(+0xe2605) 
[0x7fca9e139605] -->/lib64/libglusterfs.so.0(runner_log+0x115) 
[0x7fcaa346f0e5] ) 0-management: Ran script: 
/var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh 
--volname=vol_3442e86b6d994a14de73f1b8c82cf0b8 --first=no --version=1 
--volume-op=start --gd-workdir=/var/lib/glusterd
[2019-01-21 08:22:34.555068] E [run.c:241:runner_log] 
(-->/usr/lib64/glusterfs/4.1.5/xlator/mgmt/glusterd.so(+0xe2b3a) 
[0x7fca9e139b3a] 
-->/usr/lib64/glusterfs/4.1.5/xlator/mgmt/glusterd.so(+0xe2563) 
[0x7fca9e139563] -->/lib64/libglusterfs.so.0(runner_log+0x115) 
[0x7fcaa346f0e5] ) 0-management: Failed to execute script: 
/var/lib/glusterd/hooks/1/start/post/S30samba-start.sh 
--volname=vol_3442e86b6d994a14de73f1b8c82cf0b8 --first=no --version=1 
--volume-op=start --gd-workdir=/var/lib/glusterd
[2019-01-21 08:22:53.389049] I [MSGID: 106499] 
[glusterd-handler.c:4314:__glusterd_handle_status_volume] 0-management: 
Received status volume req for volume vol_3442e86b6d994a14de73f1b8c82cf0b8
[2019-01-21 08:23:25.346839] I [MSGID: 106487] 
[glusterd-handler.c:1486:__glusterd_handle_cli_list_friends] 0-glusterd: 
Received cli list req

We see the following log events when we heal the volume:

[2019-01-21 08:20:07.576070] W [rpc-clnt.c:1753:rpc_clnt_submit] 
0-glusterfs: error returned while attempting to connect to host:(null), 
port:0
[2019-01-21 08:20:07.580225] I [cli-rpc-ops.c:9182:gf_cli_heal_volume_cbk] 
0-cli: Received resp to heal volume
[2019-01-21 08:20:07.580326] I [input.c:31:cli_batch] 0-: Exiting with: -1
[2019-01-21 08:22:30.423311] I [cli.c:768:main] 0-cli: Started running 
gluster with version 4.1.5
[2019-01-21 08:22:30.463648] I [MSGID: 101190] 
[event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread 
with index 1
[2019-01-21 08:22:30.463718] I [socket.c:2632:socket_event_handler] 
0-transport: EPOLLERR - disconnecting now
[2019-01-21 08:22:30.463859] W [rpc-clnt.c:1753:rpc_clnt_submit] 
0-glusterfs: error returned while attempting to connect to host:(null), 
port:0
[2019-01-21 08:22:33.427710] I [socket.c:2632:socket_event_handler] 
0-transport: EPOLLERR - disconnecting now
[2019-01-21 08:22:34.581555] I 
[cli-rpc-ops.c:1472:gf_cli_start_volume_cbk] 0-cli: Received resp to start 
volume
[2019-01-21 08:22:34.581678] I [input.c:31:cli_batch] 0-: Exiting with: 0
[2019-01-21 08:22:53.345351] I [cli.c:768:main] 0-cli: Started running 
gluster with version 4.1.5
[2019-01-21 08:22:53.387992] I [MSGID: 101190] 
[event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread 
with index 1
[2019-01-21 08:22:53.388059] I [socket.c:2632:socket_event_handler] 
0-transport: EPOLLERR - disconnecting now
[2019-01-21 08:22:53.388138] W [rpc-clnt.c:1753:rpc_clnt_submit] 
0-glusterfs: error returned while attempting to connect to host:(null), 
port:0
[2019-01-21 08:22:53.394737] I [input.c:31:cli_batch] 0-: Exiting with: 0
[2019-01-21 08:23:25.304688] I [cli.c:768:main] 0-cli: Started running 
gluster with version 4.1.5
[2019-01-21 08:23:25.346319] I [MSGID: 101190] 
[event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread 
with index 1
[2019-01-21 08:23:25.346389] I [socket.c:2632:socket_event_handler] 
0-transport: EPOLLERR - disconnecting now
[2019-01-21 08:23:25.346500] W [rpc-clnt.c:1753:rpc_clnt_submit] 
0-glusterfs: error returned while attempting to connect to host:(null), 
port:0
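
Because the mail client wrapped the log lines, it is hard to see which records are warnings or errors. The sketch below rejoins wrapped records and keeps only the W and E levels. It assumes that every record starts with a bracketed timestamp followed by a single-letter level, as in the excerpts above; the names `parse_records` and `warnings_and_errors` are made up for this illustration and are not part of Gluster.

```python
import re

# Assumed record start, based on the excerpts above: "[YYYY-MM-DD HH:MM:SS.ffffff] L ..."
RECORD_START = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+([A-Z])(?:\s+(.*))?$"
)


def parse_records(log_text):
    """Rejoin wrapped lines into records with time, level and message."""
    records = []
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = RECORD_START.match(line)
        if match:
            records.append({
                "time": match.group(1),
                "level": match.group(2),
                "message": match.group(3) or "",
            })
        elif records:
            # Continuation of the previous record, wrapped by the mail client.
            last = records[-1]
            last["message"] = (last["message"] + " " + line).strip()
        # Fragments before the first timestamp are dropped: their record start is missing.
    return records


def warnings_and_errors(log_text):
    """Keep only records logged at W (warning) or E (error) level."""
    return [r for r in parse_records(log_text) if r["level"] in ("W", "E")]


# Records copied from the CLI log excerpt in this message.
SAMPLE_LOG = """\
[2019-01-21 08:22:30.463859] W [rpc-clnt.c:1753:rpc_clnt_submit]
0-glusterfs: error returned while attempting to connect to host:(null),
port:0
[2019-01-21 08:22:34.581555] I
[cli-rpc-ops.c:1472:gf_cli_start_volume_cbk] 0-cli: Received resp to start
volume
[2019-01-21 08:22:34.581678] I [input.c:31:cli_batch] 0-: Exiting with: 0
"""

if __name__ == "__main__":
    for record in warnings_and_errors(SAMPLE_LOG):
        print(record["time"], record["level"], record["message"])
```

Run against the two excerpts in this message, a filter like this would leave the repeated rpc_clnt_submit warnings and the single E-level record about S30samba-start.sh. It only sorts records by level and does not indicate which of them, if any, relates to the offline brick.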

Please let us know the steps to recover the bricks.

BR
Salam