[Gluster-users] [Bugs] Bricks are going offline unable to recover with heal/start force commands

Sanju Rakonde srakonde at redhat.com
Tue Jan 22 08:51:05 UTC 2019


Hi Shaik,

Can you please provide us the complete glusterd and cmd_history logs from all
the nodes in the cluster? Also, please paste the output of the following
commands (from all nodes; a one-line example for collecting them follows the list):
1. gluster --version
2. gluster volume info
3. gluster volume status
4. gluster peer status
5. ps -ax | grep glusterfsd
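
For example (the report file name below is only an illustration), all of the above can be captured on a node in one go with something like:

  ( gluster --version; gluster volume info; gluster volume status; gluster peer status; ps -ax | grep glusterfsd ) > /tmp/gluster-node-report.txt 2>&1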

On Tue, Jan 22, 2019 at 12:47 PM Shaik Salam <shaik.salam at tcs.com> wrote:

> Hi Surya,
>
> It is already a customer setup and we can't redeploy it again.
> I enabled debug for the brick-level log, but nothing is being written to it.
> Can you tell me any other ways to troubleshoot, or other logs to look at?
>
>
> From:        Shaik Salam/HYD/TCS
> To:        "Amar Tumballi Suryanarayan" <atumball at redhat.com>
> Cc:        "gluster-users at gluster.org List" <gluster-users at gluster.org>
> Date:        01/22/2019 12:06 PM
> Subject:        Re: [Bugs] Bricks are going offline unable to recover
> with heal/start force commands
> ------------------------------
>
>
> Hi Surya,
>
> I have enabled DEBUG mode at the brick level, but nothing is being written to the
> brick log.
>
> gluster volume set vol_3442e86b6d994a14de73f1b8c82cf0b8
> diagnostics.brick-log-level DEBUG
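>
> (As a sanity check, assuming the set command was accepted, something like the following should report DEBUG. Note that nothing is written to the brick log at any level if the brick process itself is not running:)
>
> gluster volume get vol_3442e86b6d994a14de73f1b8c82cf0b8 diagnostics.brick-log-level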
>
> sh-4.2# pwd
> /var/log/glusterfs/bricks
>
> sh-4.2# ls -la |grep brick_e15c12cceae12c8ab7782dd57cf5b6c1
> -rw-------. 1 root root       0 Jan 20 02:46
> var-lib-heketi-mounts-vg_d5f17487744584e3652d3ca943b0b91b-brick_e15c12cceae12c8ab7782dd57cf5b6c1-brick.log
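>
> (A 0-byte brick log like this one generally means the brick process has not written anything since the file was created, which usually points to the glusterfsd process for that brick not running at all rather than a logging problem. On this node, something along the lines of
>
> ps -ax | grep glusterfsd | grep brick_e15c12cceae12c8ab7782dd57cf5b6c1
>
> should confirm whether a process exists for that brick; if it does not, the interesting errors will be in glusterd's log rather than in the brick log.)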
>
> BR
> Salam
>
>
>
>
> From:        "Amar Tumballi Suryanarayan" <atumball at redhat.com>
> To:        "Shaik Salam" <shaik.salam at tcs.com>
> Cc:        "gluster-users at gluster.org List" <gluster-users at gluster.org>
> Date:        01/22/2019 11:38 AM
> Subject:        Re: [Bugs] Bricks are going offline unable to recover
> with heal/start force commands
> ------------------------------
>
>
>
> Hi Shaik,
>
> Can you check what is in the brick logs? They are located in
> /var/log/glusterfs/bricks/*.
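>
> For instance (file names will vary with the brick path), something like
>
> ls -l /var/log/glusterfs/bricks/ && tail -n 100 /var/log/glusterfs/bricks/*.log
>
> would show whether the logs exist at all and what their most recent entries are.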
>
> Looks like the samba hooks script failed, but that shouldn't matter in
> this use case.
>
> Also, I see that you are trying to set up heketi to provision volumes,
> which means you may be using gluster in a container use case. If you are
> still in the 'PoC' phase, can you give https://github.com/gluster/gcs a try?
> That makes the deployment and the stack a little simpler.
>
> -Amar
>
>
>
>
> On Tue, Jan 22, 2019 at 11:29 AM Shaik Salam <shaik.salam at tcs.com> wrote:
> Can anyone advise how to recover the bricks, other than with heal/start force,
> based on the events from the logs below?
> Please let me know if any other logs are required.
> Thanks in advance.
>
> BR
> Salam
>
>
>
> From:        Shaik Salam/HYD/TCS
> To:        bugs at gluster.org, gluster-users at gluster.org
> Date:        01/21/2019 10:03 PM
> Subject:        Bricks are going offline unable to recover with
> heal/start force commands
> ------------------------------
>
>
> Hi,
>
> Bricks are offline and we are unable to recover them with the following commands:
>
> gluster volume heal <vol-name>
>
> gluster volume start <vol-name> force
>
> But the bricks are still offline.
>
>
> sh-4.2# gluster volume status vol_3442e86b6d994a14de73f1b8c82cf0b8
> Status of volume: vol_3442e86b6d994a14de73f1b8c82cf0b8
> Gluster process                             TCP Port  RDMA Port  Online
>  Pid
>
> ------------------------------------------------------------------------------
> Brick 192.168.3.6:/var/lib/heketi/mounts/vg
> _ca57f326195c243be2380ce4e42a4191/brick_952
> d75fd193c7209c9a81acbc23a3747/brick         49166     0          Y
> 269
> Brick 192.168.3.5:/var/lib/heketi/mounts/vg
> _d5f17487744584e3652d3ca943b0b91b/brick_e15
> c12cceae12c8ab7782dd57cf5b6c1/brick         N/A       N/A        N
> N/A
> Brick 192.168.3.15:/var/lib/heketi/mounts/v
> g_462ea199185376b03e4b0317363bb88c/brick_17
> 36459d19e8aaa1dcb5a87f48747d04/brick        49173     0          Y
> 225
> Self-heal Daemon on localhost               N/A       N/A        Y
> 45826
> Self-heal Daemon on 192.168.3.6             N/A       N/A        Y
> 65196
> Self-heal Daemon on 192.168.3.15            N/A       N/A        Y
> 52915
>
> Task Status of Volume vol_3442e86b6d994a14de73f1b8c82cf0b8
>
> ------------------------------------------------------------------------------
>
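> (In the status output above, the brick on 192.168.3.5 shows N/A for ports and PID and Online = N, i.e. glusterd has no running glusterfsd process for that brick. The reason it fails to start should be visible in the glusterd log on that node (typically /var/log/glusterfs/glusterd.log); as a rough example,
>
> grep -i brick_e15c12cceae12c8ab7782dd57cf5b6c1 /var/log/glusterfs/glusterd.log | tail -n 50
>
> would show the most recent glusterd messages that mention this brick.)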
>
> We can see the following events when we start the volume with force:
>
> /mgmt/glusterd.so(+0xe2b3a) [0x7fca9e139b3a]
> -->/usr/lib64/glusterfs/4.1.5/xlator/mgmt/glusterd.so(+0xe2605)
> [0x7fca9e139605] -->/lib64/libglusterfs.so.0(runner_log+0x115)
> [0x7fcaa346f0e5] ) 0-management: Ran script:
> /var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh
> --volname=vol_3442e86b6d994a14de73f1b8c82cf0b8 --first=no --version=1
> --volume-op=start --gd-workdir=/var/lib/glusterd
> [2019-01-21 08:22:34.555068] E [run.c:241:runner_log]
> (-->/usr/lib64/glusterfs/4.1.5/xlator/mgmt/glusterd.so(+0xe2b3a)
> [0x7fca9e139b3a]
> -->/usr/lib64/glusterfs/4.1.5/xlator/mgmt/glusterd.so(+0xe2563)
> [0x7fca9e139563] -->/lib64/libglusterfs.so.0(runner_log+0x115)
> [0x7fcaa346f0e5] ) 0-management: Failed to execute script:
> /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh
> --volname=vol_3442e86b6d994a14de73f1b8c82cf0b8 --first=no --version=1
> --volume-op=start --gd-workdir=/var/lib/glusterd
> [2019-01-21 08:22:53.389049] I [MSGID: 106499]
> [glusterd-handler.c:4314:__glusterd_handle_status_volume] 0-management:
> Received status volume req for volume vol_3442e86b6d994a14de73f1b8c82cf0b8
> [2019-01-21 08:23:25.346839] I [MSGID: 106487]
> [glusterd-handler.c:1486:__glusterd_handle_cli_list_friends] 0-glusterd:
> Received cli list req
>
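> (The runner_log entries above are about post-start hook scripts: S29CTDBsetup.sh ran and S30samba-start.sh failed, and as noted earlier a failed samba hook by itself should not keep a brick offline. A simple way to see whether glusterd even attempts to spawn the brick on a 'start force' is to watch the brick log on 192.168.3.5 while re-running the command, for example:
>
> tail -f /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_d5f17487744584e3652d3ca943b0b91b-brick_e15c12cceae12c8ab7782dd57cf5b6c1-brick.log
>
> If the file stays empty across a 'gluster volume start vol_3442e86b6d994a14de73f1b8c82cf0b8 force', the brick process is most likely never being spawned, and glusterd's own log is the next place to look.)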
>
> We can see the following events when we heal the volume:
>
> [2019-01-21 08:20:07.576070] W [rpc-clnt.c:1753:rpc_clnt_submit]
> 0-glusterfs: error returned while attempting to connect to host:(null),
> port:0
> [2019-01-21 08:20:07.580225] I [cli-rpc-ops.c:9182:gf_cli_heal_volume_cbk]
> 0-cli: Received resp to heal volume
> [2019-01-21 08:20:07.580326] I [input.c:31:cli_batch] 0-: Exiting with: -1
> [2019-01-21 08:22:30.423311] I [cli.c:768:main] 0-cli: Started running
> gluster with version 4.1.5
> [2019-01-21 08:22:30.463648] I [MSGID: 101190]
> [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 1
> [2019-01-21 08:22:30.463718] I [socket.c:2632:socket_event_handler]
> 0-transport: EPOLLERR - disconnecting now
> [2019-01-21 08:22:30.463859] W [rpc-clnt.c:1753:rpc_clnt_submit]
> 0-glusterfs: error returned while attempting to connect to host:(null),
> port:0
> [2019-01-21 08:22:33.427710] I [socket.c:2632:socket_event_handler]
> 0-transport: EPOLLERR - disconnecting now
> [2019-01-21 08:22:34.581555] I
> [cli-rpc-ops.c:1472:gf_cli_start_volume_cbk] 0-cli: Received resp to start
> volume
> [2019-01-21 08:22:34.581678] I [input.c:31:cli_batch] 0-: Exiting with: 0
> [2019-01-21 08:22:53.345351] I [cli.c:768:main] 0-cli: Started running
> gluster with version 4.1.5
> [2019-01-21 08:22:53.387992] I [MSGID: 101190]
> [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 1
> [2019-01-21 08:22:53.388059] I [socket.c:2632:socket_event_handler]
> 0-transport: EPOLLERR - disconnecting now
> [2019-01-21 08:22:53.388138] W [rpc-clnt.c:1753:rpc_clnt_submit]
> 0-glusterfs: error returned while attempting to connect to host:(null),
> port:0
> [2019-01-21 08:22:53.394737] I [input.c:31:cli_batch] 0-: Exiting with: 0
> [2019-01-21 08:23:25.304688] I [cli.c:768:main] 0-cli: Started running
> gluster with version 4.1.5
> [2019-01-21 08:23:25.346319] I [MSGID: 101190]
> [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 1
> [2019-01-21 08:23:25.346389] I [socket.c:2632:socket_event_handler]
> 0-transport: EPOLLERR - disconnecting now
> [2019-01-21 08:23:25.346500] W [rpc-clnt.c:1753:rpc_clnt_submit]
> 0-glusterfs: error returned while attempting to connect to host:(null),
> port:0
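>
> (These entries are from the CLI log rather than from glusterd or the bricks: the 'Exiting with: -1' at 08:20:07 suggests the heal command itself returned an error, while the later start and list commands exited with 0. Self-heal cannot repair anything onto a brick whose process is down; until it is back online, a command such as
>
> gluster volume heal vol_3442e86b6d994a14de73f1b8c82cf0b8 info
>
> will only report which entries are pending heal.)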
>
>
>
> Please let us know the steps to recover the bricks.
>
>
> BR
> Salam
>
>
> --
> Amar Tumballi (amarts)



-- 
Thanks,
Sanju