[Gluster-users] [3.11.2] Bricks disconnect from gluster with 0-transport: EPOLLERR

Ben Werthmann ben at apcera.com
Wed Sep 13 19:28:29 UTC 2017


I ran into something like this in 3.10.4 and filed two bugs for it:

https://bugzilla.redhat.com/show_bug.cgi?id=1491059
https://bugzilla.redhat.com/show_bug.cgi?id=1491060

Please see the above bugs for full detail.

In summary, my issue was related to glusterd's handling of pid files
when it starts the self-heal daemon and bricks. The issues are:

a. The brick pid file retains a stale pid and the brick fails to start
when glusterd is started. Pid files are stored in `/var/lib/glusterd`,
which persists across reboots. When glusterd is started (or restarted,
or the host is rebooted) and any running process matches the pid in the
brick pid file, the brick fails to start.

b. The self-heal-daemon pid file retains a stale pid and glusterd
indiscriminately kills that pid when it starts. Pid files are stored in
`/var/lib/glusterd`, which persists across reboots. When glusterd is
started (or restarted, or the host is rebooted), any running process
whose pid matches the pid in the shd pid file is killed.

Due to the nature of these bugs, sometimes bricks/shd will start and
sometimes they will not; restart success may be intermittent. The bugs
are most likely to occur when gluster services were running with low
pids and the host is then rebooted, since reboots tend to group pids
densely in the lower pid range. You might also see them if you have
high pid churn due to short-lived processes.

In the case of the self-heal daemon, you may also see other processes
being terminated "randomly".

Resulting in:

1a. The pid file /var/lib/glusterd/glustershd/run/glustershd.pid remains
after shd is stopped.
2a. glusterd kills whatever process currently holds the pid recorded in
the stale shd pid file.
1b. Brick pid file(s) remain after the brick is stopped.
2b. glusterd fails to start a brick when the pid in its pid file matches
any running process.
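
To see whether pid files have gone stale, a minimal check (just a
sketch, to be run only while all gluster services are stopped) is to
compare each recorded pid against the running processes; anything
still "alive" at that point is an unrelated process that happens to
have reused the pid:

    # run only while all gluster processes are stopped
    find /var/lib/glusterd/ -name '*pid' | while read -r f; do
        pid=$(cat "$f")
        if kill -0 "$pid" 2>/dev/null; then
            echo "$f: pid $pid is alive ($(ps -p "$pid" -o comm=)) - reused by another process"
        else
            echo "$f: pid $pid is not running - stale entry, safe to delete"
        fi
    done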

Workaround:

In our automation, when we stop all gluster processes (reboot,
upgrade, etc.), we make sure every gluster process is actually stopped
and then clean up the pid files with:

    find /var/lib/glusterd/ -name '*pid' -delete

This is not a complete solution, but it covers our most critical
cases. We may develop something more complete if the bugs are not
addressed promptly.
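
For reference, a minimal sketch of that flow, assuming systemd units
and the stock process names (glusterfsd for bricks, glustershd on the
self-heal daemon's command line); adapt it to your environment before
use:

    systemctl stop glusterd                       # stop the management daemon
    pkill glusterfsd 2>/dev/null                  # bricks keep running after glusterd stops
    pkill -f glustershd 2>/dev/null               # shd runs as a glusterfs process
    while pgrep -x glusterfsd >/dev/null; do sleep 1; done   # wait for bricks to exit
    find /var/lib/glusterd/ -name '*pid' -delete  # drop stale pid files
    systemctl start glusterd                      # bricks and shd come back with fresh pid files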




On Sat, Aug 5, 2017 at 11:54 PM, Leonid Isaev <leonid.isaev at jila.colorado.edu> wrote:

> Hi,
>
>         I have a distributed volume which runs on Fedora 26 systems with
> glusterfs 3.11.2 from gluster.org repos:
> ----------
> [root at taupo ~]# glusterd --version
> glusterfs 3.11.2
>
> gluster> volume info gv2
> Volume Name: gv2
> Type: Distribute
> Volume ID: 6b468f43-3857-4506-917c-7eaaaef9b6ee
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 6
> Transport-type: tcp
> Bricks:
> Brick1: kiwi:/srv/gluster/gv2/brick1/gvol
> Brick2: kiwi:/srv/gluster/gv2/brick2/gvol
> Brick3: taupo:/srv/gluster/gv2/brick1/gvol
> Brick4: fox:/srv/gluster/gv2/brick1/gvol
> Brick5: fox:/srv/gluster/gv2/brick2/gvol
> Brick6: logan:/srv/gluster/gv2/brick1/gvol
> Options Reconfigured:
> performance.readdir-ahead: on
> nfs.disable: on
>
> gluster> volume status gv2
> Status of volume: gv2
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick kiwi:/srv/gluster/gv2/brick1/gvol     49152     0          Y       1128
> Brick kiwi:/srv/gluster/gv2/brick2/gvol     49153     0          Y       1134
> Brick taupo:/srv/gluster/gv2/brick1/gvol    N/A       N/A        N       N/A
> Brick fox:/srv/gluster/gv2/brick1/gvol      49152     0          Y       1169
> Brick fox:/srv/gluster/gv2/brick2/gvol      49153     0          Y       1175
> Brick logan:/srv/gluster/gv2/brick1/gvol    49152     0          Y       1003
> ----------
>
> The machine in question is TAUPO which has one brick that refuses to
> connect to
> the cluster. All installations were migrated from glusterfs 3.8.14 on
> Fedora
> 24: I simply rsync'ed /var/lib/glusterd to new systems. On all other
> machines
> glusterd starts fine and all bricks come up. Hence I suspect a race
> condition
> somewhere. The glusterd.log file (attached) shows that the brick connects,
> and
> then suddenly disconnects from the cluster:
> ----------
> [2017-08-06 03:12:38.536409] I [glusterd-utils.c:5468:glusterd_brick_start] 0-management: discovered already-running brick /srv/gluster/gv2/brick1/gvol
> [2017-08-06 03:12:38.536414] I [MSGID: 106143] [glusterd-pmap.c:279:pmap_registry_bind] 0-pmap: adding brick /srv/gluster/gv2/brick1/gvol on port 49153
> [2017-08-06 03:12:38.536427] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
> [2017-08-06 03:12:38.536500] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
> [2017-08-06 03:12:38.536556] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
> [2017-08-06 03:12:38.536616] I [MSGID: 106492] [glusterd-handler.c:2717:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: d5a487e3-4c9b-4e5a-91ff-b8d85fd51da9
> [2017-08-06 03:12:38.584598] I [MSGID: 106502] [glusterd-handler.c:2762:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
> [2017-08-06 03:12:38.599340] I [socket.c:2474:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
> [2017-08-06 03:12:38.613745] I [MSGID: 106005] [glusterd-handler.c:5846:__glusterd_brick_rpc_notify] 0-management: Brick taupo:/srv/gluster/gv2/brick1/gvol has disconnected from glusterd.
> ----------
>
> I checked that cluster.brick-multiplex is off. How can I debug this
> further?
>
> Thanks in advance,
> --
> Leonid Isaev
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>