[Gluster-devel] glusterfsd start twice leading to glustershd can not connect with glusterfsd

Sat Apr 28 08:23:24 UTC 2018

Hi, Atin and experts

>>> That would be a hack. I think we need to first figure out on a successful brick connect why do we receive a connect, followed by a disconnect event and then another connect event. That's where the problem exists. I'm waiting to hear for inputs from our RPC experts on this.

I can answer the above questions now and suppose the root cause of this issue is clear now.

1)     Why receive a connect, followed by a disconnect?

In current implement, the connect,  function glusterd_rpc_create called in glusterd_brick_connect not specify option "transport.socket. ignore-enoent ".

So in connect function is “socket_connect” in socket.c  will use connect instead of connect_loop, in most case it will return ret with -1 and errno=ENOENT, and it will treat as SUCCESS!(attach the code below)

And epoll mechanism will introduce a EPOLLHUP event with a  not really connected socket! So it always trigger a EPOLLHUP handler which will disconnect the socket, and set the start_triggerred flag to faluse for the brick.

This disconnect issue is 100% reproduced, You could check the glusterd.log to confirm.

--------------------------------------------------------------------------------------------------------

                if (ign_enoent) {

                        ret = connect_loop (priv->sock,

                                            SA (&this->peerinfo.sockaddr),

                                            this->peerinfo.sockaddr_len);

                } else {

                        ret = connect (priv->sock,

                                       SA (&this->peerinfo.sockaddr),

                                       this->peerinfo.sockaddr_len);

                }

……

if (ret == -1 && ((errno != EINPROGRESS) && (errno != ENOENT))) {

     …………….

                }

                else {

                        /* reset connect_failed so that any previous attempts

                        state is not carried forward */

                        priv->connect_failed = 0;

                        ret = 0;

                }

--------------------------------------------------------------------------------------------------------

2)     Why glusterfsd start twice ?

On cluster SN, the brick started with the event ACC event which triggered by friend storage node, it will finally called in function glusterd_friend_sm as the following code

--------------------------------------------------------------------------------

                synclock_unlock (&priv->big_lock);

                glusterd_launch_synctask (glusterd_spawn_daemons, NULL);

                synclock_lock (&priv->big_lock);

                glusterd_do_quorum_action ();
---------------------------------------------------------------------------
From the above code, the brick start maybe trigger twice, one is glusterd_spawn_saemons, another is glusterd_do_quorum_action.
And it is protected by the flag “brickinfo->start_triggered”, which used to prevent one brick started twice.

BUT, when the disconnect issue is always be there as analyzed as the above, it will be occasionally destroy the protect mechanism,
which is to say, before the second “brick start” begin, the first “brick start” on same brick has EPOLLHUP event arrived, and the flag “brickinfo->start_triggered” has change to false , then the second “brick start” will treat it as a fresh brick process, which we has find in the log of “glusterd.log”, abstract one of example as the below:

       Line 632: [2018-04-15 18:06:00.277399] I [glusterd-utils.c:5928:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/log/brick
       Line 661: [2018-04-15 18:06:00.764239] I [glusterd-utils.c:5928:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/log/brick

And when this occasional issue happen, the glustershd service will abnormal finally.

So I suppose it is a defect for current implement, and a suggest change solution is one code change in function “socket_connect” in socket.c as the below:

/******
               ign_enoent = dict_get_str_boolean (this->options,
                   "transport.socket.ignore-enoent", _gf_false);
Change to ****/
               ign_enoent = dict_get_str_boolean (this->options,
                   "transport.socket.ignore-enoent", _gf_true);

After test, the disconnect issue will not happen anymore!

@all expert from gluster,  could you please review my investigation as the above, and share your comments?
Also, maybe my change will lead to other issue, so could you please give the right solutions?

Thanks & Best Regards
George

From: Atin Mukherjee [mailto:amukherj at redhat.com]
Sent: Thursday, December 14, 2017 12:37 PM
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com<mailto:cynthia.zhou at nokia-sbell.com>>
Cc: Milind Changire <mchangir at redhat.com<mailto:mchangir at redhat.com>>; Gaurav Yadav <gyadav at redhat.com<mailto:gyadav at redhat.com>>; Mohit Agrawal <moagrawa at redhat.com<mailto:moagrawa at redhat.com>>; Raghavendra Gowdappa <rgowdapp at redhat.com<mailto:rgowdapp at redhat.com>>; Samikshan Bairagya <sbairagy at redhat.com<mailto:sbairagy at redhat.com>>
Subject: Re: glusterfsd start twice leading to glustershd can not connect with glusterfsd

That would be a hack. I think we need to first figure out on a successful brick connect why do we receive a connect, followed by a disconnect event and then another connect event. That's where the problem exists. I'm waiting to hear for inputs from our RPC experts on this.

On Thu, Dec 14, 2017 at 8:20 AM, Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com<mailto:cynthia.zhou at nokia-sbell.com>> wrote:
Hi Mukherjee,
I will try to add some sleep in glusterd_volume_start_glusterfs, before it execute glusterd_brick_connect, to see if this could help to avoid the “unexpected disconnect event”, I think if this works this race condition could be avoided.

Best regards,
Cynthia （周琳）
MBB SM HETRAN SW3 MATRIX
Storage
Mobile: +86 (0)18657188311

From: Zhou, Cynthia (NSB - CN/Hangzhou)
Sent: Wednesday, December 13, 2017 4:18 PM
To: 'Atin Mukherjee' <amukherj at redhat.com<mailto:amukherj at redhat.com>>; Milind Changire <mchangir at redhat.com<mailto:mchangir at redhat.com>>
Cc: Gaurav Yadav <gyadav at redhat.com<mailto:gyadav at redhat.com>>; Mohit Agrawal <moagrawa at redhat.com<mailto:moagrawa at redhat.com>>; Raghavendra Gowdappa <rgowdapp at redhat.com<mailto:rgowdapp at redhat.com>>; Samikshan Bairagya <sbairagy at redhat.com<mailto:sbairagy at redhat.com>>
Subject: RE: glusterfsd start twice leading to glustershd can not connect with glusterfsd

@ Mukherjee
When I check the code , I find glusterd_volume_start_glusterfs will call glusterd_brick_connect, is it possible that the glusterd process try to connect to glusterfsd process too early so that it will receive disconnect event?

Best regards,
Cynthia （周琳）
MBB SM HETRAN SW3 MATRIX
Storage
Mobile: +86 (0)18657188311

From: Atin Mukherjee [mailto:amukherj at redhat.com<mailto:amukherj at redhat.com>]
Sent: Tuesday, December 12, 2017 9:00 PM
To: Milind Changire <mchangir at redhat.com<mailto:mchangir at redhat.com>>
Cc: Gaurav Yadav <gyadav at redhat.com<mailto:gyadav at redhat.com>>; Mohit Agrawal <moagrawa at redhat.com<mailto:moagrawa at redhat.com>>; Raghavendra Gowdappa <rgowdapp at redhat.com<mailto:rgowdapp at redhat.com>>; Samikshan Bairagya <sbairagy at redhat.com<mailto:sbairagy at redhat.com>>; Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com<mailto:cynthia.zhou at nokia-sbell.com>>
Subject: Re: glusterfsd start twice leading to glustershd can not connect with glusterfsd

On Tue, 12 Dec 2017 at 17:43, Milind Changire <mchangir at redhat.com<mailto:mchangir at redhat.com>> wrote:
@Cynthia:
What is the command(s) used to restart gluster processes ?

pkill gluster; glusterd is what I did to simulate this.

On Tue, Dec 12, 2017 at 3:25 PM, Atin Mukherjee <amukherj at redhat.com<mailto:amukherj at redhat.com>> wrote:
Looks like we have a race here. I can see a brick disconnect (RPC_CLNT_DISCONNECT) event when the brick process was coming up.
[2017-12-09 18:07:34.488767] I [socket.c:2478:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-12-09 18:07:34.489020] I [MSGID: 106005] [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management: Brick sn-0.local:/mnt/bricks/export/brick has disconnected from glusterd.
[2017-12-09 18:07:34.489450] I [socket.c:2478:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-12-09 18:07:34.489717] I [MSGID: 106005] [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management: Brick sn-0.local:/mnt/bricks/log/brick has disconnected from glusterd.
[2017-12-09 18:07:34.490114] I [socket.c:2478:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-12-09 18:07:34.490398] I [MSGID: 106005] [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management: Brick sn-0.local:/mnt/bricks/mstate/brick has disconnected from glusterd.
[2017-12-09 18:07:34.490785] I [socket.c:2478:socket_event_handler] 0-transport: EPOLLERR - disconnecting now

What I can see is every time when I restart all the gluster process (simulating a reboot scenario) I could see respective brick disconnect events being received at glusterd.

#0  __glusterd_brick_rpc_notify (rpc=rpc at entry=0x7fffe8008810, mydata=mydata at entry=0x7fffe800e0d0,
    event=event at entry=RPC_CLNT_DISCONNECT, data=data at entry=0x0) at glusterd-handler.c:6007
#1  0x00007ffff2936169 in glusterd_big_locked_notify (rpc=0x7fffe8008810, mydata=0x7fffe800e0d0, event=RPC_CLNT_DISCONNECT,
    data=0x0, notify_fn=0x7ffff2932440 <__glusterd_brick_rpc_notify>) at glusterd-handler.c:68
#2  0x00007ffff78c0343 in rpc_clnt_handle_disconnect (conn=0x7fffe8008840, clnt=0x7fffe8008810) at rpc-clnt.c:876
#3  rpc_clnt_notify (trans=0x7fffe8008ae0, mydata=0x7fffe8008840, event=RPC_TRANSPORT_DISCONNECT, data=<optimized out>)
    at rpc-clnt.c:939
#4  0x00007ffff78bca73 in rpc_transport_notify (this=this at entry=0x7fffe8008ae0, event=event at entry=RPC_TRANSPORT_DISCONNECT,
    data=data at entry=0x7fffe8008ae0) at rpc-transport.c:545
#5  0x00007fffefc252bf in socket_event_poll_err (idx=<optimized out>, gen=<optimized out>, this=0x7fffe8008ae0) at socket.c:1210
#6  socket_event_handler (fd=9, idx=<optimized out>, gen=<optimized out>, data=0x7fffe8008ae0, poll_in=0, poll_out=<optimized out>,
    poll_err=16) at socket.c:2488
#7  0x00007ffff7b5097c in event_dispatch_epoll_handler (event=0x7fffee9dce84, event_pool=0x64f040) at event-epoll.c:583
#8  event_dispatch_epoll_worker (data=0x70b820) at event-epoll.c:659
#9  0x00007ffff693f36d in start_thread () from /lib64/libpthread.so.0
#10 0x00007ffff61ecbbf in clone () from /lib64/libc.so.6
The question what I have for RPC team is - why do we see a disconnect event in this case?  Any pointers, Mohit/Raghu/Milind?
The side effect of this disconnect event is what caused this issue. In __glusterd_brick_rpc_notify we set the brickinfo->start_triggered to false. So when two glusterd_brick_start threads were racing with each other, in between we got a disconnect because of which this flag got reset and we ended up trying to spawn the same brick twice.

On Tue, Dec 12, 2017 at 11:05 AM, Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com<mailto:cynthia.zhou at nokia-sbell.com>> wrote:
Ok. When this happened.

gluster volume info mstate
Volume Name: mstate
Type: Replicate
Volume ID: 5fc29728-b5de-4b5f-8f3c-9654acda2577
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: sn-0.local:/mnt/bricks/mstate/brick
Brick2: sn-1.local:/mnt/bricks/mstate/brick
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
cluster.server-quorum-type: none
cluster.consistent-metadata: on
server.allow-insecure: on
network.ping-timeout: 42
performance.client-io-threads: off
cluster.server-quorum-ratio: 51%
[root at sn-0:/root]

Also the ps result:

ps aux | grep gluster
root       380  0.0  0.6 377892 13760 ?        Ss   20:07   0:00 python /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
root       409  0.0  0.4  82964  9148 ?        S    20:07   0:00 python /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
root       811  0.5  1.0 528720 21636 ?        Ssl  20:07   0:01 /usr/sbin/glusterd --volfile=/opt/nokia/libexec/StorageUtils/etc/glusterd/glusterd.vol -p /run/glusterd.pid
root      1088  0.1  0.6 1036648 13996 ?       Ssl  20:07   0:00 /usr/sbin/glusterfsd -s sn-0.local --volfile-id export.sn-0.local.mnt-bricks-export-brick -p /var/run/gluster/vols/export/sn-0.local-mnt-bricks-export-brick.pid -S /var/run/gluster/44f098f14fe45b74b61c5a1d98993d15.socket --brick-name /mnt/bricks/export/brick -l /var/log/glusterfs/bricks/mnt-bricks-export-brick.log --xlator-option *-posix.glusterd-uuid=f830d809-c935-451f-853a-1364f76a4160 --brick-port 49152 --xlator-option export-server.listen-port=49152 --xlator-option transport.socket.bind-address=sn-0.local
root      1153  1.8  0.7 1234036 14664 ?       Ssl  20:07   0:06 /usr/sbin/glusterfsd -s sn-0.local --volfile-id log.sn-0.local.mnt-bricks-log-brick -p /var/run/gluster/vols/log/sn-0.local-mnt-bricks-log-brick.pid -S /var/run/gluster/0af76d9541921cccad34e5fd3041a44d.socket --brick-name /mnt/bricks/log/brick -l /var/log/glusterfs/bricks/mnt-bricks-log-brick.log --xlator-option *-posix.glusterd-uuid=f830d809-c935-451f-853a-1364f76a4160 --brick-port 49156 --xlator-option log-server.listen-port=49156 --xlator-option transport.socket.bind-address=sn-0.local
root      1155  0.0  0.5 772996 10580 ?        Ssl  20:07   0:00 /usr/sbin/glusterfsd -s sn-0.local --volfile-id mstate.sn-0.local.mnt-bricks-mstate-brick -p /var/run/gluster/vols/mstate/sn-0.local-mnt-bricks-mstate-brick.pid -S /var/run/gluster/79d00394b1085adcf7dc1fd4c57f71bc.socket --brick-name /mnt/bricks/mstate/brick -l /var/log/glusterfs/bricks/mnt-bricks-mstate-brick.log --xlator-option *-posix.glusterd-uuid=f830d809-c935-451f-853a-1364f76a4160 --brick-port 49154 --xlator-option mstate-server.listen-port=49154 --xlator-option transport.socket.bind-address=sn-0.local
root      1167  0.4  0.7 1235512 15488 ?       Ssl  20:07   0:01 /usr/sbin/glusterfsd -s sn-0.local --volfile-id services.sn-0.local.mnt-bricks-services-brick -p /var/run/gluster/vols/services/sn-0.local-mnt-bricks-services-brick.pid -S /var/run/gluster/fc16d9fe6fd0695555a5592343da7ba3.socket --brick-name /mnt/bricks/services/brick -l /var/log/glusterfs/bricks/mnt-bricks-services-brick.log --xlator-option *-posix.glusterd-uuid=f830d809-c935-451f-853a-1364f76a4160 --brick-port 49155 --xlator-option services-server.listen-port=49155 --xlator-option transport.socket.bind-address=sn-0.local
root      1188  0.0  0.5 644916 11896 ?        Ssl  20:07   0:00 /usr/sbin/glusterfs -s sn-0.local --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/c405d28a3a79d88436a826d75cf7153a.socket --xlator-option *replicate*.node-uuid=f830d809-c935-451f-853a-1364f76a4160
root      2324  0.0  0.6 629784 13932 ?        Ssl  20:07   0:00 /usr/sbin/glusterfs --acl --volfile-server=sn-0.local --volfile-server=sn-1.local --volfile-id=/services /mnt/services
root      2422  0.0  0.5 431036 10604 ?        Ssl  20:07   0:00 /usr/sbin/glusterfs --acl --volfile-server=sn-0.local --volfile-server=sn-1.local --volfile-id=/export /mnt/export
root      2520  0.0  0.5 498712 11024 ?        Ssl  20:07   0:00 /usr/sbin/glusterfs --acl --volfile-server=sn-0.local --volfile-server=sn-1.local --volfile-id=/log /mnt/log
root      2618  0.0  0.5 431036 10476 ?        Ssl  20:07   0:00 /usr/sbin/glusterfs --acl --volfile-server=sn-0.local --volfile-server=sn-1.local --volfile-id=/mstate /mnt/mstate
root      8136  0.0  0.0   8224   784 pts/0    S+   20:13   0:00 grep gluster
[root at sn-0:/root]
#

Best regards,
Cynthia （周琳）
MBB SM HETRAN SW3 MATRIX
Storage
Mobile: +86 (0)18657188311

From: Atin Mukherjee [mailto:amukherj at redhat.com<mailto:amukherj at redhat.com>]
Sent: Tuesday, December 12, 2017 1:32 PM
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com<mailto:cynthia.zhou at nokia-sbell.com>>
Cc: Gaurav Yadav <gyadav at redhat.com<mailto:gyadav at redhat.com>>; Samikshan Bairagya <sbairagy at redhat.com<mailto:sbairagy at redhat.com>>
Subject: Re: glusterfsd start twice leading to glustershd can not connect with glusterfsd

I was looking for output of gluster volume info to understand what all volume tunables were turned on.

On Tue, Dec 12, 2017 at 10:56 AM, Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com<mailto:cynthia.zhou at nokia-sbell.com>> wrote:
Hi,
From our test case when this issue reproduced:
gluster volume status mstate
Status of volume: mstate
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick sn-0.local:/mnt/bricks/mstate/brick   49157     0          Y       1155
Brick sn-1.local:/mnt/bricks/mstate/brick   49154     0          Y       1201
Self-heal Daemon on localhost               N/A       N/A        Y       1188
Self-heal Daemon on sn-2.local              N/A       N/A        Y       27109
Self-heal Daemon on sn-1.local              N/A       N/A        Y       4836

Task Status of Volume mstate
------------------------------------------------------------------------------
There are no active volume tasks
[root at sn-0:/root]
#

Best regards,
Cynthia （周琳）
MBB SM HETRAN SW3 MATRIX
Storage
Mobile: +86 (0)18657188311

From: Atin Mukherjee [mailto:amukherj at redhat.com<mailto:amukherj at redhat.com>]
Sent: Tuesday, December 12, 2017 12:02 PM
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com<mailto:cynthia.zhou at nokia-sbell.com>>
Cc: Gaurav Yadav <gyadav at redhat.com<mailto:gyadav at redhat.com>>; Samikshan Bairagya <sbairagy at redhat.com<mailto:sbairagy at redhat.com>>
Subject: Re: glusterfsd start twice leading to glustershd can not connect with glusterfsd

On Tue, Dec 12, 2017 at 8:34 AM, Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com<mailto:cynthia.zhou at nokia-sbell.com>> wrote:
By the way , when I check the code I find that in
glusterd_op_perform_add_bricks and glusterd_start_volume, glusterd_brick_start is not triggered within pthread_mutex_lock, do you think we could add pthread_mutex_lock?

I don't think that's needed as these functions are triggered through CLI command. We get into this race when the brick start is triggered asynchronously from a restart glusterd code path. Do you have the volume info output when this issue occurred?

Best regards,
Cynthia （周琳）
MBB SM HETRAN SW3 MATRIX
Storage
Mobile: +86 (0)18657188311

From: Zhou, Cynthia (NSB - CN/Hangzhou)
Sent: Tuesday, December 12, 2017 9:23 AM
To: 'Gaurav Yadav' <gyadav at redhat.com<mailto:gyadav at redhat.com>>; Atin Mukherjee <amukherj at redhat.com<mailto:amukherj at redhat.com>>
Cc: Samikshan Bairagya <sbairagy at redhat.com<mailto:sbairagy at redhat.com>>
Subject: RE: glusterfsd start twice leading to glustershd can not connect with glusterfsd

HI,
Thanks for responding!
My testing version has include the correction of bug-1506589, I’ve also checked my gluster source code. so this is a new issue.
Unfortunately this issue rarely appear, It only appeared once until now. it is found after reboot sn node.
Enclosed is the content in /var/lib/glusterd and /var/log/glusterfs

The time when this issue happened is , only mstate volume has this issue.

[2017-12-09 18:07:02.743046] W [glusterfsd.c:1375:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7454) [0x7f5df4db6454] -->/usr/sbin/glusterfsd(glusterfs_sigwaiter+0xdc) [0x40a50c] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x77) [0x4089ec] ) 0-: received signum (15), shutting down
[2017-12-09 18:07:34.416169] I [MSGID: 100030] [glusterfsd.c:2524:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.12.3 (args: /usr/sbin/glusterfsd -s sn-0.local --volfile-id mstate.sn-0.local.mnt-bricks-mstate-brick -p /var/run/gluster/vols/mstate/sn-0.local-mnt-bricks-mstate-brick.pid -S /var/run/gluster/79d00394b1085adcf7dc1fd4c57f71bc.socket --brick-name /mnt/bricks/mstate/brick -l /var/log/glusterfs/bricks/mnt-bricks-mstate-brick.log --xlator-option *-posix.glusterd-uuid=f830d809-c935-451f-853a-1364f76a4160 --brick-port 49154 --xlator-option mstate-server.listen-port=49154 --xlator-option transport.socket.bind-address=sn-0.local)
[2017-12-09 18:07:34.511598] I [MSGID: 100030] [glusterfsd.c:2524:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.12.3 (args: /usr/sbin/glusterfsd -s sn-0.local --volfile-id mstate.sn-0.local.mnt-bricks-mstate-brick -p /var/run/gluster/vols/mstate/sn-0.local-mnt-bricks-mstate-brick.pid -S /var/run/gluster/79d00394b1085adcf7dc1fd4c57f71bc.socket --brick-name /mnt/bricks/mstate/brick -l /var/log/glusterfs/bricks/mnt-bricks-mstate-brick.log --xlator-option *-posix.glusterd-uuid=f830d809-c935-451f-853a-1364f76a4160 --brick-port 49157 --xlator-option mstate-server.listen-port=49157 --xlator-option transport.socket.bind-address=sn-0.local)
Best regards,
Cynthia （周琳）
MBB SM HETRAN SW3 MATRIX
Storage
Mobile: +86 (0)18657188311

From: Gaurav Yadav [mailto:gyadav at redhat.com]
Sent: Monday, December 11, 2017 6:58 PM
To: Atin Mukherjee <amukherj at redhat.com<mailto:amukherj at redhat.com>>
Cc: Samikshan Bairagya <sbairagy at redhat.com<mailto:sbairagy at redhat.com>>; Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com<mailto:cynthia.zhou at nokia-sbell.com>>
Subject: Re: glusterfsd start twice leading to glustershd can not connect with glusterfsd

After seeing logs which you have shared here, it seems it's a port mismatch issue.
We had a similar issue bug-1506589,<https://bugzilla.redhat.com/show_bug.cgi?id=1506589> "Please confirm that you too are seeing same behavior".

If not please share setup details along with..
* files from /var/lib/glusterd/ directory,
* log files from /var/log/glusterfs

Thanks
Gaurav

On Mon, Dec 11, 2017 at 3:16 PM, Atin Mukherjee <amukherj at redhat.com<mailto:amukherj at redhat.com>> wrote:
Adding couple of my colleagues to the thread since I will be busy through out this week. If this is easily reproducible then could you share the exact steps?

On Mon, 11 Dec 2017 at 14:56, Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com<mailto:cynthia.zhou at nokia-sbell.com>> wrote:
Hi,
        Good day!
        I see your change I292f1e58d6971e111725e1baea1fe98b890b43e2 in glusterfs code.
        However from my test glusterfs 3.12.3 (already contain that modification) still has something wrong,

gluster volume status mstate
Status of volume: mstate
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick sn-0.local:/mnt/bricks/mstate/brick   49157     0          Y       1155
Brick sn-1.local:/mnt/bricks/mstate/brick   49154     0          Y       1201
Self-heal Daemon on localhost               N/A       N/A        Y       1188
Self-heal Daemon on sn-2.local              N/A       N/A        Y       27109
Self-heal Daemon on sn-1.local              N/A       N/A        Y       4836

Task Status of Volume mstate
------------------------------------------------------------------------------
There are no active volume tasks

But ps shows
ps aux | grep gluster
root       380  0.0  0.6 377892 13760 ?        Ss   20:07   0:00 python /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
root       409  0.0  0.4  82964  9148 ?        S    20:07   0:00 python /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
root       811  0.5  1.0 528720 21636 ?        Ssl  20:07   0:01 /usr/sbin/glusterd --volfile=/opt/nokia/libexec/StorageUtils/etc/glusterd/glusterd.vol -p /run/glusterd.pid
root      1088  0.1  0.6 1036648 13996 ?       Ssl  20:07   0:00 /usr/sbin/glusterfsd -s sn-0.local --volfile-id export.sn-0.local.mnt-bricks-export-brick -p /var/run/gluster/vols/export/sn-0.local-mnt-bricks-export-brick.pid -S /var/run/gluster/44f098f14fe45b74b61c5a1d98993d15.socket --brick-name /mnt/bricks/export/brick -l /var/log/glusterfs/bricks/mnt-bricks-export-brick.log --xlator-option *-posix.glusterd-uuid=f830d809-c935-451f-853a-1364f76a4160 --brick-port 49152 --xlator-option export-server.listen-port=49152 --xlator-option transport.socket.bind-address=sn-0.local
root      1153  1.8  0.7 1234036 14664 ?       Ssl  20:07   0:06 /usr/sbin/glusterfsd -s sn-0.local --volfile-id log.sn-0.local.mnt-bricks-log-brick -p /var/run/gluster/vols/log/sn-0.local-mnt-bricks-log-brick.pid -S /var/run/gluster/0af76d9541921cccad34e5fd3041a44d.socket --brick-name /mnt/bricks/log/brick -l /var/log/glusterfs/bricks/mnt-bricks-log-brick.log --xlator-option *-posix.glusterd-uuid=f830d809-c935-451f-853a-1364f76a4160 --brick-port 49156 --xlator-option log-server.listen-port=49156 --xlator-option transport.socket.bind-address=sn-0.local
root      1155  0.0  0.5 772996 10580 ?        Ssl  20:07   0:00 /usr/sbin/glusterfsd -s sn-0.local --volfile-id mstate.sn-0.local.mnt-bricks-mstate-brick -p /var/run/gluster/vols/mstate/sn-0.local-mnt-bricks-mstate-brick.pid -S /var/run/gluster/79d00394b1085adcf7dc1fd4c57f71bc.socket --brick-name /mnt/bricks/mstate/brick -l /var/log/glusterfs/bricks/mnt-bricks-mstate-brick.log --xlator-option *-posix.glusterd-uuid=f830d809-c935-451f-853a-1364f76a4160 --brick-port 49154 --xlator-option mstate-server.listen-port=49154 --xlator-option transport.socket.bind-address=sn-0.local
root      1167  0.4  0.7 1235512 15488 ?       Ssl  20:07   0:01 /usr/sbin/glusterfsd -s sn-0.local --volfile-id services.sn-0.local.mnt-bricks-services-brick -p /var/run/gluster/vols/services/sn-0.local-mnt-bricks-services-brick.pid -S /var/run/gluster/fc16d9fe6fd0695555a5592343da7ba3.socket --brick-name /mnt/bricks/services/brick -l /var/log/glusterfs/bricks/mnt-bricks-services-brick.log --xlator-option *-posix.glusterd-uuid=f830d809-c935-451f-853a-1364f76a4160 --brick-port 49155 --xlator-option services-server.listen-port=49155 --xlator-option transport.socket.bind-address=sn-0.local
root      1188  0.0  0.5 644916 11896 ?        Ssl  20:07   0:00 /usr/sbin/glusterfs -s sn-0.local --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/c405d28a3a79d88436a826d75cf7153a.socket --xlator-option *replicate*.node-uuid=f830d809-c935-451f-853a-1364f76a4160
root      2324  0.0  0.6 629784 13932 ?        Ssl  20:07   0:00 /usr/sbin/glusterfs --acl --volfile-server=sn-0.local --volfile-server=sn-1.local --volfile-id=/services /mnt/services
root      2422  0.0  0.5 431036 10604 ?        Ssl  20:07   0:00 /usr/sbin/glusterfs --acl --volfile-server=sn-0.local --volfile-server=sn-1.local --volfile-id=/export /mnt/export
root      2520  0.0  0.5 498712 11024 ?        Ssl  20:07   0:00 /usr/sbin/glusterfs --acl --volfile-server=sn-0.local --volfile-server=sn-1.local --volfile-id=/log /mnt/log
root      2618  0.0  0.5 431036 10476 ?        Ssl  20:07   0:00 /usr/sbin/glusterfs --acl --volfile-server=sn-0.local --volfile-server=sn-1.local --volfile-id=/mstate /mnt/mstate
root      8136  0.0  0.0   8224   784 pts/0    S+   20:13   0:00 grep gluster
[root at sn-0:/root]

So glustershd can not connect with mstate brick on sn-0:

gluster volume heal mstate info
Brick sn-0.local:/mnt/bricks/mstate/brick
Status: Transport endpoint is not connected
Number of entries: -

Brick sn-1.local:/mnt/bricks/mstate/brick
Status: Connected
Number of entries: 0

From sn-0 glusterd log I find:
[2017-12-09 18:07:34.366851] I [glusterd-utils.c:5926:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/export/brick
[2017-12-09 18:07:34.368484] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-12-09 18:07:34.406488] I [glusterd-utils.c:5926:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/log/brick
[2017-12-09 18:07:34.410384] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-12-09 18:07:34.410530] I [glusterd-utils.c:5926:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/mstate/brick
[2017-12-09 18:07:34.413819] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-12-09 18:07:34.413926] C [MSGID: 106003] [glusterd-server-quorum.c:352:glusterd_do_volume_quorum_action] 0-management: Server quorum regained for volume services. Starting local bricks.
[2017-12-09 18:07:34.414031] I [glusterd-utils.c:5926:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/services/brick
[2017-12-09 18:07:34.417573] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-12-09 18:07:34.462163] I [MSGID: 106492] [glusterd-handler.c:2718:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: e4ea0550-873b-410c-a29e-3901c4737cd8
[2017-12-09 18:07:34.470228] I [MSGID: 106502] [glusterd-handler.c:2763:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2017-12-09 18:07:34.470432] I [MSGID: 106492] [glusterd-handler.c:2718:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: feddd74c-c768-4157-b152-31d5dd488dcc
[2017-12-09 18:07:34.470493] I [MSGID: 106502] [glusterd-handler.c:2763:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2017-12-09 18:07:34.479540] I [MSGID: 106163] [glusterd-handshake.c:1316:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 31202
[2017-12-09 18:07:34.488767] I [socket.c:2478:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-12-09 18:07:34.489020] I [MSGID: 106005] [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management: Brick sn-0.local:/mnt/bricks/export/brick has disconnected from glusterd.
[2017-12-09 18:07:34.489450] I [socket.c:2478:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-12-09 18:07:34.489717] I [MSGID: 106005] [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management: Brick sn-0.local:/mnt/bricks/log/brick has disconnected from glusterd.
[2017-12-09 18:07:34.490114] I [socket.c:2478:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-12-09 18:07:34.490398] I [MSGID: 106005] [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management: Brick sn-0.local:/mnt/bricks/mstate/brick has disconnected from glusterd.
[2017-12-09 18:07:34.490785] I [socket.c:2478:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-12-09 18:07:34.491049] I [MSGID: 106005] [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management: Brick sn-0.local:/mnt/bricks/services/brick has disconnected from glusterd.
[2017-12-09 18:07:34.491514] I [MSGID: 106493] [glusterd-rpc-ops.c:486:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: feddd74c-c768-4157-b152-31d5dd488dcc, host: sn-1.local, port: 0
[2017-12-09 18:07:34.501902] I [glusterd-utils.c:5926:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/log/brick
[2017-12-09 18:07:34.503376] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-12-09 18:07:34.503502] I [glusterd-utils.c:5926:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/mstate/brick
[2017-12-09 18:07:34.506655] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-12-09 18:07:34.507887] I [socket.c:2478:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-12-09 18:07:34.508188] I [MSGID: 106005] [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management: Brick sn-0.local:/mnt/bricks/log/brick has disconnected from glusterd.
[2017-12-09 18:07:34.509479] I [socket.c:2478:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
……
[2017-12-09 18:07:36.023087] I [glusterd-utils.c:5834:glusterd_brick_start] 0-management: discovered already-running brick /mnt/bricks/export/brick
[2017-12-09 18:07:36.023106] I [MSGID: 106143] [glusterd-pmap.c:280:pmap_registry_bind] 0-pmap: adding brick /mnt/bricks/export/brick on port 49152
[2017-12-09 18:07:36.060419] I [glusterd-utils.c:5834:glusterd_brick_start] 0-management: discovered already-running brick /mnt/bricks/log/brick
[2017-12-09 18:07:36.060442] I [MSGID: 106143] [glusterd-pmap.c:280:pmap_registry_bind] 0-pmap: adding brick /mnt/bricks/log/brick on port 49156
[2017-12-09 18:07:36.094987] I [glusterd-utils.c:5834:glusterd_brick_start] 0-management: discovered already-running brick /mnt/bricks/mstate/brick
[2017-12-09 18:07:36.095008] I [MSGID: 106143] [glusterd-pmap.c:280:pmap_registry_bind] 0-pmap: adding brick /mnt/bricks/mstate/brick on port 49157
[2017-12-09 18:07:36.129059] I [glusterd-utils.c:5834:glusterd_brick_start] 0-management: discovered already-running brick /mnt/bricks/services/brick
[2017-12-09 18:07:36.129108] I [MSGID: 106143] [glusterd-pmap.c:280:pmap_registry_bind] 0-pmap: adding brick /mnt/bricks/services/brick on port 49155
[2017-12-09 18:07:36.159408] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600

……
[2017-12-09 18:07:36.919192] I [MSGID: 106493] [glusterd-rpc-ops.c:701:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: e4ea0550-873b-410c-a29e-3901c4737cd8
[2017-12-09 18:07:36.920269] I [MSGID: 106492] [glusterd-handler.c:2718:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: e4ea0550-873b-410c-a29e-3901c4737cd8
[2017-12-09 18:07:36.927413] I [MSGID: 106502] [glusterd-handler.c:2763:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2017-12-09 18:07:36.949762] I [MSGID: 106493] [glusterd-rpc-ops.c:701:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: feddd74c-c768-4157-b152-31d5dd488dcc
[2017-12-09 18:07:36.949817] I [MSGID: 106492] [glusterd-handler.c:2718:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: feddd74c-c768-4157-b152-31d5dd488dcc
[2017-12-09 18:07:36.949882] I [MSGID: 106502] [glusterd-handler.c:2763:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2017-12-09 18:07:37.768326] W [MSGID: 101095] [xlator.c:213:xlator_dynload] 0-xlator: /usr/lib64/glusterfs/3.12.3/xlator/nfs/server.so: cannot open shared object file: No such file or directory
[2017-12-09 18:07:37.783811] I [MSGID: 106143] [glusterd-pmap.c:280:pmap_registry_bind] 0-pmap: adding brick /mnt/bricks/services/brick on port 49155
[2017-12-09 18:07:37.783901] I [MSGID: 106143] [glusterd-pmap.c:280:pmap_registry_bind] 0-pmap: adding brick /mnt/bricks/mstate/brick on port 49154
[2017-12-09 18:07:37.783966] I [MSGID: 106143] [glusterd-pmap.c:280:pmap_registry_bind] 0-pmap: adding brick /mnt/bricks/log/brick on port 49156
[2017-12-09 18:07:37.784032] I [MSGID: 106143] [glusterd-pmap.c:280:pmap_registry_bind] 0-pmap: adding brick /mnt/bricks/export/brick on port 49152
[2017-12-09 18:07:37.810305] W [MSGID: 101095] [xlator.c:162:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/3.12.3/xlator/nfs/server.so: cannot open shared object file: No such file or directory

And the mstate brick process on sn-0 shows:

[2017-12-09 18:07:02.743046] W [glusterfsd.c:1375:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7454) [0x7f5df4db6454] -->/usr/sbin/glusterfsd(glusterfs_sigwaiter+0xdc) [0x40a50c] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x77) [0x4089ec] ) 0-: received signum (15), shutting down
[2017-12-09 18:07:34.416169] I [MSGID: 100030] [glusterfsd.c:2524:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.12.3 (args: /usr/sbin/glusterfsd -s sn-0.local --volfile-id mstate.sn-0.local.mnt-bricks-mstate-brick -p /var/run/gluster/vols/mstate/sn-0.local-mnt-bricks-mstate-brick.pid -S /var/run/gluster/79d00394b1085adcf7dc1fd4c57f71bc.socket --brick-name /mnt/bricks/mstate/brick -l /var/log/glusterfs/bricks/mnt-bricks-mstate-brick.log --xlator-option *-posix.glusterd-uuid=f830d809-c935-451f-853a-1364f76a4160 --brick-port 49154 --xlator-option mstate-server.listen-port=49154 --xlator-option transport.socket.bind-address=sn-0.local)
[2017-12-09 18:07:34.511598] I [MSGID: 100030] [glusterfsd.c:2524:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.12.3 (args: /usr/sbin/glusterfsd -s sn-0.local --volfile-id mstate.sn-0.local.mnt-bricks-mstate-brick -p /var/run/gluster/vols/mstate/sn-0.local-mnt-bricks-mstate-brick.pid -S /var/run/gluster/79d00394b1085adcf7dc1fd4c57f71bc.socket --brick-name /mnt/bricks/mstate/brick -l /var/log/glusterfs/bricks/mnt-bricks-mstate-brick.log --xlator-option *-posix.glusterd-uuid=f830d809-c935-451f-853a-1364f76a4160 --brick-port 49157 --xlator-option mstate-server.listen-port=49157 --xlator-option transport.socket.bind-address=sn-0.local)
[2017-12-09 18:07:34.542268] E [MSGID: 100018] [glusterfsd.c:2109:glusterfs_pidfile_update] 0-glusterfsd: pidfile /var/run/gluster/vols/mstate/sn-0.local-mnt-bricks-mstate-brick.pid lock failed [Resource temporarily unavailable]
[2017-12-09 18:07:34.549242] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2017-12-09 18:07:34.550119] I [MSGID: 101190] [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-12-09 18:07:37.690037] I [rpcsvc.c:2322:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2017-12-09 18:07:37.690125] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-mstate-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction

……

Final graph:
+------------------------------------------------------------------------------+
  1: volume mstate-posix
  2:     type storage/posix
  3:     option glusterd-uuid f830d809-c935-451f-853a-1364f76a4160
  4:     option directory /mnt/bricks/mstate/brick
  5:     option volume-id 5fc29728-b5de-4b5f-8f3c-9654acda2577
  6:     option shared-brick-count 1
  7: end-volume
  8:
  9: volume mstate-trash
10:     type features/trash
11:     option trash-dir .trashcan
12:     option brick-path /mnt/bricks/mstate/brick
13:     option trash-internal-op off
14:     subvolumes mstate-posix
15: end-volume
16:
17: volume mstate-changetimerecorder
18:     type features/changetimerecorder
19:     option db-type sqlite3
20:     option hot-brick off
21:     option db-name brick.db
22:     option db-path /mnt/bricks/mstate/brick/.glusterfs/
23:     option record-exit off
24:     option ctr_link_consistency off
25:     option ctr_lookupheal_link_timeout 300
26:     option ctr_lookupheal_inode_timeout 300
27:     option record-entry on
28:     option ctr-enabled off
29:     option record-counters off
30:     option ctr-record-metadata-heat off
31:     option sql-db-cachesize 12500
32:     option sql-db-wal-autocheckpoint 25000
33:     subvolumes mstate-trash
34: end-volume
35:
36: volume mstate-changelog
37:     type features/changelog
38:     option changelog-brick /mnt/bricks/mstate/brick
39:     option changelog-dir /mnt/bricks/mstate/brick/.glusterfs/changelogs
40:     option changelog-barrier-timeout 120
41:     subvolumes mstate-changetimerecorder
42: end-volume
43:
44: volume mstate-bitrot-stub
45:     type features/bitrot-stub
46:     option export /mnt/bricks/mstate/brick
47:     option bitrot disable
48:     subvolumes mstate-changelog
49: end-volume
50:
51: volume mstate-access-control
52:     type features/access-control
53:     subvolumes mstate-bitrot-stub
54: end-volume
55:
56: volume mstate-locks
57:     type features/locks
58:     subvolumes mstate-access-control
59: end-volume
60:
61: volume mstate-worm
62:     type features/worm
63:     option worm off
64:     option worm-file-level off
65:     subvolumes mstate-locks
66: end-volume
67:
68: volume mstate-read-only
69:     type features/read-only
70:     option read-only off
71:     subvolumes mstate-worm
72: end-volume
73:
74: volume mstate-leases
75:     type features/leases
76:     option leases off
77:     subvolumes mstate-read-only
78: end-volume
79:
80: volume mstate-upcall
81:     type features/upcall
82:     option cache-invalidation off
83:     subvolumes mstate-leases
84: end-volume
85:
86: volume mstate-io-threads
87:     type performance/io-threads
88:     subvolumes mstate-upcall
89: end-volume
90:
91: volume mstate-selinux
92:     type features/selinux
93:     option selinux on
94:     subvolumes mstate-io-threads
95: end-volume<
--
- Atin (atinm)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gluster.org/pipermail/gluster-devel/attachments/20180428/425721b9/attachment-0001.html>