[Gluster-users] 3 node NFS-Ganesha Cluster

ml ml at nocloud.ch
Mon Nov 30 11:08:23 UTC 2015


Hello,
I tried a 4-node setup, but the effect is the same: the cluster goes down
when one of the nodes is offline. I thought that even in a 3-node setup,
with 2 nodes online and only one gone, the majority of 2 nodes up vs.
1 node down should not result in lost quorum?
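My understanding of the quorum arithmetic (please correct me if I am
wrong): with expected_votes equal to the node count, quorum is
floor(nodes/2) + 1, so a 3-node cluster needs 2 votes and a 4-node one
needs 3. Losing a single node should therefore keep quorum in both
cases. To double-check I look at corosync's and pacemaker's own view on
a surviving node; the exact output will of course differ on other setups:
    #> corosync-quorumtool -s
    #> pcs status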
I have created the gluster volume with the following command:
    #> gluster volume create scratch replica 4 transport tcp
    kaukasus:/tank/brick1 altai:/tank/brick1 rnas2:/tank/brick1
    bbk1:/scratch/brick1 force
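Just to confirm the layout took effect, I check the volume afterwards;
this should list it as a Replicate volume with all four bricks:
    #> gluster volume info scratch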
The following is the log during the takedown of one node (altai):
    Nov 30 11:23:43 rnas2 corosync[16869]: [TOTEM ] A new membership
    (129.132.145.5:1120) was formed. Members left: 2
    Nov 30 11:23:43 rnas2 cib[16088]: notice: crm_update_peer_proc: Node
    altai[2] - state is now lost (was member)
    Nov 30 11:23:43 rnas2 cib[16088]: notice: Removing altai/2 from the
    membership list
    Nov 30 11:23:43 rnas2 cib[16088]: notice: Purged 1 peers with id=2
    and/or uname=altai from the membership cache
    Nov 30 11:23:43 rnas2 crmd[16093]: notice: Our peer on the DC
    (altai) is dead
    Nov 30 11:23:43 rnas2 attrd[16091]: notice: crm_update_peer_proc:
    Node altai[2] - state is now lost (was member)
    Nov 30 11:23:43 rnas2 attrd[16091]: notice: Removing all altai
    attributes for attrd_peer_change_cb
    Nov 30 11:23:43 rnas2 crmd[16093]: notice: State transition S_NOT_DC
    -> S_ELECTION [ input=I_ELECTION
    cause=C_CRMD_STATUS_CALLBACK...llback ]
    Nov 30 11:23:43 rnas2 attrd[16091]: notice: Lost attribute writer
    altai
    Nov 30 11:23:43 rnas2 attrd[16091]: notice: Removing altai/2 from
    the membership list
    Nov 30 11:23:43 rnas2 attrd[16091]: notice: Purged 1 peers with id=2
    and/or uname=altai from the membership cache
    Nov 30 11:23:43 rnas2 pacemakerd[16085]: notice:
    crm_update_peer_proc: Node altai[2] - state is now lost (was member)
    Nov 30 11:23:43 rnas2 pacemakerd[16085]: notice: Removing altai/2
    from the membership list
    Nov 30 11:23:43 rnas2 stonith-ng[16089]: notice:
    crm_update_peer_proc: Node altai[2] - state is now lost (was member)
    Nov 30 11:23:43 rnas2 pacemakerd[16085]: notice: Purged 1 peers with
    id=2 and/or uname=altai from the membership cache
    Nov 30 11:23:43 rnas2 stonith-ng[16089]: notice: Removing altai/2
    from the membership list
    Nov 30 11:23:43 rnas2 corosync[16869]: [QUORUM] Members[3]: 1 3 4
    Nov 30 11:23:43 rnas2 crmd[16093]: notice: Node altai[2] - state is
    now lost (was member)
    Nov 30 11:23:43 rnas2 stonith-ng[16089]: notice: Purged 1 peers with
    id=2 and/or uname=altai from the membership cache
    Nov 30 11:23:43 rnas2 corosync[16869]: [MAIN  ] Completed service
    synchronization, ready to provide service.
    Nov 30 11:23:43 rnas2 crmd[16093]: notice: State transition
    S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL
    origin=...t_vote ]
    Nov 30 11:23:44 rnas2 crmd[16093]: notice: State transition
    S_PENDING -> S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE
    origin=do_cl...espond ]
    Nov 30 11:23:44 rnas2 IPaddr(rnas2-cluster_ip-1)[10934]: INFO: IP
    status = ok, IP_CIP=
    Nov 30 11:23:44 rnas2 crmd[16093]: notice: Operation rnas2
    -cluster_ip-1_stop_0: ok (node=rnas2, call=53, rc=0, cib-update=36,
    confirmed=true)
    Nov 30 11:23:44 rnas2 crmd[16093]: notice: Operation nfs
    -grace_stop_0: ok (node=rnas2, call=55, rc=0, cib-update=37,
    confirmed=true)
    Nov 30 11:23:44 rnas2 attrd[16091]: notice: Processing sync-response
    from bbk1
    Nov 30 11:23:45 rnas2 ntpd[1700]: Deleting interface #47 bond0,
    129.132.145.23#123, interface stats: received=0, sent=0,
    dropped=0...783 secs
    Nov 30 11:24:24 rnas2 lrmd[16090]: warning: nfs-grace_start_0
    process (PID 10947) timed out
    Nov 30 11:24:24 rnas2 lrmd[16090]: warning: nfs-grace_start_0:10947
    - timed out after 40000ms
    Nov 30 11:24:24 rnas2 crmd[16093]: error: Operation nfs
    -grace_start_0: Timed Out (node=rnas2, call=56, timeout=40000ms)
    Nov 30 11:24:24 rnas2 crmd[16093]: notice: Operation nfs
    -grace_stop_0: ok (node=rnas2, call=57, rc=0, cib-update=39,
    confirmed=true)
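The 40000ms start timeout on nfs-grace is what expires first here. As an
experiment I am considering raising it; assuming the resource is managed
through pcs and is really called nfs-grace as in the log (I have not
tested this yet), roughly:
    #> pcs resource update nfs-grace op start timeout=90s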
I discovered that when I restart the pacemaker service on one of the
running nodes, it successfully brings the cluster back online (see the
quick check after the log below):
root@kaukasus ~# systemctl restart pacemaker
    Nov 30 11:45:36 rnas2 crmd[16093]: notice: Our peer on the DC
    (kaukasus) is dead
    Nov 30 11:45:36 rnas2 crmd[16093]: notice: State transition S_NOT_DC
    -> S_ELECTION [ input=I_ELECTION
    cause=C_CRMD_STATUS_CALLBACK...llback ]
    Nov 30 11:45:36 rnas2 crmd[16093]: notice: State transition
    S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL
    origin=...t_vote ]
    Nov 30 11:45:36 rnas2 attrd[16091]: notice: crm_update_peer_proc:
    Node kaukasus[1] - state is now lost (was member)
    Nov 30 11:45:36 rnas2 attrd[16091]: notice: Removing all kaukasus
    attributes for attrd_peer_change_cb
    Nov 30 11:45:36 rnas2 attrd[16091]: notice: Removing kaukasus/1 from
    the membership list
    Nov 30 11:45:36 rnas2 attrd[16091]: notice: Purged 1 peers with id=1
    and/or uname=kaukasus from the membership cache
    Nov 30 11:45:36 rnas2 stonith-ng[16089]: notice:
    crm_update_peer_proc: Node kaukasus[1] - state is now lost (was
    member)
    Nov 30 11:45:36 rnas2 stonith-ng[16089]: notice: Removing kaukasus/1
    from the membership list
    Nov 30 11:45:36 rnas2 stonith-ng[16089]: notice: Purged 1 peers with
    id=1 and/or uname=kaukasus from the membership cache
    Nov 30 11:45:36 rnas2 cib[16088]: notice: crm_update_peer_proc: Node
    kaukasus[1] - state is now lost (was member)
    Nov 30 11:45:36 rnas2 cib[16088]: notice: Removing kaukasus/1 from
    the membership list
    Nov 30 11:45:36 rnas2 cib[16088]: notice: Purged 1 peers with id=1
    and/or uname=kaukasus from the membership cache
    Nov 30 11:45:36 rnas2 pacemakerd[16085]: notice:
    crm_update_peer_proc: Node kaukasus[1] - state is now lost (was
    member)
    Nov 30 11:45:36 rnas2 pacemakerd[16085]: notice: Removing kaukasus/1
    from the membership list
    Nov 30 11:45:36 rnas2 pacemakerd[16085]: notice: Purged 1 peers with
    id=1 and/or uname=kaukasus from the membership cache
    Nov 30 11:45:36 rnas2 pacemakerd[16085]: notice:
    crm_update_peer_proc: Node kaukasus[1] - state is now member (was
    (null))
    Nov 30 11:45:36 rnas2 crmd[16093]: notice: State transition
    S_PENDING -> S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE
    origin=do_cl...espond ]
    Nov 30 11:45:36 rnas2 stonith-ng[16089]: notice:
    crm_update_peer_proc: Node kaukasus[1] - state is now member (was
    (null))
    Nov 30 11:45:36 rnas2 attrd[16091]: notice: crm_update_peer_proc:
    Node kaukasus[1] - state is now member (was (null))
    Nov 30 11:45:36 rnas2 cib[16088]: notice: crm_update_peer_proc: Node
    kaukasus[1] - state is now member (was (null))
    Nov 30 11:45:46 rnas2 IPaddr(rnas2-cluster_ip-1)[16591]: INFO:
    Adding inet address 129.132.145.23/32 with broadcast address
    129.132.... bond0
    Nov 30 11:45:46 rnas2 IPaddr(rnas2-cluster_ip-1)[16600]: INFO:
    Bringing device bond0 up
    Nov 30 11:45:46 rnas2 IPaddr(rnas2-cluster_ip-1)[16609]: INFO:
    /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource
    -agen...t_used
    Nov 30 11:45:46 rnas2 crmd[16093]: notice: Operation rnas2
    -cluster_ip-1_start_0: ok (node=rnas2, call=58, rc=0, cib-update=44,
    con...ed=true)
    Nov 30 11:45:48 rnas2 ntpd[1700]: Listen normally on 48 bond0
    129.132.145.23 UDP 123
    Nov 30 11:45:48 rnas2 ntpd[1700]: new interface(s) found: waking up
    resolver
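Once pacemaker is restarted and the virtual IP comes back as in the log
above, I verify that the cluster is actually serving again, roughly like
this (the address is the cluster IP from the log; showmount only answers
if the MNT/NFSv3 protocol is enabled in ganesha):
    #> pcs status
    #> showmount -e 129.132.145.23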
Yours,
Rigi
On Mon, 2015-11-30 at 15:26 +0530, Soumya Koduri wrote:
> Hi,
> > But are you telling me that in a 3-node cluster,
> > quorum is lost when one of the nodes' IPs is down?
> 
> Yes. It's a limitation of Pacemaker/Corosync. If the nodes
> participating in the cluster cannot communicate with the majority of
> them (quorum is lost), then the cluster is shut down.
> 
> >
> > However, I am setting up an additional node to test a 4-node setup,
> > but even then, if I take down one node and nfs-grace_start
> > (/usr/lib/ocf/resource.d/heartbeat/ganesha_grace) does not run
> > properly on the other nodes, could it be that the whole cluster
> > goes down because quorum is lost again?
> 
> That's strange. We have tested such configurations quite a few times
> but haven't hit this issue. (CCing Saurabh, who has been testing many
> such configurations.)
> 
> Recently we have observed the resource agents (nfs-grace_*) timing
> out sometimes, especially when a node is taken down. But that
> shouldn't cause the entire cluster to shut down.
> Could you check the logs (/var/log/messages, /var/log/pacemaker.log)
> for any errors/warnings reported when one node is taken down in the
> 4-node setup?
> 
> Thanks,
> Soumya