[Gluster-users] One node goes offline, the other node can't see the replicated volume anymore

Greg Scott GregScott at infrasupport.com
Wed Jul 10 22:04:54 UTC 2013


And here is the ps ax | grep gluster output from both nodes when fw1 is offline.  Note that I have the volume mounted right now with the 'backupvolfile-server=<secondary server>' mount option.  The ps ax | grep gluster output looks the same now as it did when both nodes were online.
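For reference, the mount on fw1 would look roughly like this - just a sketch pieced together from the process listings below, assuming fw2 (192.168.253.2) as the backup volfile server:

# mount the replicated volume; fall back to fw2 for the volfile if fw1 is unreachable
mount -t glusterfs -o backupvolfile-server=192.168.253.2 192.168.253.1:/firewall-scripts /firewall-scripts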

From fw1:
[root@chicago-fw1 gregs]# ./ruletest.sh
[root@chicago-fw1 gregs]#
[root@chicago-fw1 gregs]#
[root@chicago-fw1 gregs]# ps ax | grep gluster
1019 ?        Ssl    0:09 /usr/sbin/glusterd -p /run/glusterd.pid
1274 ?        Ssl    0:32 /usr/sbin/glusterfsd -s 192.168.253.1 --volfile-id firewall-scripts.192.168.253.1.gluster-fw1 -p /var/lib/glusterd/vols/firewall-scripts/run/192.168.253.1-gluster-fw1.pid -S /var/run/3eea976403bb07230cae75b885406920.socket --brick-name /gluster-fw1 -l /var/log/glusterfs/bricks/gluster-fw1.log --xlator-option *-posix.glusterd-uuid=e13d53de-c7ed-4e63-bcb1-dc69ae25cc15 --brick-port 49152 --xlator-option firewall-scripts-server.listen-port=49152
1280 ?        Ssl    0:05 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/ec00b40c3ed179eccfdd89f5fcd540cc.socket
1285 ?        Ssl    0:05 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/fa9d586a6fab73a52bba6fc92ddd5d91.socket --xlator-option *replicate*.node-uuid=e13d53de-c7ed-4e63-bcb1-dc69ae25cc15
12649 ?        Ssl    0:00 /usr/sbin/glusterfs --volfile-id=/firewall-scripts --volfile-server=192.168.253.1 /firewall-scripts
12991 pts/1    S+     0:00 grep --color=auto gluster
[root@chicago-fw1 gregs]#
[root@chicago-fw1 gregs]# more ruletest.sh
iptables -I INPUT 1 -i enp5s4 -s 192.168.253.2 -j REJECT
[root@chicago-fw1 gregs]#
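(Undoing the test afterwards should just be the matching delete - not part of ruletest.sh, shown here only as a sketch of the companion command:)

# drop the REJECT rule so traffic from fw2 (192.168.253.2) is accepted again
iptables -D INPUT -i enp5s4 -s 192.168.253.2 -j REJECT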

You can see from the above that fw1 is now offline.  Here are the gluster processes still running on fw2 - they look the same as before.

[root@chicago-fw2 gregs]# ps ax | grep gluster
1027 ?        Ssl    0:11 /usr/sbin/glusterd -p /run/glusterd.pid
1291 ?        Ssl    0:14 /usr/sbin/glusterfsd -s 192.168.253.2 --volfile-id firewall-scripts.192.168.253.2.gluster-fw2 -p /var/lib/glusterd/vols/firewall-scripts/run/192.168.253.2-gluster-fw2.pid -S /var/run/380dca5c55990acea8ab30f5a08375a7.socket --brick-name /gluster-fw2 -l /var/log/glusterfs/bricks/gluster-fw2.log --xlator-option *-posix.glusterd-uuid=a2334360-d1d3-40c1-8c0e-7d62a5318899 --brick-port 49152 --xlator-option firewall-scripts-server.listen-port=49152
1306 ?        Ssl    0:06 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/12903cdbca94bee4abfc3b4df24e2e61.socket
1310 ?        Ssl    0:06 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/a2dee45b1271f43ae8a8d9003567b428.socket --xlator-option *replicate*.node-uuid=a2334360-d1d3-40c1-8c0e-7d62a5318899
12663 ?        Ssl    0:01 /usr/sbin/glusterfs --volfile-id=/firewall-scripts --volfile-server=192.168.253.2 /firewall-scripts
13008 pts/0    S+     0:00 grep --color=auto gluster
[root@chicago-fw2 gregs]#


-          Greg

From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Greg Scott
Sent: Wednesday, July 10, 2013 4:57 PM
To: 'raghav'; gluster-users at gluster.org List
Subject: Re: [Gluster-users] One node goes offline, the other node can't see the replicated volume anymore

*         It looks like the brick processes on the fw2 machine are not running, and hence when fw1 is down the
*         entire replication process is stalled. Can you do a ps and get the status of all the gluster processes, and
*         ensure that the brick process is up on fw2?
I was away from this most of the day.  Here is a ps ax | grep gluster from both fw1 and fw2 while both nodes are online.

From fw1:

[root@chicago-fw1 glusterfs]# ps ax | grep gluster
1019 ?        Ssl    0:09 /usr/sbin/glusterd -p /run/glusterd.pid
1274 ?        Ssl    0:32 /usr/sbin/glusterfsd -s 192.168.253.1 --volfile-id firewall-scripts.192.168.253.1.gluster-fw1 -p /var/lib/glusterd/vols/firewall-scripts/run/192.168.253.1-gluster-fw1.pid -S /var/run/3eea976403bb07230cae75b885406920.socket --brick-name /gluster-fw1 -l /var/log/glusterfs/bricks/gluster-fw1.log --xlator-option *-posix.glusterd-uuid=e13d53de-c7ed-4e63-bcb1-dc69ae25cc15 --brick-port 49152 --xlator-option firewall-scripts-server.listen-port=49152
1280 ?        Ssl    0:05 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/ec00b40c3ed179eccfdd89f5fcd540cc.socket
1285 ?        Ssl    0:05 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/fa9d586a6fab73a52bba6fc92ddd5d91.socket --xlator-option *replicate*.node-uuid=e13d53de-c7ed-4e63-bcb1-dc69ae25cc15
12649 ?        Ssl    0:00 /usr/sbin/glusterfs --volfile-id=/firewall-scripts --volfile-server=192.168.253.1 /firewall-scripts
12959 pts/1    S+     0:00 grep --color=auto gluster
[root@chicago-fw1 glusterfs]#

And from fw2:

[root@chicago-fw2 gregs]# ps ax | grep gluster
1027 ?        Ssl    0:11 /usr/sbin/glusterd -p /run/glusterd.pid
1291 ?        Ssl    0:14 /usr/sbin/glusterfsd -s 192.168.253.2 --volfile-id firewall-scripts.192.168.253.2.gluster-fw2 -p /var/lib/glusterd/vols/firewall-scripts/run/192.168.253.2-gluster-fw2.pid -S /var/run/380dca5c55990acea8ab30f5a08375a7.socket --brick-name /gluster-fw2 -l /var/log/glusterfs/bricks/gluster-fw2.log --xlator-option *-posix.glusterd-uuid=a2334360-d1d3-40c1-8c0e-7d62a5318899 --brick-port 49152 --xlator-option firewall-scripts-server.listen-port=49152
1306 ?        Ssl    0:06 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/12903cdbca94bee4abfc3b4df24e2e61.socket
1310 ?        Ssl    0:06 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/a2dee45b1271f43ae8a8d9003567b428.socket --xlator-option *replicate*.node-uuid=a2334360-d1d3-40c1-8c0e-7d62a5318899
12663 ?        Ssl    0:01 /usr/sbin/glusterfs --volfile-id=/firewall-scripts --volfile-server=192.168.253.2 /firewall-scripts
12958 pts/0    S+     0:00 grep --color=auto gluster
[root@chicago-fw2 gregs]#
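In addition to the ps listings, a more direct check of the bricks would be something like the command below - just a sketch, using the volume name from the listings above:

# lists each brick with its port, PID, and whether it is online
gluster volume status firewall-scripts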

-          Greg

