[Gluster-users] One node goes offline, the other node can't see the replicated volume anymore

Greg Scott GregScott at infrasupport.com
Mon Jul 15 19:22:46 UTC 2013


Well, none of my ideas worked.  I see that Gluster is up to the real 3.4.0 now, no more beta.  So after a yum update and reboot of both fw1 and fw2, I decided to focus only on mounting my /firewall-scripts volume at startup time.  Forget about my application and taking a node offline for testing; let's just get the volume mounted properly at startup first.  Cover the basics first.

I have an rc.local that mounts my filesystem and then runs a common script that lives inside that filesystem.  That line is commented out, but systemd is apparently trying to execute it anyway.  Here is what /etc/rc.d/rc.local currently looks like, followed by an extract from /var/log/messages showing what actually happens.  Warning: it's ugly.  Viewer discretion is advised.

#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.
#
# Note removed by default starting in Fedora 16.

touch /var/lock/subsys/local

#***********************************
# Local stuff below

echo "Making sure the Gluster stuff is mounted"
echo "Mounted before mount -av"
df -h
mount -av
echo "Mounted after mount -av"
df -h
# The fstab mounts happen early in startup, then Gluster starts up later.
# By now, Gluster should be up and running and the mounts should work.
# That _netdev option is supposed to account for the delay but doesn't seem
# to work right.

echo "Starting up firewall common items"
##/firewall-scripts/etc/rc.d/common-rc.local

[root at chicago-fw2 rc.d]#
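For what it's worth, if the race is simply that glusterd isn't finished starting by the time mount -av runs, a crude retry loop in rc.local might paper over it.  This is only a sketch, not something I've tested; the ten-try/three-second budget is an assumption, and /firewall-scripts is just the example target:

```shell
#!/bin/sh
# Sketch: retry a mount until the mountpoint check succeeds or we give up.
# The check and mount commands are passed in as strings so the same loop
# works for any volume; ten tries at 3 seconds each is an arbitrary budget.
retry_mount() {
    check_cmd=$1   # e.g. "mountpoint -q /firewall-scripts"
    mount_cmd=$2   # e.g. "mount /firewall-scripts"
    tries=0
    until $check_cmd; do
        tries=$((tries + 1))
        if [ "$tries" -gt 10 ]; then
            echo "giving up after $tries attempts" >&2
            return 1
        fi
        $mount_cmd
        sleep 3
    done
    return 0
}

# In rc.local it would be invoked roughly like:
# retry_mount "mountpoint -q /firewall-scripts" "mount /firewall-scripts"
```

It's ugly, but it would at least tell me whether the only problem is timing.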

And here is the extract from /var/log/messages on fw1 showing what actually happens.  The log on fw2 is similar.  

Jul 15 13:49:59 chicago-fw1 audispd: queue is full - dropping event
Jul 15 13:49:59 chicago-fw1 audispd: queue is full - dropping event
Jul 15 13:49:59 chicago-fw1 audispd: queue is full - dropping event
Jul 15 13:50:00 chicago-fw1 setroubleshoot: SELinux is preventing /usr/sbin/glusterfsd from 'read, write' accesses on the chr_file fuse. For complete SELinux messages. run sealert -l ff532d9a-f5$
Jul 15 13:50:01 chicago-fw1 systemd[1]: Started GlusterFS an clustered file-system server.
Jul 15 13:50:01 chicago-fw1 systemd[1]: Starting GlusterFS an clustered file-system server...
Jul 15 13:50:01 chicago-fw1 glusterfsd[1255]: [2013-07-15 18:50:01.409064] C [glusterfsd.c:1374:parse_cmdline] 0-glusterfs: ERROR: parsing the volfile failed (No such file or directory)
Jul 15 13:50:01 chicago-fw1 glusterfsd[1255]: USAGE: /usr/sbin/glusterfsd [options] [mountpoint]
Jul 15 13:50:01 chicago-fw1 GlusterFS[1255]: [2013-07-15 18:50:01.409064] C [glusterfsd.c:1374:parse_cmdline] 0-glusterfs: ERROR: parsing the volfile failed (No such file or directory)
Jul 15 13:50:01 chicago-fw1 systemd[1]: glusterfsd.service: control process exited, code=exited status=255
Jul 15 13:50:01 chicago-fw1 systemd[1]: Failed to start GlusterFS an clustered file-system server.
Jul 15 13:50:01 chicago-fw1 systemd[1]: Unit glusterfsd.service entered failed state.
Jul 15 13:50:04 chicago-fw1 mount[1002]: Mount failed. Please check the log file for more details.
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: Mount failed. Please check the log file for more details.
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /                        : ignored
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /boot                    : already mounted
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /boot/efi                : already mounted
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /gluster-fw1             : already mounted
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: swap                     : ignored
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /firewall-scripts        : successfully mounted
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: Mounted after mount -av
Jul 15 13:50:04 chicago-fw1 systemd[1]: firewall\x2dscripts.mount mount process exited, code=exited status=1
Jul 15 13:50:04 chicago-fw1 systemd[1]: Unit firewall\x2dscripts.mount entered failed state.
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: Filesystem                       Size  Used Avail Use% Mounted on
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /dev/mapper/fedora-root           14G  3.8G  8.7G  31% /
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: devtmpfs                         990M     0  990M   0% /dev
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: tmpfs                            996M     0  996M   0% /dev/shm
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: tmpfs                            996M  872K  996M   1% /run
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: tmpfs                            996M     0  996M   0% /sys/fs/cgroup
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: tmpfs                            996M     0  996M   0% /tmp
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /dev/sda2                        477M   87M  365M  20% /boot
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /dev/sda1                        200M  9.4M  191M   5% /boot/efi
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /dev/mapper/fedora-gluster--fw1  7.9G   33M  7.8G   1% /gluster-fw1
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /etc/rc.d/rc.local: line 26: /firewall-scripts/etc/rc.d/common-rc.local: No such file or directory
Jul 15 13:50:04 chicago-fw1 systemd[1]: rc-local.service: control process exited, code=exited status=127
Jul 15 13:50:04 chicago-fw1 systemd[1]: Failed to start /etc/rc.d/rc.local Compatibility.
Jul 15 13:50:04 chicago-fw1 systemd[1]: Unit rc-local.service entered failed state.
Jul 15 13:50:04 chicago-fw1 systemd[1]: Starting Terminate Plymouth Boot Screen...
Jul 15 13:50:04 chicago-fw1 systemd[1]: Starting Wait for Plymouth Boot Screen to Quit...
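One other idea I haven't tried yet: the log shows both the fstab mount and rc.local firing before Gluster is ready, so explicitly ordering rc-local.service after the Gluster daemon with a systemd drop-in might help.  The drop-in path and unit names below are my assumptions about the Fedora 19 layout, not something I've verified:

```ini
# /etc/systemd/system/rc-local.service.d/gluster.conf  (assumed path)
[Unit]
# Wait for the network and the Gluster daemon before running rc.local.
After=network-online.target glusterd.service
Wants=network-online.target
```

If anyone knows whether _netdev is even honored for glusterfs mounts on this systemd version, I'd love to hear it.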
