[Gluster-users] One node goes offline, the other node can't see the replicated volume anymore
Greg Scott
GregScott at infrasupport.com
Mon Jul 15 19:22:46 UTC 2013
Well, none of my ideas worked. I see that Gluster is up to the real 3.4.0 now, no more beta. So after a yum update and reboot of both fw1 and fw2, I decided to focus only on mounting my /firewall-scripts volume at startup time. Forget about my application and the take-a-node-offline testing for now; let's just get the volume mounted properly at startup. Cover the basics first.
I have an rc.local that mounts my filesystem and then runs a common script that lives inside that filesystem. The line that runs the common script is commented out, but systemd is apparently trying to execute it anyway. Here is what /etc/rc.d/rc.local currently looks like, followed by an extract from /var/log/messages showing what actually happens. Warning, it's ugly. Viewer discretion is advised.
#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.
#
# Note removed by default starting in Fedora 16.
touch /var/lock/subsys/local
#***********************************
# Local stuff below
echo "Making sure the Gluster stuff is mounted"
echo "Mounted before mount -av"
df -h
mount -av
echo "Mounted after mount -av"
df -h
# The fstab mounts happen early in startup, then Gluster starts up later.
# By now, Gluster should be up and running and the mounts should work.
# That _netdev option is supposed to account for the delay but doesn't seem
# to work right.
echo "Starting up firewall common items"
##/firewall-scripts/etc/rc.d/common-rc.local
[root@chicago-fw2 rc.d]#
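For reference, here is the kind of fstab entry that comment about _netdev is talking about. The poster's actual fstab isn't shown, so the server name and options below are an assumption based on the hostnames and mount points in the log:

```
# /etc/fstab -- hypothetical Gluster client mount (names taken from the log below).
# _netdev delays the mount until the network is up, but it does NOT wait for
# glusterd itself, so the first boot-time mount attempt can still fail.
chicago-fw1:/firewall-scripts  /firewall-scripts  glusterfs  defaults,_netdev  0 0
```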
And here is the extract from /var/log/messages on fw1 showing what actually happens. The log on fw2 is similar.
Jul 15 13:49:59 chicago-fw1 audispd: queue is full - dropping event
Jul 15 13:49:59 chicago-fw1 audispd: queue is full - dropping event
Jul 15 13:49:59 chicago-fw1 audispd: queue is full - dropping event
Jul 15 13:50:00 chicago-fw1 setroubleshoot: SELinux is preventing /usr/sbin/glusterfsd from 'read, write' accesses on the chr_file fuse. For complete SELinux messages. run sealert -l ff532d9a-f5$
Jul 15 13:50:01 chicago-fw1 systemd[1]: Started GlusterFS an clustered file-system server.
Jul 15 13:50:01 chicago-fw1 systemd[1]: Starting GlusterFS an clustered file-system server...
Jul 15 13:50:01 chicago-fw1 glusterfsd[1255]: [2013-07-15 18:50:01.409064] C [glusterfsd.c:1374:parse_cmdline] 0-glusterfs: ERROR: parsing the volfile failed (No such file or directory)
Jul 15 13:50:01 chicago-fw1 glusterfsd[1255]: USAGE: /usr/sbin/glusterfsd [options] [mountpoint]
Jul 15 13:50:01 chicago-fw1 GlusterFS[1255]: [2013-07-15 18:50:01.409064] C [glusterfsd.c:1374:parse_cmdline] 0-glusterfs: ERROR: parsing the volfile failed (No such file or directory)
Jul 15 13:50:01 chicago-fw1 systemd[1]: glusterfsd.service: control process exited, code=exited status=255
Jul 15 13:50:01 chicago-fw1 systemd[1]: Failed to start GlusterFS an clustered file-system server.
Jul 15 13:50:01 chicago-fw1 systemd[1]: Unit glusterfsd.service entered failed state.
Jul 15 13:50:04 chicago-fw1 mount[1002]: Mount failed. Please check the log file for more details.
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: Mount failed. Please check the log file for more details.
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: / : ignored
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /boot : already mounted
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /boot/efi : already mounted
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /gluster-fw1 : already mounted
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: swap : ignored
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /firewall-scripts : successfully mounted
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: Mounted after mount -av
Jul 15 13:50:04 chicago-fw1 systemd[1]: firewall\x2dscripts.mount mount process exited, code=exited status=1
Jul 15 13:50:04 chicago-fw1 systemd[1]: Unit firewall\x2dscripts.mount entered failed state.
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: Filesystem Size Used Avail Use% Mounted on
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /dev/mapper/fedora-root 14G 3.8G 8.7G 31% /
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: devtmpfs 990M 0 990M 0% /dev
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: tmpfs 996M 0 996M 0% /dev/shm
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: tmpfs 996M 872K 996M 1% /run
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: tmpfs 996M 0 996M 0% /sys/fs/cgroup
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: tmpfs 996M 0 996M 0% /tmp
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /dev/sda2 477M 87M 365M 20% /boot
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /dev/sda1 200M 9.4M 191M 5% /boot/efi
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /dev/mapper/fedora-gluster--fw1 7.9G 33M 7.8G 1% /gluster-fw1
Jul 15 13:50:04 chicago-fw1 rc.local[1006]: /etc/rc.d/rc.local: line 26: /firewall-scripts/etc/rc.d/common-rc.local: No such file or directory
Jul 15 13:50:04 chicago-fw1 systemd[1]: rc-local.service: control process exited, code=exited status=127
Jul 15 13:50:04 chicago-fw1 systemd[1]: Failed to start /etc/rc.d/rc.local Compatibility.
Jul 15 13:50:04 chicago-fw1 systemd[1]: Unit rc-local.service entered failed state.
Jul 15 13:50:04 chicago-fw1 systemd[1]: Starting Terminate Plymouth Boot Screen...
Jul 15 13:50:04 chicago-fw1 systemd[1]: Starting Wait for Plymouth Boot Screen to Quit...
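Reading the log, the fstab mount of /firewall-scripts fails on the first try (glusterd isn't ready yet), but the later `mount -av` from rc.local reports "successfully mounted". One possible workaround, purely a sketch and not something from the original post, is to retry the mount from rc.local until it succeeds instead of relying on a single `mount -av`:

```shell
#!/bin/sh
# Sketch: retry a command a few times with a delay between attempts.
# The mount point below matches the log; the retry logic is an assumption.
retry() {
    n=$1      # max attempts
    delay=$2  # seconds between attempts
    shift 2
    i=1
    while ! "$@"; do
        [ "$i" -ge "$n" ] && return 1
        i=$((i + 1))
        sleep "$delay"
    done
    return 0
}

# In rc.local, one might then replace the bare "mount -av" with:
#   retry 6 5 mount /firewall-scripts
```

This keeps retrying (here, up to 6 times, 5 seconds apart) until glusterd has come up far enough to serve the volfile, rather than giving up on the first failure.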