[Gluster-users] One node goes offline, the other node can't see the replicated volume anymore
Greg Scott
GregScott at infrasupport.com
Tue Jul 16 15:30:15 UTC 2013
That didn't seem to make a difference. The volume was not mounted right after I logged in, and it looks like the same behavior: the mount fails, then my rc.local kicks in and says it succeeded, but the volume doesn't show as mounted later when I do my "after" df -h.
[root at chicago-fw1 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/fedora-root 14G 3.9G 8.7G 31% /
devtmpfs 990M 0 990M 0% /dev
tmpfs 996M 0 996M 0% /dev/shm
tmpfs 996M 888K 996M 1% /run
tmpfs 996M 0 996M 0% /sys/fs/cgroup
tmpfs 996M 0 996M 0% /tmp
/dev/sda2 477M 87M 365M 20% /boot
/dev/sda1 200M 9.4M 191M 5% /boot/efi
/dev/mapper/fedora-gluster--fw1 7.9G 33M 7.8G 1% /gluster-fw1
[root at chicago-fw1 ~]#
[root at chicago-fw1 ~]# tail /var/log/messages -c 50000 | more
0.10.71.
Jul 16 10:21:23 chicago-fw1 avahi-daemon[446]: New relevant interface enp5s7.IPv4 for mDNS.
Jul 16 10:21:23 chicago-fw1 avahi-daemon[446]: Registering new address record for 10.10.10.71 on enp5s7.IPv4.
Jul 16 10:21:23 chicago-fw1 kernel: [ 22.284616] r8169 0000:05:04.0 enp5s4: link up
Jul 16 10:21:24 chicago-fw1 kernel: [ 22.996223] r8169 0000:05:07.0 enp5s7: link up
Jul 16 10:21:24 chicago-fw1 kernel: [ 22.996240] IPv6: ADDRCONF(NETDEV_CHANGE): enp5s7: link becomes ready
Jul 16 10:21:24 chicago-fw1 network[464]: Bringing up interface enp5s7: [ OK ]
Jul 16 10:21:25 chicago-fw1 systemd[1]: Started LSB: Bring up/down networking.
Jul 16 10:21:25 chicago-fw1 systemd[1]: Starting Network.
Jul 16 10:21:25 chicago-fw1 systemd[1]: Reached target Network.
Jul 16 10:21:25 chicago-fw1 systemd[1]: Started Login and scanning of iSCSI devices.
Jul 16 10:21:25 chicago-fw1 systemd[1]: Starting Vsftpd ftp daemon...
Jul 16 10:21:25 chicago-fw1 systemd[1]: Starting RPC bind service...
Jul 16 10:21:25 chicago-fw1 systemd[1]: Starting OpenSSH server daemon...
Jul 16 10:21:25 chicago-fw1 systemd[1]: Starting /etc/rc.d/rc.local Compatibility...
Jul 16 10:21:25 chicago-fw1 systemd[1]: Started RPC bind service.
Jul 16 10:21:25 chicago-fw1 systemd[1]: Starting GlusterFS an clustered file-system server...
Jul 16 10:21:25 chicago-fw1 systemd[1]: Started Vsftpd ftp daemon.
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: Tue Jul 16 10:21:25 CDT 2013
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: Sleeping 30 seconds.
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: Tue Jul 16 10:21:25 CDT 2013
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: Making sure the Gluster stuff is mounted
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: Mounted before mount -av
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: Filesystem Size Used Avail Use% Mounted on
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: /dev/mapper/fedora-root 14G 3.9G 8.7G 31% /
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: devtmpfs 990M 0 990M 0% /dev
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: tmpfs 996M 0 996M 0% /dev/shm
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: tmpfs 996M 2.1M 994M 1% /run
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: tmpfs 996M 0 996M 0% /sys/fs/cgroup
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: tmpfs 996M 0 996M 0% /tmp
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: /dev/sda2 477M 87M 365M 20% /boot
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: /dev/sda1 200M 9.4M 191M 5% /boot/efi
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: /dev/mapper/fedora-gluster--fw1 7.9G 33M 7.8G 1% /gluster-fw1
Jul 16 10:21:25 chicago-fw1 systemd[1]: Started OpenSSH server daemon.
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: extra arguments at end (ignored)
Jul 16 10:21:25 chicago-fw1 dbus-daemon[465]: dbus[465]: [system] Activating service name='org.fedoraproject.Setroubleshootd' (using servicehelper)
Jul 16 10:21:25 chicago-fw1 dbus[465]: [system] Activating service name='org.fedoraproject.Setroubleshootd' (using servicehelper)
Jul 16 10:21:25 chicago-fw1 kernel: [ 23.918403] fuse init (API version 7.21)
Jul 16 10:21:25 chicago-fw1 systemd[1]: Mounted /firewall-scripts.
Jul 16 10:21:25 chicago-fw1 systemd[1]: Starting Remote File Systems.
Jul 16 10:21:25 chicago-fw1 systemd[1]: Reached target Remote File Systems.
Jul 16 10:21:25 chicago-fw1 systemd[1]: Starting Trigger Flushing of Journal to Persistent Storage...
Jul 16 10:21:25 chicago-fw1 systemd[1]: Mounting FUSE Control File System...
Jul 16 10:21:25 chicago-fw1 systemd[1]: Mounted FUSE Control File System.
Jul 16 10:21:28 chicago-fw1 systemd[1]: Started Trigger Flushing of Journal to Persistent Storage.
Jul 16 10:21:28 chicago-fw1 systemd[1]: Starting Permit User Sessions...
Jul 16 10:21:28 chicago-fw1 systemd[1]: Started Permit User Sessions.
Jul 16 10:21:28 chicago-fw1 systemd[1]: Starting Command Scheduler...
Jul 16 10:21:28 chicago-fw1 systemd[1]: Started Command Scheduler.
Jul 16 10:21:28 chicago-fw1 systemd[1]: Starting Job spooling tools...
Jul 16 10:21:28 chicago-fw1 systemd[1]: Started Job spooling tools.
Jul 16 10:21:28 chicago-fw1 avahi-daemon[446]: Registering new address record for fe80::230:18ff:fea2:a340 on enp5s7.*.
Jul 16 10:21:28 chicago-fw1 dbus[465]: [system] Successfully activated service 'org.fedoraproject.Setroubleshootd'
Jul 16 10:21:28 chicago-fw1 dbus-daemon[465]: dbus[465]: [system] Successfully activated service 'org.fedoraproject.Setroubleshootd'
Jul 16 10:21:31 chicago-fw1 audispd: queue is full - dropping event
Jul 16 10:21:31 chicago-fw1 audispd: queue is full - dropping event
Jul 16 10:21:31 chicago-fw1 audispd: queue is full - dropping event
.
.
.
Jul 16 10:21:33 chicago-fw1 audispd: queue is full - dropping event
Jul 16 10:21:33 chicago-fw1 audispd: queue is full - dropping event
Jul 16 10:21:33 chicago-fw1 audispd: queue is full - dropping event
Jul 16 10:21:34 chicago-fw1 systemd[1]: Started GlusterFS an clustered file-system server.
Jul 16 10:21:34 chicago-fw1 systemd[1]: Starting Network is Online.
Jul 16 10:21:34 chicago-fw1 systemd[1]: Reached target Network is Online.
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: Mount failed. Please check the log file for more details.
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: / : ignored
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /boot : already mounted
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /boot/efi : already mounted
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /gluster-fw1 : already mounted
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: swap : ignored
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /firewall-scripts : successfully mounted
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: Mounted after mount -av
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: Filesystem Size Used Avail Use% Mounted on
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /dev/mapper/fedora-root 14G 3.9G 8.7G 31% /
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: devtmpfs 990M 0 990M 0% /dev
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: tmpfs 996M 0 996M 0% /dev/shm
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: tmpfs 996M 880K 996M 1% /run
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: tmpfs 996M 0 996M 0% /sys/fs/cgroup
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: tmpfs 996M 0 996M 0% /tmp
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /dev/sda2 477M 87M 365M 20% /boot
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /dev/sda1 200M 9.4M 191M 5% /boot/efi
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /dev/mapper/fedora-gluster--fw1 7.9G 33M 7.8G 1% /gluster-fw1
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: Starting up firewall common items
Jul 16 10:21:38 chicago-fw1 systemd[1]: Started /etc/rc.d/rc.local Compatibility.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Starting Terminate Plymouth Boot Screen...
Jul 16 10:21:38 chicago-fw1 systemd[1]: Starting Wait for Plymouth Boot Screen to Quit...
Jul 16 10:21:38 chicago-fw1 systemd[1]: Started Terminate Plymouth Boot Screen.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Started Wait for Plymouth Boot Screen to Quit.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Starting Getty on tty1...
Jul 16 10:21:38 chicago-fw1 systemd[1]: Started Getty on tty1.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Starting Login Prompts.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Reached target Login Prompts.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Reached target Multi-User System.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Starting Update UTMP about System Runlevel Changes...
Jul 16 10:21:38 chicago-fw1 systemd[1]: Starting Stop Read-Ahead Data Collection 10s After Completed Startup.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Started Stop Read-Ahead Data Collection 10s After Completed Startup.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Started Update UTMP about System Runlevel Changes.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Startup finished in 1.474s (kernel) + 2.210s (initrd) + 33.180s (userspace) = 36.866s.
[root at chicago-fw1 ~]# more /usr/lib/systemd/system/glusterd.service
[Unit]
Description=GlusterFS an clustered file-system server
After=network.target rpcbind.service
Before=network-online.target
[Service]
Type=forking
PIDFile=/run/glusterd.pid
LimitNOFILE=65536
ExecStart=/usr/sbin/glusterd -p /run/glusterd.pid
KillMode=process
[Install]
WantedBy=multi-user.target
[root at chicago-fw1 ~]#
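A minimal sketch of how rc.local could retry the mount and check the result itself, rather than trusting the "successfully mounted" line that mount -av prints. The helper names retry and mount_and_verify, the attempt count and the delay are assumptions for illustration only, not what my rc.local actually contains; /firewall-scripts is the mount point from the log above.

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds or the attempts run out.
# usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        n=$((n + 1))
        # Only sleep if another attempt is still to come.
        [ "$n" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}

# Hypothetical wrapper: mount a mount point through its fstab entry, then
# confirm that the directory really is a mount point. The exit status of
# mount is ignored on purpose; mountpoint -q gives the final answer.
mount_and_verify() {
    mount "$1"
    mountpoint -q "$1"
}

# Example use in rc.local (commented out so the file has no side effects):
# retry 10 3 mount_and_verify /firewall-scripts || echo "/firewall-scripts still not mounted"
```

With something like this, rc.local only reports success once mountpoint confirms the mount, and it gives glusterd a few more seconds to come up if the first attempt is too early.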
- Greg
From: Joe Julian [mailto:joe at julianfamily.org]
Sent: Tuesday, July 16, 2013 10:09 AM
To: Greg Scott
Cc: gluster-users at gluster.org
Subject: Re: [Gluster-users] One node goes offline, the other node can't see the replicated volume anymore
Try this: https://gist.github.com/joejulian/6009570 and see if it works any better. We're looking for "GlusterFS an clustered file-system server" to appear earlier than the mounting.
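One way to check that ordering without reading the whole log by eye is sketched below. It assumes the systemd message formats seen in /var/log/messages on this system ("Started GlusterFS ..." and "Mounted <mount point>."); the function name glusterd_before_mount is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report whether systemd logged glusterd as started
# before it logged the given mount point as mounted.
# usage: glusterd_before_mount LOGFILE MOUNTPOINT
glusterd_before_mount() {
    awk -v mp="$2" '
        # First line where systemd reports glusterd as started.
        !started && /systemd\[[0-9]+\]: Started GlusterFS/ { started = NR }
        # First line where systemd reports the mount point as mounted.
        !mounted && index($0, "Mounted " mp ".") { mounted = NR }
        END {
            if (started && mounted && started < mounted) { print "ok"; exit 0 }
            print "no evidence that glusterd started before the mount"
            exit 1
        }
    ' "$1"
}

# Example (commented out so the file has no side effects):
# glusterd_before_mount /var/log/messages /firewall-scripts
```

The function prints "ok" and returns 0 only when both messages are present and the glusterd one comes first; anything else returns 1.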
On 07/15/2013 02:59 PM, Greg Scott wrote:
Hmmm - I turn off NetworkManager for my application, but I can easily sleep a while in rc.local before doing mount -av and see what happens. And I will fix up glusterd.service. I'll report back here shortly.
- Greg