[Gluster-users] One node goes offline, the other node can't see the replicated volume anymore

Greg Scott GregScott at infrasupport.com
Tue Jul 16 15:30:15 UTC 2013


Didn’t seem to make a difference.  Still not mounted right after logging in; looks like the same behavior as before.  The mount fails, then my rc.local kicks in and says it succeeded, but the volume doesn’t show as mounted later when I do my “after” df -h.

[root at chicago-fw1 ~]# df -h
Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/fedora-root           14G  3.9G  8.7G  31% /
devtmpfs                         990M     0  990M   0% /dev
tmpfs                            996M     0  996M   0% /dev/shm
tmpfs                            996M  888K  996M   1% /run
tmpfs                            996M     0  996M   0% /sys/fs/cgroup
tmpfs                            996M     0  996M   0% /tmp
/dev/sda2                        477M   87M  365M  20% /boot
/dev/sda1                        200M  9.4M  191M   5% /boot/efi
/dev/mapper/fedora-gluster--fw1  7.9G   33M  7.8G   1% /gluster-fw1
[root at chicago-fw1 ~]#

[root at chicago-fw1 ~]# tail /var/log/messages -c 50000 | more
Jul 16 10:21:23 chicago-fw1 avahi-daemon[446]: New relevant interface enp5s7.IPv4 for mDNS.
Jul 16 10:21:23 chicago-fw1 avahi-daemon[446]: Registering new address record for 10.10.10.71 on enp5s7.IPv4.
Jul 16 10:21:23 chicago-fw1 kernel: [   22.284616] r8169 0000:05:04.0 enp5s4: link up
Jul 16 10:21:24 chicago-fw1 kernel: [   22.996223] r8169 0000:05:07.0 enp5s7: link up
Jul 16 10:21:24 chicago-fw1 kernel: [   22.996240] IPv6: ADDRCONF(NETDEV_CHANGE): enp5s7: link becomes ready
Jul 16 10:21:24 chicago-fw1 network[464]: Bringing up interface enp5s7:  [  OK  ]
Jul 16 10:21:25 chicago-fw1 systemd[1]: Started LSB: Bring up/down networking.
Jul 16 10:21:25 chicago-fw1 systemd[1]: Starting Network.
Jul 16 10:21:25 chicago-fw1 systemd[1]: Reached target Network.
Jul 16 10:21:25 chicago-fw1 systemd[1]: Started Login and scanning of iSCSI devices.
Jul 16 10:21:25 chicago-fw1 systemd[1]: Starting Vsftpd ftp daemon...
Jul 16 10:21:25 chicago-fw1 systemd[1]: Starting RPC bind service...
Jul 16 10:21:25 chicago-fw1 systemd[1]: Starting OpenSSH server daemon...
Jul 16 10:21:25 chicago-fw1 systemd[1]: Starting /etc/rc.d/rc.local Compatibility...
Jul 16 10:21:25 chicago-fw1 systemd[1]: Started RPC bind service.
Jul 16 10:21:25 chicago-fw1 systemd[1]: Starting GlusterFS an clustered file-system server...
Jul 16 10:21:25 chicago-fw1 systemd[1]: Started Vsftpd ftp daemon.
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: Tue Jul 16 10:21:25 CDT 2013
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: Sleeping 30 seconds.
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: Tue Jul 16 10:21:25 CDT 2013
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: Making sure the Gluster stuff is mounted
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: Mounted before mount -av
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: Filesystem                       Size  Used Avail Use% Mounted on
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: /dev/mapper/fedora-root           14G  3.9G  8.7G  31% /
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: devtmpfs                         990M     0  990M   0% /dev
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: tmpfs                            996M     0  996M   0% /dev/shm
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: tmpfs                            996M  2.1M  994M   1% /run
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: tmpfs                            996M     0  996M   0% /sys/fs/cgroup
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: tmpfs                            996M     0  996M   0% /tmp
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: /dev/sda2                        477M   87M  365M  20% /boot
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: /dev/sda1                        200M  9.4M  191M   5% /boot/efi
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: /dev/mapper/fedora-gluster--fw1  7.9G   33M  7.8G   1% /gluster-fw1
Jul 16 10:21:25 chicago-fw1 systemd[1]: Started OpenSSH server daemon.
Jul 16 10:21:25 chicago-fw1 rc.local[1005]: extra arguments at end (ignored)
Jul 16 10:21:25 chicago-fw1 dbus-daemon[465]: dbus[465]: [system] Activating service name='org.fedoraproject.Setroubleshootd' (using servicehelper)
Jul 16 10:21:25 chicago-fw1 dbus[465]: [system] Activating service name='org.fedoraproject.Setroubleshootd' (using servicehelper)
Jul 16 10:21:25 chicago-fw1 kernel: [   23.918403] fuse init (API version 7.21)
Jul 16 10:21:25 chicago-fw1 systemd[1]: Mounted /firewall-scripts.
Jul 16 10:21:25 chicago-fw1 systemd[1]: Starting Remote File Systems.
Jul 16 10:21:25 chicago-fw1 systemd[1]: Reached target Remote File Systems.
Jul 16 10:21:25 chicago-fw1 systemd[1]: Starting Trigger Flushing of Journal to Persistent Storage...
Jul 16 10:21:25 chicago-fw1 systemd[1]: Mounting FUSE Control File System...
Jul 16 10:21:25 chicago-fw1 systemd[1]: Mounted FUSE Control File System.
Jul 16 10:21:28 chicago-fw1 systemd[1]: Started Trigger Flushing of Journal to Persistent Storage.
Jul 16 10:21:28 chicago-fw1 systemd[1]: Starting Permit User Sessions...
Jul 16 10:21:28 chicago-fw1 systemd[1]: Started Permit User Sessions.
Jul 16 10:21:28 chicago-fw1 systemd[1]: Starting Command Scheduler...
Jul 16 10:21:28 chicago-fw1 systemd[1]: Started Command Scheduler.
Jul 16 10:21:28 chicago-fw1 systemd[1]: Starting Job spooling tools...
Jul 16 10:21:28 chicago-fw1 systemd[1]: Started Job spooling tools.
Jul 16 10:21:28 chicago-fw1 avahi-daemon[446]: Registering new address record for fe80::230:18ff:fea2:a340 on enp5s7.*.
Jul 16 10:21:28 chicago-fw1 dbus[465]: [system] Successfully activated service 'org.fedoraproject.Setroubleshootd'
Jul 16 10:21:28 chicago-fw1 dbus-daemon[465]: dbus[465]: [system] Successfully activated service 'org.fedoraproject.Setroubleshootd'
Jul 16 10:21:31 chicago-fw1 audispd: queue is full - dropping event
Jul 16 10:21:31 chicago-fw1 audispd: queue is full - dropping event
Jul 16 10:21:31 chicago-fw1 audispd: queue is full - dropping event
.
.
.
Jul 16 10:21:33 chicago-fw1 audispd: queue is full - dropping event
Jul 16 10:21:33 chicago-fw1 audispd: queue is full - dropping event
Jul 16 10:21:33 chicago-fw1 audispd: queue is full - dropping event
Jul 16 10:21:34 chicago-fw1 systemd[1]: Started GlusterFS an clustered file-system server.
Jul 16 10:21:34 chicago-fw1 systemd[1]: Starting Network is Online.
Jul 16 10:21:34 chicago-fw1 systemd[1]: Reached target Network is Online.
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: Mount failed. Please check the log file for more details.
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /                        : ignored
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /boot                    : already mounted
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /boot/efi                : already mounted
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /gluster-fw1             : already mounted
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: swap                     : ignored
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /firewall-scripts        : successfully mounted
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: Mounted after mount -av
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: Filesystem                       Size  Used Avail Use% Mounted on
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /dev/mapper/fedora-root           14G  3.9G  8.7G  31% /
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: devtmpfs                         990M     0  990M   0% /dev
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: tmpfs                            996M     0  996M   0% /dev/shm
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: tmpfs                            996M  880K  996M   1% /run
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: tmpfs                            996M     0  996M   0% /sys/fs/cgroup
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: tmpfs                            996M     0  996M   0% /tmp
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /dev/sda2                        477M   87M  365M  20% /boot
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /dev/sda1                        200M  9.4M  191M   5% /boot/efi
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: /dev/mapper/fedora-gluster--fw1  7.9G   33M  7.8G   1% /gluster-fw1
Jul 16 10:21:38 chicago-fw1 rc.local[1005]: Starting up firewall common items
Jul 16 10:21:38 chicago-fw1 systemd[1]: Started /etc/rc.d/rc.local Compatibility.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Starting Terminate Plymouth Boot Screen...
Jul 16 10:21:38 chicago-fw1 systemd[1]: Starting Wait for Plymouth Boot Screen to Quit...
Jul 16 10:21:38 chicago-fw1 systemd[1]: Started Terminate Plymouth Boot Screen.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Started Wait for Plymouth Boot Screen to Quit.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Starting Getty on tty1...
Jul 16 10:21:38 chicago-fw1 systemd[1]: Started Getty on tty1.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Starting Login Prompts.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Reached target Login Prompts.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Reached target Multi-User System.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Starting Update UTMP about System Runlevel Changes...
Jul 16 10:21:38 chicago-fw1 systemd[1]: Starting Stop Read-Ahead Data Collection 10s After Completed Startup.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Started Stop Read-Ahead Data Collection 10s After Completed Startup.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Started Update UTMP about System Runlevel Changes.
Jul 16 10:21:38 chicago-fw1 systemd[1]: Startup finished in 1.474s (kernel) + 2.210s (initrd) + 33.180s (userspace) = 36.866s.
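
For reference, the mount check in rc.local that produces those messages looks roughly like this.  This is paraphrased, not a verbatim copy, so the exact echo text and the firewall startup call may differ:

#!/bin/sh
# Gluster mount check in /etc/rc.d/rc.local (reconstructed sketch)
date
echo "Sleeping 30 seconds."
sleep 30
date
echo "Making sure the Gluster stuff is mounted"
echo "Mounted before mount -av"
df -h
# Retry anything in /etc/fstab that is not mounted yet,
# including the GlusterFS mount on /firewall-scripts.
mount -av
echo "Mounted after mount -av"
df -h
echo "Starting up firewall common items"
# ... firewall startup continues from here ...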


[root at chicago-fw1 ~]# more /usr/lib/systemd/system/glusterd.service
[Unit]
Description=GlusterFS an clustered file-system server
After=network.target rpcbind.service
Before=network-online.target

[Service]
Type=forking
PIDFile=/run/glusterd.pid
LimitNOFILE=65536
ExecStart=/usr/sbin/glusterd -p /run/glusterd.pid
KillMode=process

[Install]
WantedBy=multi-user.target
[root at chicago-fw1 ~]#
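
For completeness, the entry that mount -av is retrying is a plain GlusterFS FUSE mount in /etc/fstab, something along these lines.  The server name, volume name, and option list here are illustrative placeholders, not a copy of the real entry:

# /etc/fstab (illustrative GlusterFS client mount)
fw1:/firewall-scripts   /firewall-scripts   glusterfs   defaults,_netdev   0 0

With _netdev the mount should be treated as a remote filesystem and ordered after the network targets, so pulling glusterd in front of network-online.target (the Before= line in the unit above) ought to get the daemon running before the mount attempt, which is the ordering Joe's gist is aiming for.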


- Greg

From: Joe Julian [mailto:joe at julianfamily.org]
Sent: Tuesday, July 16, 2013 10:09 AM
To: Greg Scott
Cc: gluster-users at gluster.org
Subject: Re: [Gluster-users] One node goes offline, the other node can't see the replicated volume anymore

Try this: https://gist.github.com/joejulian/6009570 and see if it works any better. We're looking for "GlusterFS an clustered file-system server" to appear earlier than the mounting.

On 07/15/2013 02:59 PM, Greg Scott wrote:

Hmmm - I turn off NetworkManager for my application, but I can easily sleep a while in rc.local before doing mount -av and see what happens.  And I will fix up glusterd.service.  I'll report back here shortly.



- Greg


