[Gluster-users] Mount problems when secondary node down

Pranith Kumar Karampuri pkarampu at redhat.com
Tue Nov 11 01:19:50 UTC 2014


On 11/10/2014 11:47 PM, A F wrote:
> Hello,
>
> I have two servers, 192.168.0.10 and 192.168.2.10, running gluster 
> 3.6.1 (installed from the gluster repo) on Amazon Linux. Both servers 
> are fully reachable on the LAN.
> # rpm -qa|grep gluster
> glusterfs-3.6.1-1.el6.x86_64
> glusterfs-server-3.6.1-1.el6.x86_64
> glusterfs-libs-3.6.1-1.el6.x86_64
> glusterfs-api-3.6.1-1.el6.x86_64
> glusterfs-cli-3.6.1-1.el6.x86_64
> glusterfs-fuse-3.6.1-1.el6.x86_64
>
> These are the commands I ran:
> # gluster peer probe 192.168.2.10
> # gluster volume create aloha replica 2 transport tcp 
> 192.168.0.10:/var/aloha 192.168.2.10:/var/aloha force
> # gluster volume start aloha
> # gluster volume set aloha network.ping-timeout 5
> # gluster volume set aloha nfs.disable on
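>
> To double-check that the options actually took effect, they can be read 
> back from the volume's "Options Reconfigured" section (standard gluster 
> CLI; I'm not pasting the output since it just echoes the settings above):

```
# gluster volume info aloha
```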
>
> Problem number 1:
> tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log shows the 
> log filling up with:
> [2014-11-10 17:41:26.328796] W [socket.c:611:__socket_rwv] 
> 0-management: readv on 
> /var/run/38c520c774793c9cdae8ace327512027.socket failed (Invalid 
> argument)
> This happens every 3 seconds on both servers. It appears related to NFS 
> and probably rpcbind, but I want both of them disabled. As you can see, 
> I've set the volume to disable NFS - why doesn't glusterd keep quiet 
> about it then?
>
> Problem number 2:
> in fstab on server 192.168.0.10:
>   192.168.0.10:/aloha  /var/www/hawaii  glusterfs  defaults,_netdev  0 0
> in fstab on server 192.168.2.10:
>   192.168.2.10:/aloha  /var/www/hawaii  glusterfs  defaults,_netdev  0 0
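>
> As an aside, since each client mounts from only one server, I wonder if 
> a fallback volfile server would help; a sketch of what the fstab line 
> could look like (option name taken from the mount.glusterfs docs, 
> untested on this setup):

```
192.168.0.10:/aloha  /var/www/hawaii  glusterfs  defaults,_netdev,backup-volfile-servers=192.168.2.10  0 0
```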
>
> If I shut down one of the servers (192.168.2.10) and reboot the 
> remaining one (192.168.0.10), it doesn't come back up as fast as it 
> should: boot lags for a few minutes waiting on gluster. After it 
> eventually starts, the mount point is not mounted and the volume is 
> stopped:
> # gluster volume status
> Status of volume: aloha
> Gluster process                                         Port Online  Pid
> ------------------------------------------------------------------------------ 
>
> Brick 192.168.0.10:/var/aloha                           N/A N       N/A
> Self-heal Daemon on localhost                           N/A N       N/A
>
> Task Status of Volume aloha
> ------------------------------------------------------------------------------ 
>
> There are no active volume tasks
>
> This didn't happen before. Fine - I first stop the volume and then 
> start it again, and the brick now shows as online:
> Brick 192.168.0.10:/var/aloha                           49155 Y       3473
> Self-heal Daemon on localhost                           N/A   Y       3507
>
> # time mount -a
> real    2m7.307s
>
> # time mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii
> real    2m7.365s
>
> # strace mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii
> (attached)
>
> # tail /var/log/glusterfs/* -f|grep -v readv
> (attached)
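>
> If it helps, I can re-run the slow mount with client-side debug logging 
> turned up; a sketch, assuming the usual mount.glusterfs options (the 
> log file path here is my own choice):

```
mount -t glusterfs -o log-level=DEBUG,log-file=/var/log/glusterfs/hawaii-mount.log \
    192.168.0.10:/aloha /var/www/hawaii
```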
>
> I've done this setup before, so I'm surprised it doesn't work. I even 
> have the same options and setup in production at the moment, and there, 
> for example, I'm not getting the readv errors. I'm unable to test the 
> mount behaviour on the production systems, but I believe I covered it 
> back when I was testing the environment.
> Any help is kindly appreciated.
CC glusterd folks

Pranith
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
