[Gluster-users] Write operations failing on clients

Ben Turner bturner at redhat.com
Thu Apr 30 21:10:39 UTC 2015


Are your files split-brained? You can check with:

gluster v heal img info split-brain

I see a lot of problems with your self-heal daemon connecting:

[2015-04-29 16:15:37.137215] E [socket.c:2161:socket_connect_finish] 0-img-client-4: connection to 192.168.114.185:49154 failed (Connection refused)
[2015-04-29 16:15:37.434035] E [client-handshake.c:1760:client_query_portmap_cbk] 0-img-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-04-29 16:15:40.308730] E [afr-self-heald.c:1479:afr_find_child_position] 0-img-replicate-2: getxattr failed on img-client-5 - (Transport endpoint is not connected)
[2015-04-29 16:15:40.308878] E [afr-self-heald.c:1479:afr_find_child_position] 0-img-replicate-1: getxattr failed on img-client-3 - (Transport endpoint is not connected)
[2015-04-29 16:15:41.192965] E [client-handshake.c:1760:client_query_portmap_cbk] 0-img-client-3: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-04-29 16:20:23.184879] E [socket.c:2161:socket_connect_finish] 0-img-client-1: connection to 192.168.114.182:24007 failed (Connection refused)
[2015-04-29 16:21:01.684625] E [client-handshake.c:1760:client_query_portmap_cbk] 0-img-client-1: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-04-29 16:24:14.211163] E [socket.c:2161:socket_connect_finish] 0-img-client-1: connection to 192.168.114.182:49152 failed (Connection refused)
[2015-04-29 16:24:18.213126] E [socket.c:2161:socket_connect_finish] 0-img-client-1: connection to 192.168.114.182:49152 failed (Connection refused)
[2015-04-29 16:24:22.212902] E [socket.c:2161:socket_connect_finish] 0-img-client-1: connection to 192.168.114.182:49152 failed (Connection refused)
[2015-04-29 16:24:26.213708] E [socket.c:2161:socket_connect_finish] 0-img-client-1: connection to 192.168.114.182:49152 failed (Connection refused)
[2015-04-29 16:24:30.214324] E [socket.c:2161:socket_connect_finish] 0-img-client-1: connection to 192.168.114.182:49152 failed (Connection refused)
[2015-04-29 16:24:34.214816] E [socket.c:2161:socket_connect_finish] 0-img-client-1: connection to 192.168.114.182:49152 failed (Connection refused)
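
To see at a glance which endpoints are refusing connections, errors like the ones above can be tallied per address. This is only a rough sketch, not a gluster tool: the function name count_refused is made up for illustration, and it assumes the log lines have the same "connection to HOST:PORT failed (Connection refused)" shape as the excerpt.

```shell
# count_refused: read gluster log lines on stdin and print how many
# "Connection refused" errors were logged per host:port, most frequent first.
# (Hypothetical helper; assumes the message format shown in the excerpt.)
count_refused() {
    grep 'Connection refused' \
        | sed -n 's/.*connection to \([0-9.]*:[0-9]*\) failed.*/\1/p' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $2, $1 }'
}
```

You would feed it the self-heal daemon log, for example "count_refused < /var/log/glusterfs/glustershd.log" (that path is the usual default, but it is an assumption about your install). An address that keeps showing up is a brick or glusterd that is worth checking with 'gluster volume status'.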

It looks like the network was flapping up and down, and files may have become split-brained.  Whenever I am bouncing services, I usually run:

$ service glusterd stop
$ killall glusterfs
$ killall glusterfsd
$ ps aux | grep glu  <- Make sure everything is actually cleaned up
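
One small wrinkle with the ps/grep check is that the grep process itself matches "glu". If you want a check that is easy to script, something like the following could work. It is a sketch only: leftover_gluster is a made-up name, and it assumes input in the two-column form produced by "ps -eo pid,comm".

```shell
# leftover_gluster: read "PID COMMAND" lines on stdin and print those whose
# command name starts with "gluster" (glusterd, glusterfs, glusterfsd).
# Returns 0 if any were found, 1 if nothing gluster-related is left.
# (Hypothetical helper; expects `ps -eo pid,comm` style input.)
leftover_gluster() {
    awk '$2 ~ /^gluster/ { print; found = 1 } END { exit !found }'
}
```

Usage would be "ps -eo pid,comm | leftover_gluster"; no output and a non-zero exit status means everything is cleaned up.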

Anytime you take a node offline and bring it back online, make sure the files get resynced with a self-heal before you take any other nodes offline:

$ gluster v heal img full
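
To tell when the resync is done, you can look at how many entries 'gluster v heal img info' still lists. Below is a rough sketch that totals them up. The name pending_heals is made up, and it assumes the command prints one "Number of entries: N" line per brick, which is what I would expect from the 3.x releases; check the output on your version first.

```shell
# pending_heals: read `gluster volume heal <vol> info` output on stdin and
# print the total of all "Number of entries: N" lines across bricks.
# (Hypothetical helper; the output format is an assumption, see above.)
pending_heals() {
    awk -F': *' '/^Number of entries:/ { total += $2 } END { print total + 0 }'
}
```

Usage would be "gluster v heal img info | pending_heals". A result of 0 only means no entries were listed at that moment, and a brick that is down contributes nothing to the total, so confirm all bricks are online with 'gluster volume status' before trusting it.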

If you do see split-brained files, you can resolve them by following:

http://blog.gluster.org/category/howtos/
https://joejulian.name/blog/fixing-split-brain-with-glusterfs-33/

LMK if you see any split-brained files.

-b

----- Original Message -----
> From: "Alex" <alex.m at icecat.biz>
> To: gluster-users at gluster.org
> Sent: Thursday, April 30, 2015 9:26:04 AM
> Subject: Re: [Gluster-users] Write operations failing on clients
> 
> Oh and this is output of some status commands:
> http://termbin.com/bvzz
> 
> Mount\umount worked just fine.
> 
> Alex
> 
> 
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
> 

