[Gluster-users] Mount fails after outage

Gurdeep Singh (Guru) guru at bazaari.com.au
Sun Oct 5 04:09:21 UTC 2014


Hello,

There was an outage on one of our servers, and since the reboot, mounting the volume on that server fails with the following error:

-bash-4.1# mount -t glusterfs srv1:/gv0 /var/www/html/image/
Mount failed. Please check the log file for more details.
-bash-4.1# 
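
One way to capture more detail about the failing mount is to send the client log to a known file at DEBUG level. This is a minimal sketch, assuming the log-level and log-file mount options of mount.glusterfs are available in the installed version. The helper name debug_mount_cmd and the path /tmp/gv0-mount.log are hypothetical; the helper only prints the command and does not run it.

```shell
# Hypothetical helper: print (do not run) a mount command that writes a
# DEBUG-level client log to a chosen file.
#   $1 = volume spec, $2 = mount point, $3 = log file (arbitrary path)
debug_mount_cmd() {
    printf 'mount -t glusterfs -o log-level=DEBUG,log-file=%s %s %s\n' \
        "$3" "$1" "$2"
}

# Print the command for the mount attempted above.
debug_mount_cmd srv1:/gv0 /var/www/html/image/ /tmp/gv0-mount.log
```

The printed command can then be run as root, and the chosen log file inspected after the mount attempt.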

In the glustershd.log file, I see the following:

Final graph:
+------------------------------------------------------------------------------+
  1: volume gv0-client-0
  2:     type protocol/client
  3:     option remote-host srv1
  4:     option remote-subvolume /root/gluster-vol0
  5:     option transport-type socket
  6:     option username 300c24e9-ac51-4735-b1ee-7acdd985ccd5
  7:     option password 989d61f9-8393-4402-8d3f-988d18e832a6
  8: end-volume
  9: 
 10: volume gv0-client-1
 11:     type protocol/client
 12:     option remote-host srv2
 13:     option remote-subvolume /root/gluster-vol0
 14:     option transport-type socket
 15:     option username 300c24e9-ac51-4735-b1ee-7acdd985ccd5
 16:     option password 989d61f9-8393-4402-8d3f-988d18e832a6
 17: end-volume
 18: 
 19: volume gv0-replicate-0
 20:     type cluster/replicate
 21:     option node-uuid c531d907-2f86-4bec-9ae7-8318e28295bc
 22:     option background-self-heal-count 0
 23:     option metadata-self-heal on
 24:     option data-self-heal on
 25:     option entry-self-heal on
 26:     option self-heal-daemon on
 27:     option iam-self-heal-daemon yes
 28:     subvolumes gv0-client-0 gv0-client-1
 29: end-volume
 30: 
 31: volume glustershd
 32:     type debug/io-stats
 33:     subvolumes gv0-replicate-0
 34: end-volume
 35: 
+------------------------------------------------------------------------------+
[2014-10-05 03:54:30.790905] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-gv0-client-0: changing port to 49152 (from 0)
[2014-10-05 03:54:30.798689] I [client-handshake.c:1659:select_server_supported_programs] 0-gv0-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2014-10-05 03:54:30.805120] I [client-handshake.c:1456:client_setvolume_cbk] 0-gv0-client-0: Connected to 127.0.0.1:49152, attached to remote volume '/root/gluster-vol0'.
[2014-10-05 03:54:30.805163] I [client-handshake.c:1468:client_setvolume_cbk] 0-gv0-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2014-10-05 03:54:30.805250] I [afr-common.c:4120:afr_notify] 0-gv0-replicate-0: Subvolume 'gv0-client-0' came back up; going online.
[2014-10-05 03:54:30.807784] I [client-handshake.c:450:client_set_lk_version_cbk] 0-gv0-client-0: Server lk version = 1
[2014-10-05 03:54:30.808566] I [afr-self-heald.c:1687:afr_dir_exclusive_crawl] 0-gv0-replicate-0: Another crawl is in progress for gv0-client-0
[2014-10-05 03:54:30.808614] E [afr-self-heald.c:1479:afr_find_child_position] 0-gv0-replicate-0: getxattr failed on gv0-client-1 - (Transport endpoint is not connected)
[2014-10-05 03:54:30.818679] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-gv0-client-1: changing port to 49152 (from 0)
[2014-10-05 03:54:30.828616] I [client-handshake.c:1659:select_server_supported_programs] 0-gv0-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2014-10-05 03:54:30.844354] I [client-handshake.c:1456:client_setvolume_cbk] 0-gv0-client-1: Connected to 10.8.0.6:49152, attached to remote volume '/root/gluster-vol0'.
[2014-10-05 03:54:30.844388] I [client-handshake.c:1468:client_setvolume_cbk] 0-gv0-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2014-10-05 03:54:30.849128] I [client-handshake.c:450:client_set_lk_version_cbk] 0-gv0-client-1: Server lk version = 1
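
Most of these entries are informational (level I). As a rough sketch, a filter like the one below keeps only error (E) and warning (W) entries. It assumes the "[date time] LEVEL [source] message" layout seen in the excerpt above; the function name gluster_log_problems and the example path are hypothetical.

```shell
# Hypothetical helper: print only error (E) and warning (W) entries from a
# GlusterFS log, assuming lines start with "[date time] LEVEL ".
gluster_log_problems() {
    grep -E '^\[[0-9-]+ [0-9:.]+\] [EW] ' "$1"
}

# Example (the path is an assumption; adjust to where the log lives):
# gluster_log_problems /var/log/glusterfs/glustershd.log
```

Applied to the glustershd.log excerpt above, this leaves a single entry: the getxattr failure on gv0-client-1.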

There is also an nfs.log file, which shows this:

Final graph:
+------------------------------------------------------------------------------+
  1: volume gv0-client-0
  2:     type protocol/client
  3:     option remote-host srv1
  4:     option remote-subvolume /root/gluster-vol0
  5:     option transport-type socket
  6:     option username 300c24e9-ac51-4735-b1ee-7acdd985ccd5
  7:     option password 989d61f9-8393-4402-8d3f-988d18e832a6
  8:     option send-gids true
  9: end-volume
 10: 
 11: volume gv0-client-1
 12:     type protocol/client
 13:     option remote-host srv2
 14:     option remote-subvolume /root/gluster-vol0
 15:     option transport-type socket
 16:     option username 300c24e9-ac51-4735-b1ee-7acdd985ccd5
 17:     option password 989d61f9-8393-4402-8d3f-988d18e832a6
 18:     option send-gids true
 19: end-volume
 20: 
 21: volume gv0-replicate-0
 22:     type cluster/replicate
 23:     subvolumes gv0-client-0 gv0-client-1
 24: end-volume
 25: 
 26: volume gv0-dht
 27:     type cluster/distribute
 28:     option lookup-unhashed on
 29:     subvolumes gv0-replicate-0
 30: end-volume
 31: 
 32: volume gv0-write-behind
 33:     type performance/write-behind
 34:     subvolumes gv0-dht
 35: end-volume
 36: 
 37: volume gv0
 38:     type debug/io-stats
 39:     option latency-measurement off
 40:     option count-fop-hits off
 41:     subvolumes gv0-write-behind
 42: end-volume
 43: 
 44: volume nfs-server
 45:     type nfs/server
 46:     option rpc-auth.auth-glusterfs on
 47:     option rpc-auth.auth-unix on
 48:     option rpc-auth.auth-null on
 49:     option rpc-auth.ports.insecure on
 50:     option rpc-auth-allow-insecure on
 51:     option transport-type socket
 52:     option transport.socket.listen-port 2049
 53:     option nfs.dynamic-volumes on
 54:     option nfs.nlm on
 55:     option nfs.drc off
 56:     option rpc-auth.addr.gv0.allow *
 57:     option nfs3.gv0.volume-id dc8dc3f2-f5bd-4047-9101-acad04695442
 58:     subvolumes gv0
 59: end-volume
 60: 
+------------------------------------------------------------------------------+
[2014-10-05 03:54:30.832422] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-gv0-client-0: changing port to 49152 (from 0)
[2014-10-05 03:54:30.835888] I [client-handshake.c:1659:select_server_supported_programs] 0-gv0-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2014-10-05 03:54:30.836157] I [client-handshake.c:1456:client_setvolume_cbk] 0-gv0-client-0: Connected to 127.0.0.1:49152, attached to remote volume '/root/gluster-vol0'.
[2014-10-05 03:54:30.836174] I [client-handshake.c:1468:client_setvolume_cbk] 0-gv0-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2014-10-05 03:54:30.836393] I [afr-common.c:4120:afr_notify] 0-gv0-replicate-0: Subvolume 'gv0-client-0' came back up; going online.
[2014-10-05 03:54:30.836430] I [client-handshake.c:450:client_set_lk_version_cbk] 0-gv0-client-0: Server lk version = 1
[2014-10-05 03:54:30.839191] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-gv0-client-1: changing port to 49152 (from 0)
[2014-10-05 03:54:30.850953] I [client-handshake.c:1659:select_server_supported_programs] 0-gv0-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2014-10-05 03:54:30.851821] I [client-handshake.c:1456:client_setvolume_cbk] 0-gv0-client-1: Connected to 10.8.0.6:49152, attached to remote volume '/root/gluster-vol0'.
[2014-10-05 03:54:30.851843] I [client-handshake.c:1468:client_setvolume_cbk] 0-gv0-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2014-10-05 03:54:30.853062] I [client-handshake.c:450:client_set_lk_version_cbk] 0-gv0-client-1: Server lk version = 1

srv1 (10.8.0.1) also acts as the VPN server that srv2 (10.8.0.6) connects to.

The volume appears to be up on both srv1 and srv2:

-bash-4.1# gluster volume info
 
Volume Name: gv0
Type: Replicate
Volume ID: dc8dc3f2-f5bd-4047-9101-acad04695442
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: srv1:/root/gluster-vol0
Brick2: srv2:/root/gluster-vol0
Options Reconfigured:
cluster.lookup-unhashed: on
performance.cache-refresh-timeout: 60
performance.cache-size: 1GB
storage.health-check-interval: 30

[guru@srv2 ~]$ sudo gluster volume info
 
Volume Name: gv0
Type: Replicate
Volume ID: dc8dc3f2-f5bd-4047-9101-acad04695442
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: srv1:/root/gluster-vol0
Brick2: srv2:/root/gluster-vol0
Options Reconfigured:
cluster.lookup-unhashed: on
performance.cache-refresh-timeout: 60
performance.cache-size: 1GB
storage.health-check-interval: 30
[guru@srv2 ~]$ 


However, I am still unable to mount the volume on that folder.

Please suggest how we can troubleshoot this issue.

Regards,
Guru.
