[Gluster-users] Upgrading from 3.6.2-1 to 3.6.2-2 causes "failed to get the 'volume file' from server"
Niels de Vos
ndevos at redhat.com
Thu Feb 26 07:59:23 UTC 2015
On Wed, Feb 25, 2015 at 11:52:02AM -0800, Michael Bushey wrote:
> On a Debian testing glusterfs cluster, one node of six (web1) was
> upgraded from 3.6.2-1 to 3.6.2-2. Everything looks good on the server
> side, and gdash looks happy. The problem is that this node is no
> longer able to mount the volumes. The server config is managed with
> Ansible, so the nodes should be consistent.
This sounds very much like this issue:
www.gluster.org/pipermail/gluster-users/2015-February/020781.html
We're now working on getting the packagers of the different
distributions aligned and better informed, so that packaging
differences like this are identified earlier and the resulting issues
prevented.
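
In the meantime, it is worth double-checking that all six nodes really
run the same package build. Something like this should show any
mismatch (assuming Debian's dpkg, and an Ansible inventory group named
"gluster" -- adjust both to your setup):

  # on a single node
  dpkg -l 'glusterfs*' | grep ^ii

  # or across all nodes with an ad-hoc Ansible run
  ansible gluster -m shell -a "dpkg -l 'glusterfs*' | grep ^ii"

If web1 reports 3.6.2-2 while the other nodes still have 3.6.2-1, a
mixed 3.6.2-1/3.6.2-2 pool would line up with the symptoms you see.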
HTH,
Niels
>
>
> web1# mount -t glusterfs localhost:/site-private
> Mount failed. Please check the log file for more details.
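
If you need more detail than the default client log gives, the
mount.glusterfs helper accepts log options on the command line; a
debug-level mount could look roughly like this (mount point taken from
the log further down, log file path is just an example):

  mount -t glusterfs -o log-level=DEBUG,log-file=/var/log/glusterfs/site-private-debug.log \
      localhost:/site-private /var/www/html/site.example.com/sites/default/private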
>
>
> web1# gluster volume info site-private
>
> Volume Name: site-private
> Type: Distributed-Replicate
> Volume ID: 53cb154d-7e44-439f-b52c-ca10414327cb
> Status: Started
> Number of Bricks: 2 x 2 = 4
> Transport-type: tcp
> Bricks:
> Brick1: web4:/var/gluster/site-private
> Brick2: web5:/var/gluster/site-private
> Brick3: web3:/var/gluster/site-private
> Brick4: webw:/var/gluster/site-private
> Options Reconfigured:
> nfs.disable: on
> auth.allow: 10.*
>
> web1# gluster volume status site-private
> Status of volume: site-private
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick web4:/var/gluster/site-private            49152   Y       18544
> Brick web5:/var/gluster/site-private            49152   Y       3460
> Brick web3:/var/gluster/site-private            49152   Y       1171
> Brick webw:/var/gluster/site-private            49152   Y       8954
> Self-heal Daemon on localhost                   N/A     Y       1410
> Self-heal Daemon on web3                        N/A     Y       6394
> Self-heal Daemon on web5                        N/A     Y       3726
> Self-heal Daemon on web4                        N/A     Y       18928
> Self-heal Daemon on 10.0.0.22                   N/A     Y       3601
> Self-heal Daemon on 10.0.0.153                  N/A     Y       23269
>
> Task Status of Volume site-private
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> 10.0.0.22 is web2, 10.0.0.153 is webw. It's irritating that gluster
> intermittently shows some peers by IP address instead of by hostname.
> Is there any way to fix this inconsistency?
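
The mixed hostnames and IP addresses usually reflect how each peer was
first probed: a node that was added to the pool by its IP address keeps
being reported by that IP. As far as I know, probing such a peer again
by its hostname from another node in the pool makes glusterd record the
name as well, roughly:

  web1# gluster peer probe web2
  web1# gluster peer probe webw
  web1# gluster peer status

That may need to be repeated for every peer that still shows up as an
IP; gluster peer status should confirm the result.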
>
>
> web1# tail -f /var/log/glusterfs/var-www-html-site.example.com-sites-default-private.log
>
> [2015-02-25 00:49:14.294562] I [MSGID: 100030]
> [glusterfsd.c:2018:main] 0-/usr/sbin/glusterfs: Started running
> /usr/sbin/glusterfs version 3.6.2 (args: /usr/sbin/glusterfs
> --volfile-server=localhost --volfile-id=/site-private
> /var/www/html/site.example.com/sites/default/private)
> [2015-02-25 00:49:14.303008] E
> [glusterfsd-mgmt.c:1494:mgmt_getspec_cbk] 0-glusterfs: failed to get
> the 'volume file' from server
> [2015-02-25 00:49:14.303153] E
> [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk] 0-mgmt: failed to fetch
> volume file (key:/site-private)
> [2015-02-25 00:49:14.303595] W [glusterfsd.c:1194:cleanup_and_exit]
> (--> 0-: received signum (0), shutting down
> [2015-02-25 00:49:14.303673] I [fuse-bridge.c:5599:fini] 0-fuse:
> Unmounting '/var/www/html/site.example.com/sites/default/private'.
>
> These lines appear in
> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log about every 5
> seconds:
> [2015-02-25 01:00:04.312532] W [socket.c:611:__socket_rwv]
> 0-management: readv on
> /var/run/0ecb037a7fd562bf0d7ed973ccd33ed8.socket failed (Invalid
> argument)
>
>
> Thanks in advance for your time/help. :)
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users