[Gluster-users] volume not working after yum update - gluster 3.6.3
Kingsley
gluster at gluster.dogwind.com
Mon Aug 10 13:49:21 UTC 2015
Further to this, the volume doesn't seem overly healthy. Any idea how I
can get it back into a working state?
Trying to access one particular directory on the clients just hangs. If
I query heal info, that directory appears in the output as possibly
undergoing heal (actual directory name changed as it's private info):
[root at gluster1b-1 ~]# gluster volume heal callrec info
Brick gluster1a-1.dns99.co.uk:/data/brick/callrec/
<gfid:164f888f-2049-49e6-ad26-c758ee091863>
/recordings/834723/14391 - Possibly undergoing heal
<gfid:e280b40c-d8b7-43c5-9da7-4737054d7a7f>
<gfid:b1fbda4a-732f-4f5d-b5a1-8355d786073e>
<gfid:edb74524-b4b7-4190-85e7-4aad002f6e7c>
<gfid:9b8b8446-1e27-4113-93c2-6727b1f457eb>
<gfid:650efeca-b45c-413b-acc3-f0a5853ccebd>
Number of entries: 7
Brick gluster1b-1.dns99.co.uk:/data/brick/callrec/
Number of entries: 0
Brick gluster2a-1.dns99.co.uk:/data/brick/callrec/
<gfid:e280b40c-d8b7-43c5-9da7-4737054d7a7f>
<gfid:164f888f-2049-49e6-ad26-c758ee091863>
<gfid:650efeca-b45c-413b-acc3-f0a5853ccebd>
<gfid:b1fbda4a-732f-4f5d-b5a1-8355d786073e>
/recordings/834723/14391 - Possibly undergoing heal
<gfid:edb74524-b4b7-4190-85e7-4aad002f6e7c>
<gfid:9b8b8446-1e27-4113-93c2-6727b1f457eb>
Number of entries: 7
Brick gluster2b-1.dns99.co.uk:/data/brick/callrec/
Number of entries: 0
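In case it helps with diagnosing, I believe the <gfid:...> entries can be
mapped back to real paths on a brick, since for regular files the .glusterfs
entry is a hard link. A rough sketch (run on one of the bricks listing the
entry, using the first gfid from the output above; untested against these
particular entries):

# each gfid is stored on the brick under .glusterfs/<aa>/<bb>/<full-gfid>
GFID=164f888f-2049-49e6-ad26-c758ee091863
find /data/brick/callrec -samefile \
    "/data/brick/callrec/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID" \
    -not -path '*/.glusterfs/*'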
If I query each brick directly for the number of files/directories within
that directory (/recordings/834723/14391), I get 1731 on gluster1a-1 and
gluster2a-1, but 1737 on the other two, using this command:
# find /data/brick/callrec/recordings/834723/14391 -print | wc -l
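To compare all four bricks in one go, something like this should work
(assuming passwordless ssh between the gluster boxes):

for h in gluster1a-1 gluster1b-1 gluster2a-1 gluster2b-1; do
    printf '%s: ' "$h"
    ssh "$h" 'find /data/brick/callrec/recordings/834723/14391 -print | wc -l'
done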
Cheers,
Kingsley.
On Mon, 2015-08-10 at 11:05 +0100, Kingsley wrote:
> Sorry for the blind panic - restarting the volume seems to have fixed
> it.
>
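> (By "restarting the volume" I mean stopping and starting the volume itself,
> roughly:
>
>   gluster volume stop callrec
>   gluster volume start callrec
>
> rather than restarting glusterd on the bricks.)
>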
> But then my next question - why is this necessary? Surely it undermines
> the whole point of a high availability system?
>
> Cheers,
> Kingsley.
>
> On Mon, 2015-08-10 at 10:53 +0100, Kingsley wrote:
> > Hi,
> >
> > We have a 4 way replicated volume using gluster 3.6.3 on CentOS 7.
> >
> > Over the weekend I did a yum update on each of the bricks in turn, but
> > now when clients (using fuse mounts) try to access the volume, it hangs.
> > Gluster itself wasn't updated (we've disabled that repo so that we keep
> > to 3.6.3 for now).
> >
> > This was what I did:
> >
> > * on first brick, "yum update"
> > * reboot brick
> > * watch "gluster volume status" on another brick and wait for it
> > to say all 4 bricks are online before proceeding to update the
> > next brick
> >
> > I was expecting the clients might pause for 30 seconds or so while they
> > noticed a brick was offline, but then recover.
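> >
> > In hindsight I suspect I should also have waited for self-heal to finish
> > before moving on to the next brick, not just for the brick to show as
> > online. A rough sketch of that per-brick wait (not what I actually ran,
> > and the column positions are assumed from the status output below):
> >
> > # wait until every brick process reports online (Y)
> > until gluster volume status callrec | \
> >     awk '/^Brick/ && $(NF-1) != "Y" {bad=1} END {exit bad}'; do
> >     sleep 10
> > done
> > # then wait until heal info reports zero pending entries on every brick
> > while gluster volume heal callrec info | grep -q 'Number of entries: [1-9]'; do
> >     sleep 10
> > done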
> >
> > I've tried re-mounting clients, but that hasn't helped.
> >
> > I can't see much data in any of the log files.
> >
> > I've tried "gluster volume heal callrec" but it doesn't seem to have
> > helped.
> >
> > What shall I do next?
> >
> > I've pasted some stuff below in case any of it helps.
> >
> > Cheers,
> > Kingsley.
> >
> > [root at gluster1b-1 ~]# gluster volume info callrec
> >
> > Volume Name: callrec
> > Type: Replicate
> > Volume ID: a39830b7-eddb-4061-b381-39411274131a
> > Status: Started
> > Number of Bricks: 1 x 4 = 4
> > Transport-type: tcp
> > Bricks:
> > Brick1: gluster1a-1:/data/brick/callrec
> > Brick2: gluster1b-1:/data/brick/callrec
> > Brick3: gluster2a-1:/data/brick/callrec
> > Brick4: gluster2b-1:/data/brick/callrec
> > Options Reconfigured:
> > performance.flush-behind: off
> > [root at gluster1b-1 ~]#
> >
> >
> > [root at gluster1b-1 ~]# gluster volume status callrec
> > Status of volume: callrec
> > Gluster process Port Online Pid
> > ------------------------------------------------------------------------------
> > Brick gluster1a-1:/data/brick/callrec 49153 Y 6803
> > Brick gluster1b-1:/data/brick/callrec 49153 Y 2614
> > Brick gluster2a-1:/data/brick/callrec 49153 Y 2645
> > Brick gluster2b-1:/data/brick/callrec 49153 Y 4325
> > NFS Server on localhost 2049 Y 2769
> > Self-heal Daemon on localhost N/A Y 2789
> > NFS Server on gluster2a-1 2049 Y 2857
> > Self-heal Daemon on gluster2a-1 N/A Y 2814
> > NFS Server on 88.151.41.100 2049 Y 6833
> > Self-heal Daemon on 88.151.41.100 N/A Y 6824
> > NFS Server on gluster2b-1 2049 Y 4428
> > Self-heal Daemon on gluster2b-1 N/A Y 4387
> >
> > Task Status of Volume callrec
> > ------------------------------------------------------------------------------
> > There are no active volume tasks
> >
> > [root at gluster1b-1 ~]#
> >
> >
> > [root at gluster1b-1 ~]# gluster volume heal callrec info
> > Brick gluster1a-1.dns99.co.uk:/data/brick/callrec/
> > /to_process - Possibly undergoing heal
> >
> > Number of entries: 1
> >
> > Brick gluster1b-1.dns99.co.uk:/data/brick/callrec/
> > Number of entries: 0
> >
> > Brick gluster2a-1.dns99.co.uk:/data/brick/callrec/
> > /to_process - Possibly undergoing heal
> >
> > Number of entries: 1
> >
> > Brick gluster2b-1.dns99.co.uk:/data/brick/callrec/
> > Number of entries: 0
> >
> > [root at gluster1b-1 ~]#
> >
> >