[Gluster-users] "mismatching layouts" errors after expanding volume

Thu Feb 23 13:58:27 UTC 2012

Thanks Jeff, that's interesting.

It is reassuring to know that these errors are self repairing.  That 
does appear to be happening, but only when I run "find -print0 | xargs 
--null stat >/dev/null" in affected directories.  I will run that 
self-heal on the whole volume as well, but I have had to start with 
specific directories that people want to work in today.  Does repeating 
the fix-layout operation have any effect, or are the xattr repairs all 
done by the self-heal mechanism?
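
For anyone who would rather script the directory walk than use find and xargs, here is a minimal Python sketch of the same idea. It assumes, as with the command above, that a plain lookup/stat of each entry through the client mount is what triggers the repair. The function name stat_tree is my own and not part of any Gluster tooling.

```python
import os

def stat_tree(root):
    """lstat every entry under root (and root itself), roughly like
    `find -print0 | xargs --null stat`.

    Returns (number of entries stat'ed, list of paths that raised OSError).
    """
    checked = 0
    failed = []

    def touch(path):
        nonlocal checked
        try:
            os.lstat(path)          # result is discarded; only the lookup matters
            checked += 1
        except OSError:
            failed.append(path)     # keep going, report problem paths at the end

    touch(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            touch(os.path.join(dirpath, name))
    return checked, failed
```

Calling stat_tree on one affected directory at a time gives a count of entries visited plus a list of any paths that could not be stat'ed, which may be useful for spotting directories that still have problems.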

I have found the cause of the transient brick failure; it happened again 
this morning on a replicated pair of bricks.  Suddenly the 
etc-glusterfs-glusterd.vol.log file was flooded with these messages 
every few seconds.

E [socket.c:2080:socket_connect] 0-management: connection attempt failed 
(Connection refused)

One of the clients then reported errors like the following.

[2012-02-23 11:19:22.922785] E [afr-common.c:3164:afr_notify] 
2-atmos-replicate-3: All subvolumes are down. Going offline until 
atleast one of them comes back up.
[2012-02-23 11:19:22.923682] I [dht-layout.c:581:dht_layout_normalize] 
0-atmos-dht: found anomalies in /. holes=1 overlaps=0
[2012-02-23 11:19:22.923714] I 
[dht-selfheal.c:569:dht_selfheal_directory] 0-atmos-dht: 1 subvolumes 
down -- not fixing

[2012-02-23 11:19:22.941468] W 
[socket.c:1494:__socket_proto_state_machine] 1-atmos-client-7: reading 
from socket failed. Error (Transport endpoint is not connected), peer 
(192.171.166.89:24019)
[2012-02-23 11:19:22.972307] I [client.c:1883:client_rpc_notify] 
1-atmos-client-7: disconnected
[2012-02-23 11:19:22.972352] E [afr-common.c:3164:afr_notify] 
1-atmos-replicate-3: All subvolumes are down. Going offline until 
atleast one of them comes back up.
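
To find out when these events happened, a client log can be scanned for the afr_notify message shown above. This is only a rough sketch: it assumes each message is on a single line in the real log file (the excerpts here are wrapped by the mail client), and the helper name find_offline_events is made up for illustration.

```python
import re

# Matches lines such as:
# [2012-02-23 11:19:22.922785] E [afr-common.c:3164:afr_notify] 2-atmos-replicate-3: All subvolumes are down. ...
OFFLINE_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+\] E "
    r"\[afr-common\.c:\d+:afr_notify\] (\S+): All subvolumes are down"
)

def find_offline_events(lines):
    """Return (timestamp, replicate subvolume) for each 'All subvolumes are down' line."""
    events = []
    for line in lines:
        match = OFFLINE_RE.search(line)
        if match:
            events.append((match.group(1), match.group(2)))
    return events
```

Passing an open log file to find_offline_events gives a list of timestamps and replicate subvolume names, which can then be compared with the times at which glusterd started logging "Connection refused".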

The servers causing trouble were still showing as Connected in "gluster 
peer status" and nothing appeared to be wrong except for glusterd 
misbehaving.  Restarting glusterd solved the problem, but given that 
this has happened twice this week already I am worried that it could 
happen again at any time.  Do you know what might be causing glusterd to 
stop responding like this?

Regards
Dan.

On 02/22/2012 08:00 PM, gluster-users-request at gluster.org wrote:
> Date: Wed, 22 Feb 2012 10:32:31 -0500
> From: Jeff Darcy <jdarcy at redhat.com>
> Subject: Re: [Gluster-users] "mismatching layouts" errors after
> 	expanding volume
> To: gluster-users at gluster.org
> Message-ID: <4F450A8F.6070809 at redhat.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Following up on the previous reply...
>
> On 02/22/2012 02:52 AM, Dan Bretherton wrote:
>> >  [2012-02-16 22:59:42.504907] I
>> >  [dht-layout.c:682:dht_layout_dir_mismatch] 0-atmos-dht: subvol:
>> >  atmos-replicate-0; inode layout - 0 - 0; disk layout - 920350134 - 1227133511
>> >  [2012-02-16 22:59:42.534399] I [dht-common.c:524:dht_revalidate_cbk]
>> >  0-atmos-dht: mismatching layouts for /users/rle/TRACKTEMP/TRACKS
> On 02/22/2012 09:19 AM, Jeff Darcy wrote:
>> >  OTOH, the log entries below do seem to indicate that there's something going on
>> >  that I don't understand.  I'll dig a bit, and let you know if I find anything
>> >  to change my mind wrt the safety of restoring write access.
> The two messages above are paired, in the sense that the second is inevitable
> after the first. The "disk layout" range shown in the first is exactly what I
> would expect for subvolume 3 out of 0-13. That means the trusted.glusterfs.dht
> value on disk seems reasonable. The corresponding in-memory "inode layout"
> entry has the less reasonable value of all zero. That probably means we failed
> to fetch the xattr at some point in the past. There might be something earlier
> in your logs - perhaps a message about "holes" and/or one specifically
> mentioning that subvolume - to explain why.
>
> The good news is that this should be self-repairing. Once we get these
> messages, we try to re-fetch the layout information from all subvolumes. If
> *that* failed, we'd see more messages than those above. Since the on-disk
> values seem OK and revalidation seems to be succeeding, I would say these
> messages probably represent successful attempts to recover from a transient
> brick failure, and that does *not* change what I said previously.
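
As a check on the "subvolume 3 out of 0-13" remark, the quoted disk layout range (920350134 - 1227133511) can be reproduced with a little arithmetic. The sketch below assumes the 32-bit hash space is simply divided into equal chunks, one per subvolume. That assumption matches the numbers in the log, but it is not taken from the GlusterFS source, and it ignores how any remainder at the top of the range is assigned.

```python
HASH_SPACE = 2 ** 32  # assumed size of the DHT hash space (32-bit hashes)

def layout_range(index, count):
    """Return the (start, end) hash range for subvolume `index` out of `count`,
    assuming the hash space is split into equal chunks."""
    chunk = HASH_SPACE // count
    start = index * chunk
    end = start + chunk - 1
    return start, end

print(layout_range(3, 14))  # (920350134, 1227133511)
```

The result for index 3 of 14 is the same range as the "disk layout" in the log message, which is consistent with the on-disk trusted.glusterfs.dht value being reasonable.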