[Gluster-users] "no gfid found" errors stall fix-layout
Pranith Kumar K
pranithk at gluster.com
Tue Jan 24 02:53:30 UTC 2012
On 01/23/2012 10:47 PM, Dan Bretherton wrote:
>
> On 01/18/2012 02:24 AM, Pranith Kumar K wrote:
>> On 01/17/2012 05:54 PM, Dan Bretherton wrote:
>>> Dear All-
>>> I have been having problems with rebalance ... self-heal again with
>>> Glusterfs version 3.2.5, this time related to "no gfid found"
>>> errors. A fix-layout operation has stalled because errors like the
>>> following are being reported for a large number of files.
>>>
>>> [2012-01-17 10:48:02.138837] W
>>> [fuse-resolve.c:273:fuse_resolve_deep_cbk] 0-fuse:
>>> /users/mvc/WORK/ORCA1/ORCA1-MV01-DIMGPROC/RUNTMP_Exp61/ORCA1-MV01_2D_y2007m01d05.dimgproc.020:
>>> no gfid found
>>>
>>> I thought GFID errors were being fixed in version 3.2.5. How can I
>>> fix these errors to allow rebalance...fix-layout to run normally? I
>>> am also very worried that the lack of GFID entries for files and
>>> directories could stop file replication and other GlusterFS
>>> operations from working properly. All comments and suggestions
>>> would be much appreciated.
>>>
>>> Regards
>>> Dan.
>>>
>> Dan,
>> We would like to reproduce this problem in-house. Could you
>> give more details on how to get into this situation?
>>
>> Pranith
>
> Hello Pranith,
> The errors were probably the result of a server that became
> unresponsive for a few hours and had to be restarted. When the server
> was not responding properly it was still showing as Connected in the
> output of "gluster peer status", but the load was growing quite large
> and it was impossible to log on. I restarted the server and triggered
> a self-heal operation on all the volumes in case any files had not
> been copied correctly to the unresponsive server. Later on I noticed
> some layout related error messages mentioning "anomalies", so I
> started a fix-layout operation to correct them. The fix-layout didn't
> complete the first time because of "no gfid found" errors, as I
> reported to the mailing list. A couple of days later I stopped
> fix-layout and started it again on another server, and that time it
> ran to completion. I then re-ran the self-heal operation and didn't
> find any new layout errors. I don't know if the second fix-layout
> attempt worked because it was performed on a different server, or if
> the "no gfid found" errors had been corrected automatically by
> GlusterFS in the days between the two fix-layout attempts. Either way
> I am very relieved, and I apologise for the false alarm. GlusterFS
> version 3.2.5 does appear to be able to correct GFID errors
> automatically, but it seems this process can take a long time.
>
> Regards
> -Dan.
Dan,
Thanks for the information. Our QE team was able to reproduce the
no-gfid errors bug in the lab, and we are looking into the issue. If a
gfid is not present, GlusterFS will automatically assign one. The issue
is that files should not get into the no-gfid phase at all.
Pranith.
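
For anyone who wants to see how widespread these warnings are before
re-running fix-layout, here is a minimal sketch that pulls the affected
paths out of a client log. It assumes each warning sits on one line in
the format quoted earlier in this thread; the regular expression and the
function names find_no_gfid_paths and count_by_directory are only
illustrative and are not part of GlusterFS.

```python
import re
from collections import Counter

# Pattern modelled on the single log line quoted in this thread (assumption:
# other GlusterFS versions may format their log lines differently).
NO_GFID_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"   # [2012-01-17 10:48:02.138837]
    r"(?P<level>[A-Z])\s+"             # W
    r"\[(?P<source>[^\]]+)\]\s+"       # [fuse-resolve.c:273:fuse_resolve_deep_cbk]
    r"(?P<xlator>\S+):\s+"             # 0-fuse:
    r"(?P<path>.+):\s+no gfid found\s*$"
)


def find_no_gfid_paths(lines):
    """Return the paths named in "no gfid found" warnings, in log order."""
    paths = []
    for line in lines:
        match = NO_GFID_RE.match(line.strip())
        if match:
            paths.append(match.group("path"))
    return paths


def count_by_directory(paths):
    """Count affected paths per parent directory."""
    return Counter(p.rsplit("/", 1)[0] or "/" for p in paths)
```

Passing an open log file to find_no_gfid_paths and the result to
count_by_directory gives a rough per-directory tally, which may help in
deciding where to trigger self-heal first.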