[Gluster-devel] Problems with graph switch in disperse

Fri Jan 2 15:29:28 UTC 2015

On 02.01.2015 05:45, Raghavendra G wrote: 

> On Wed, Dec 31, 2014 at 11:25 PM, Xavier Hernandez <xhernandez at datalab.es [2]> wrote:
> 
>> On 27.12.2014 13:43, lidi at perabytes.com [1] wrote:
>> 
>>> I tracked this problem, and found that loc.parent and loc.pargfid are
>>> both null in the call sequence below:
>>>
>>> ec_manager_writev() -> ec_get_size_version() -> ec_lookup(). This can
>>> cause server_resolve() to return EINVAL.
>>>
>>> A replace-brick causes all opened fds and the inode table to be
>>> recreated, but ec_lookup() gets the loc from fd->_ctx.
>>>
>>> So loc.parent and loc.pargfid are missing once the fd has changed.
>>> Other xlators always do a lookup from the root directory, so they never
>>> hit this problem. It seems that a recursive lookup from the root
>>> directory may address this issue.
>> 
>> The EINVAL error is returned by protocol/server when it tries to
>> resolve an inode based on a loc. If loc's 'name' field is neither NULL
>> nor empty, it tries to resolve the inode based on /. The problem here is
>> that pargfid is 00...00.
>>
>> To solve this issue I've modified ec_loc_setup_parent() so that it
>> clears loc's 'name' if the parent inode cannot be determined. This
>> forces protocol/server to resolve the inode based on , which is valid
>> and can be resolved successfully.
>>
>> However, this doesn't fully solve the bug. After solving this issue, I
>> get an EIO error. Further investigation seems to indicate that this is
>> caused by a locking problem, which in turn comes from incorrect
>> handling of ESTALE when the brick is replaced.
> 
> ESTALE indicates either of the following situations:
>
> 1. In the case of named-lookup (loc containing /), is not present. Which
>    means parent is not present on the brick.
> 2. In the case of nameless lookup (loc containing only of the file),
>    the file/directory represented by gfid is not present on the brick.
>
> Which among the above two scenarios is your case?
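
The distinction between the two lookup kinds, and the effect of clearing
'name' when the parent is unknown, can be sketched roughly as below. This
is only an illustration under stated assumptions: sketch_loc_t is a
hypothetical, simplified stand-in for the real loc structure, and none of
the sketch_* functions exist in GlusterFS. It assumes a named lookup needs
a non-null parent gfid plus a name, and a nameless lookup needs only the
file's own gfid.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a loc; NOT the real GlusterFS type. */
typedef struct {
    unsigned char gfid[16];    /* gfid of the file itself */
    unsigned char pargfid[16]; /* gfid of the parent directory */
    const char *name;          /* entry name inside the parent, or NULL */
} sketch_loc_t;

typedef enum {
    SKETCH_RESOLVE_INVALID,  /* cannot be resolved (the EINVAL case) */
    SKETCH_RESOLVE_NAMED,    /* resolve by parent gfid + name */
    SKETCH_RESOLVE_NAMELESS  /* resolve by the file's own gfid */
} sketch_resolve_t;

/* True when all 16 bytes are zero, i.e. the gfid is 00...00. */
bool sketch_gfid_is_null(const unsigned char gfid[16])
{
    static const unsigned char zero[16] = {0};
    return memcmp(gfid, zero, sizeof(zero)) == 0;
}

/* Assumed rule: a non-empty name selects a named lookup, which is only
 * valid when the parent gfid is known; otherwise fall back to the gfid. */
sketch_resolve_t sketch_resolve_type(const sketch_loc_t *loc)
{
    if (loc->name != NULL && loc->name[0] != '\0') {
        return sketch_gfid_is_null(loc->pargfid) ? SKETCH_RESOLVE_INVALID
                                                 : SKETCH_RESOLVE_NAMED;
    }
    return sketch_gfid_is_null(loc->gfid) ? SKETCH_RESOLVE_INVALID
                                          : SKETCH_RESOLVE_NAMELESS;
}

/* Illustrates the idea of clearing 'name' when the parent is unknown,
 * so that the loc is resolved as a nameless lookup instead. */
void sketch_loc_drop_name_if_orphan(sketch_loc_t *loc)
{
    if (sketch_gfid_is_null(loc->pargfid)) {
        loc->name = NULL;
    }
}
```

With a loc that has a valid gfid, a null pargfid and a name set,
sketch_resolve_type() reports the invalid case; after
sketch_loc_drop_name_if_orphan() the same loc is classified as a nameless
lookup.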

In this particular case, the problem is the second scenario; however,
there are other combinations that could lead to the first one. Basically,
the root cause is that after replacing a brick, the new brick is totally
empty, so self-heal needs to recover the directory contents, but some
running operations may try to use an already resolved gfid that the new
brick has never seen. In these cases the brick returns ESTALE, but ec
incorrectly handled this as a fatal error while trying to acquire a lock,
returning EIO for the whole operation.
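
As a rough illustration of the intended behaviour (not the actual patch),
the per-brick lock answers could be combined along these lines. Everything
here is an assumption made for the sketch: the function name, the
'min_ok' threshold standing for the number of bricks the operation needs,
and the choice to treat any failing brick as simply not participating.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper, not part of GlusterFS.
 *
 * results[i] is 0 if brick i granted the lock, or an errno value if not.
 * A brick answering ESTALE (e.g. a freshly replaced, still empty brick)
 * is counted in *stale_out and left out of the operation instead of
 * failing it. The whole operation fails with EIO only when fewer than
 * min_ok bricks granted the lock. */
int sketch_combine_lock_results(const int *results, size_t count,
                                size_t min_ok, size_t *stale_out)
{
    size_t ok = 0;
    size_t stale = 0;

    for (size_t i = 0; i < count; i++) {
        if (results[i] == 0) {
            ok++;
        } else if (results[i] == ESTALE) {
            stale++; /* candidate for self-heal, not a fatal error */
        }
        /* Any other error: the brick just does not take part. */
    }

    if (stale_out != NULL) {
        *stale_out = stale;
    }
    return (ok >= min_ok) ? 0 : EIO;
}
```

For example, with three bricks answering {0, 0, ESTALE} and min_ok set to
2, the sketch returns 0 and reports one stale brick; with
{0, ESTALE, ESTALE} it returns EIO.
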
I'll upload a patch to solve this problem.

Xavi

Links:
------
[1] mailto:lidi at perabytes.com
[2] mailto:xhernandez at datalab.es