[Gluster-devel] Problems with graph switch in disperse

Xavier Hernandez xhernandez at datalab.es
Wed Dec 31 17:55:58 UTC 2014


 

On 27.12.2014 13:43, lidi at perabytes.com wrote: 

> I tracked this problem, and found that loc.parent and loc.pargfid are
> both null in the call sequence below:
> 
> ec_manager_writev() -> ec_get_size_version() -> ec_lookup(). This can
> cause server_resolve() to return EINVAL.
> 
> A replace-brick causes all opened fds and the inode table to be
> recreated, but ec_lookup() gets the loc from fd->_ctx.
> 
> So loc.parent and loc.pargfid are missing when the fd changes. Other
> xlators always do a lookup from the root directory, so they never hit
> this problem. It seems that a recursive lookup from the root
> directory may address this issue.

The EINVAL error is returned by protocol/server when it tries to
resolve an inode based on a loc. If loc's 'name' field is neither NULL
nor empty, it tries to resolve the inode based on <pargfid>/<name>.
The problem here is that pargfid is 00...00.

To solve this issue I've modified ec_loc_setup_parent() so that it
clears loc's 'name' if the parent inode cannot be determined. This
forces protocol/server to resolve the inode based on <gfid>, which is
valid and can be resolved successfully.

However this doesn't fully solve the bug. After fixing this issue, I
get an EIO error. Further investigation seems to indicate that this is
caused by a locking problem arising from incorrect handling of ESTALE
when the brick is replaced. I'll upload a patch shortly to solve these
issues.

Xavi 

> ----- Original Message -----
> From: Raghavendra Gowdappa 
> Sent: 14-12-24 21:48:56
> To: Xavier Hernandez 
> Cc: Gluster Devel 
> Subject: Re: [Gluster-devel] Problems with graph switch in disperse
> 
> Do you know the origin of the EIO? fuse-bridge only fails a lookup
> fop with EIO (when a NULL gfid is received in a successful lookup
> reply). So, some other xlator might be sending the EIO.
> 
> ----- Original Message -----
> > From: "Xavier Hernandez" 
> > To: "Gluster Devel" 
> > Sent: Wednesday, December 24, 2014 6:25:17 PM
> > Subject: [Gluster-devel] Problems with graph switch in disperse
> > 
> > Hi,
> > 
> > I'm experiencing a problem when the gluster graph is changed as a
> > result of a replace-brick operation (probably with any other
> > operation that changes the graph) while the client is also doing
> > other tasks, like writing a file.
> > 
> > When the operation starts, I see that the replaced brick is
> > disconnected, but writes continue working normally with one brick
> > less.
> > 
> > At some point, another graph is created and comes online. The
> > remaining bricks on the old graph are disconnected and the old
> > graph is destroyed. I see how new write requests are sent to the
> > new graph.
> > 
> > This seems correct. However there's a point where I see this:
> > 
> > [2014-12-24 11:29:58.541130] T [fuse-bridge.c:2305:fuse_write_resume]
> > 0-glusterfs-fuse: 2234: WRITE (0x16dcf3c, size=131072, offset=255721472)
> > [2014-12-24 11:29:58.541156] T [ec-helpers.c:101:ec_trace] 2-ec:
> > WIND(INODELK) 0x7f8921b7a9a4(0x7f8921b78e14) [refs=5, winds=3, jobs=1]
> > frame=0x7f8932e92c38/0x7f8932e9e6b0, min/exp=3/3, err=0 state=1
> > {111:000:000} idx=0
> > [2014-12-24 11:29:58.541292] T [rpc-clnt.c:1384:rpc_clnt_record]
> > 2-patchy-client-0: Auth Info: pid: 0, uid: 0, gid: 0, owner:
> > d025e932897f0000
> > [2014-12-24 11:29:58.541296] T [io-cache.c:133:ioc_inode_flush]
> > 2-patchy-io-cache: locked inode(0x16d2810)
> > [2014-12-24 11:29:58.541354] T
> > [rpc-clnt.c:1241:rpc_clnt_record_build_header] 2-rpc-clnt: Request
> > fraglen 152, payload: 84, rpc hdr: 68
> > [2014-12-24 11:29:58.541408] T [io-cache.c:137:ioc_inode_flush]
> > 2-patchy-io-cache: unlocked inode(0x16d2810)
> > [2014-12-24 11:29:58.541493] T [io-cache.c:133:ioc_inode_flush]
> > 2-patchy-io-cache: locked inode(0x16d2810)
> > [2014-12-24 11:29:58.541536] T [io-cache.c:137:ioc_inode_flush]
> > 2-patchy-io-cache: unlocked inode(0x16d2810)
> > [2014-12-24 11:29:58.541537] T [rpc-clnt.c:1577:rpc_clnt_submit]
> > 2-rpc-clnt: submitted request (XID: 0x17 Program: GlusterFS 3.3,
> > ProgVers: 330, Proc: 29) to rpc-transport (patchy-client-0)
> > [2014-12-24 11:29:58.541646] W [fuse-bridge.c:2271:fuse_writev_cbk]
> > 0-glusterfs-fuse: 2234: WRITE => -1 (Input/output error)
> > 
> > It seems that fuse still has a write request pending for graph 0.
> > It is resumed but it returns EIO without calling the xlator stack
> > (operations seen between the two log messages are from other
> > operations and are sent to graph 2). I'm not sure why this happens
> > or how I should avoid it.
> > 
> > I tried the same scenario with replicate and it seems to work, so
> > there must be something wrong in disperse, but I don't see where
> > the problem could be.
> > 
> > Any ideas ?
> > 
> > Thanks,
> > 
> > Xavi
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel at gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel


