[Gluster-devel] EAGAIN/EBUSY handling in glusterfs

Shishir Gowda sgowda at redhat.com
Wed Jan 23 09:34:30 UTC 2013

Hi Avati,

One possible scenario is someone taking an LVM snapshot of the backend.

A few examples:
DHT's rebalance: we would not retry a migration in case we got an EAGAIN or even an EINTR error.
Does self-heal retry healing if the error was EAGAIN or EINTR?

These are just a few I can think of.
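
To make the rebalance/self-heal question concrete, here is a rough sketch of what a bounded retry around an internally triggered operation could look like. It is written in Python only for brevity (glusterfs itself is C); run_with_retry, RETRYABLE, and the backoff values are hypothetical names and numbers chosen for illustration, not anything that exists in glusterfs.

```python
import errno
import time

# Assumption: only the errors named in this mail are treated as transient.
RETRYABLE = {errno.EAGAIN, errno.EINTR}


def run_with_retry(op, max_attempts=5, delay=0.1, sleep=time.sleep):
    """Call op() until it succeeds, hits a non-retryable error,
    or the attempt budget is exhausted (hypothetical helper)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return op()
        except OSError as e:
            # Give up on non-transient errors or on the last attempt.
            if e.errno not in RETRYABLE or attempt == max_attempts:
                raise
            sleep(delay * attempt)  # simple linear backoff between attempts
```

Whether a bounded retry like this is enough, or whether the operation should instead be left in a state from which it can be resumed later, is exactly the open question.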

When the snap feature becomes supported (see the wiki link in my earlier mail, quoted below), a few ops would be blocked while a snap is in progress.

If we decide to provide a complete snap in the future (not just a crash-consistent one), then in all probability all fops will be blocked.

Do we guarantee that all internally triggered ops that fail will be re-triggered? Or do we guarantee a state from which we can recover completely?

With regards,

----- Original Message -----
From: "Anand Avati" <anand.avati at gmail.com>
To: "Shishir Gowda" <sgowda at redhat.com>
Cc: gluster-devel at nongnu.org
Sent: Wednesday, January 23, 2013 1:23:09 PM
Subject: Re: [Gluster-devel] EAGAIN/EBUSY handling in glusterfs

On Tue, Jan 22, 2013 at 10:39 PM, Shishir Gowda <sgowda at redhat.com> wrote:

Hi All, 

Currently I see that almost all the xlators in glusterfs do not handle EAGAIN/EBUSY errors. 

Though this should be handled by the applications, 

If by "handle by application" you meant "handled by the application retrying the syscall", that is not completely true. It is generally true for EINTR, and in some places for EAGAIN (i.e., when used on non-blocking pollable file descriptors like sockets - which specifically does NOT include filesystems for regular read/write). EBUSY almost never suggests a poll/retry to the application.
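
As a rough illustration of that distinction, the sketch below classifies an errno by whether simply re-issuing the call is a reasonable response. It is in Python only for brevity; should_retry and the pollable_fd flag are hypothetical names, not glusterfs APIs, and the rules encode only the generalization above.

```python
import errno


def should_retry(err: int, pollable_fd: bool) -> bool:
    """Return True if re-issuing the call is a reasonable response
    to this errno (hypothetical helper)."""
    if err == errno.EINTR:
        # Interrupted by a signal: generally fine to re-issue the call.
        return True
    if err in (errno.EAGAIN, errno.EWOULDBLOCK):
        # Only meaningful on non-blocking pollable fds such as sockets,
        # not on regular-file read/write.
        return pollable_fd
    # EBUSY and everything else: do not blindly poll/retry.
    return False
```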

there are multiple paths where the ops are not performed by the applications (but are internal to glusterfs).

A few of these are:
a. Rebalance 
b. Replace brick 
c. Self-heal 
d. lk's 

With the proposed snap feature ( http://www.gluster.org/community/documentation/index.php/Features/snapshot ), would it not be better to identify such ops inside glusterfs?

Can you explain more on that? Why is that necessary? 


Irrespective of the snap feature, I think handling EAGAIN/EBUSY in these code paths is a matter of correctness.

Please comment. 

With regards, 

