[Gluster-users] mkdir produces stale file handles

Stefan Solbrig stefan.solbrig at ur.de
Thu Sep 19 10:03:14 UTC 2019


Thanks for the quick answer!

I think I can reduce data on the "full" bricks, solving the problem temporarily.

The thing is that the behavior changed from 3.12 to 6.5: 3.12 didn't have problems with almost-full bricks, so I thought everything was fine. Then, after the upgrade, I ran into this problem. This might be a corner case that will go away once no one uses 3.12 any more.

But I think I can reproduce the error with 6.5 alone. Suppose a brick is 99% full, so a write() will still succeed. After that write, the brick can be 100% full, so a subsequent mkdir() produces stale file handles (i.e., bricks that have different directory trees).  The odd thing is that the mkdir() does not return an error on the client side.  Clearly, no one should ever let a file system get to 99%, but mkdir should still fail in that case...
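
Just to illustrate what I mean by the percent calculation, here is a minimal standalone sketch (plain statvfs(), not GlusterFS code; the brick path is only an example taken from the listing in my original mail below):

#include <stdio.h>
#include <sys/statvfs.h>

int main(void)
{
    const char *brick = "/gl/lvosb03pool01vd01";  /* example: one of the almost-full bricks */
    struct statvfs vfs;

    if (statvfs(brick, &vfs) != 0) {
        perror("statvfs");
        return 1;
    }

    /* total used and available bytes, as df sees them */
    unsigned long long frsize = vfs.f_frsize;
    unsigned long long used   = ((unsigned long long)vfs.f_blocks - vfs.f_bfree) * frsize;
    unsigned long long avail  = (unsigned long long)vfs.f_bavail * frsize;

    /* df-style rounding: percent used = ceil(used / (used + avail) * 100).
     * With ~30T used and only a few GB available this comes out as 100%,
     * so a threshold check expressed in percent treats the brick as
     * completely full even though gigabytes are still free. */
    unsigned long long pct = (used * 100 + used + avail - 1) / (used + avail);

    printf("%s: %llu GiB available, %llu%% used\n", brick, avail >> 30, pct);
    return 0;
}

On a 30T brick, even 40G of free space is only about 0.13% of the capacity, so any percentage rounded to whole numbers comes out as 100% used; a threshold in percent cannot distinguish "a few GB free" from "really full".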

What remains: is there a recommended way to deal with the situation where some bricks don't have all directories?

best wishes,
Stefan

> That seems to be a gluster control.
> 
> Still, for me the issue is quite obvious - you are at 100% (or almost)  storage and you should rebalance your VMs.
> 
> Can you migrate data from the storage whose gluster volumes are at 100% to another storage whose gluster volumes are not as full (for example, those at 88%)?
> 
> Best Regards,
> Strahil Nikolov
> 
> On Sep 19, 2019 11:43, Stefan Solbrig <stefan.solbrig at ur.de> wrote:
> Dear all,
> 
> I have a situation where "mkdir" on a client produces stale file handles.
> This happened after upgrading from 3.12 to 6.5
> 
> I believe I found the reason for it:
> 6.5 (but not 3.12) checks whether there is space left on the device before doing a "mkdir", and it calculates the "fullness" in percent.   In my situation I have bricks that appear 100% full although there is plenty of space left on the device (several GBytes, see listing below).  In this situation, the "mkdir" is not performed on the bricks that are 100% full, yet the "mkdir" succeeds from the user's perspective.  Doing an "ls" on the newly created directory then leads to the message "stale file handle".
> 
> I believe the call sequence is more or less this:
> 
> server-rpc-fops.c:539:server_mkdir_cbk
> server-rpc-fops.c:2666:server_mkdir_resume
> server-rpc-fops.c:5242:server3_3_mkdir
> posix-entry-ops.c:625:posix_mkdir
> posix-helpers.c:2271
> 
> My questions are:
> * is it meant to operate in this way?
> * is there a built-in way to fix the inconsistent directories? 
> (I tried creating the missing directories on the bricks by hand, which seemed to fix the issue, but I'm not sure if this will introduce other problems.)
> 
> 
> The obvious (good) fix would be to redistribute the data so that the 100% full bricks have enough free space again. However, if a user writes a really large file, the problem can recur at any time... 
> 
> best wishes,
> Stefan
> 
> 
> PS:
> File system listing.  Each file system is served as a brick, in a distribute-only system.
> 
> Filesystem                                       Size  Used Avail Use% Mounted on
> /dev/mapper/vgosb03pool06vd03-lvosb03pool06vd03   30T   27T  3.8T  88% /gl/lvosb03pool06vd03
> /dev/mapper/vgosb03pool06vd02-lvosb03pool06vd02   30T   27T  3.8T  88% /gl/lvosb03pool06vd02
> /dev/mapper/vgosb03pool06vd01-lvosb03pool06vd01   30T   27T  3.7T  88% /gl/lvosb03pool06vd01
> /dev/mapper/vgosb03pool01vd01-lvosb03pool01vd01   30T   30T  7.8G 100% /gl/lvosb03pool01vd01
> /dev/mapper/vgosb03pool01vd02-lvosb03pool01vd02   30T   30T   41G 100% /gl/lvosb03pool01vd02
> /dev/mapper/vgosb03pool01vd03-lvosb03pool01vd03   30T   29T  1.5T  96% /gl/lvosb03pool01vd03
> /dev/mapper/vgosb03pool01vd04-lvosb03pool01vd04   30T   30T   17G 100% /gl/lvosb03pool01vd04
> /dev/mapper/vgosb03pool02vd01-lvosb03pool02vd01   30T   30T   57G 100% /gl/lvosb03pool02vd01
> /dev/mapper/vgosb03pool02vd02-lvosb03pool02vd02   30T   30T   29G 100% /gl/lvosb03pool02vd02
> /dev/mapper/vgosb03pool02vd03-lvosb03pool02vd03   30T   30T   26G 100% /gl/lvosb03pool02vd03
> /dev/mapper/vgosb03pool02vd04-lvosb03pool02vd04   31T   31T  9.7G 100% /gl/lvosb03pool02vd04
> /dev/mapper/vgosb03pool03vd01-lvosb03pool03vd01   30T   30T   93G 100% /gl/lvosb03pool03vd01
> /dev/mapper/vgosb03pool03vd02-lvosb03pool03vd02   30T   30T   23G 100% /gl/lvosb03pool03vd02
> /dev/mapper/vgosb03pool03vd03-lvosb03pool03vd03   30T   30T  163G 100% /gl/lvosb03pool03vd03
> /dev/mapper/vgosb03pool03vd04-lvosb03pool03vd04   31T   30T  384G  99% /gl/lvosb03pool03vd04
> /dev/mapper/vgosb03pool04vd01-lvosb03pool04vd01   30T   29T  1.1T  97% /gl/lvosb03pool04vd01
> /dev/mapper/vgosb03pool04vd02-lvosb03pool04vd02   30T   27T  3.9T  88% /gl/lvosb03pool04vd02
> /dev/mapper/vgosb03pool04vd03-lvosb03pool04vd03   30T   29T  1.9T  94% /gl/lvosb03pool04vd03
> /dev/mapper/vgosb03pool04vd04-lvosb03pool04vd04   31T   29T  1.9T  94% /gl/lvosb03pool04vd04
> /dev/mapper/vgosb03pool05vd01-lvosb03pool05vd01   30T   28T  2.3T  93% /gl/lvosb03pool05vd01
> /dev/mapper/vgosb03pool05vd02-lvosb03pool05vd02   30T   27T  3.9T  88% /gl/lvosb03pool05vd02
> /dev/mapper/vgosb03pool05vd03-lvosb03pool05vd03   30T   27T  3.9T  88% /gl/lvosb03pool05vd03
> /dev/mapper/vgosb03pool05vd04-lvosb03pool05vd04   31T   27T  3.9T  88% /gl/lvosb03pool05vd04
> 
> 
> -- 
> Dr. Stefan Solbrig
> Universität Regensburg, Fakultät für Physik,
> 93040 Regensburg, Germany
> Tel +49-941-943-2097
> 
> 
