[Bugs] [Bug 1734251] New: Files inaccessible if one rebalance process is killed in a multinode volume
bugzilla at redhat.com
Tue Jul 30 04:56:58 UTC 2019
Bug ID: 1734251
Summary: Files inaccessible if one rebalance process is killed
in a multinode volume
Assignee: bugs at gluster.org
Reporter: nbalacha at redhat.com
CC: atumball at redhat.com, bugs at gluster.org
Depends On: 1711764
Target Milestone: ---
+++ This bug was initially created as a clone of Bug #1711764 +++
Description of problem:
This is a consequence of https://review.gluster.org/#/c/glusterfs/+/17239/ and
lookup-optimize being enabled.
Rebalance directory processing steps on each node:
1. Set new layout on directory without the commit hash
2. List files on that local subvol. Migrate those files which fall into its
bucket. A lookup is performed on a file only if the process determines that it
must migrate that file.
3. When done, update the layout on the local subvol with the layout containing
the commit hash.
When there are multiple rebalance processes processing the same directory, they
finish at different times and one process can update the layout with the commit
hash before the others are done listing and migrating their files.
Clients will therefore see a complete layout even before all files have been
looked up according to the new layout, causing file access to fail.
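
The race can be illustrated with a small toy model. This is only a sketch and
not GlusterFS code: the Directory class, the subvol names, and the single
"committed" flag are hypothetical simplifications of the per-subvol layout and
commit hash described above.

```python
# Toy model of the race; every name here is hypothetical.


class Directory:
    def __init__(self, location, hashed):
        self.location = dict(location)  # file name -> subvol holding the data
        self.hashed = dict(hashed)      # file name -> subvol in the new layout
        self.committed = False          # True once the commit hash is set


def rebalance_process(directory, my_files):
    """Steps 2 and 3 for one process: migrate its bucket, then commit."""
    for name in my_files:
        directory.location[name] = directory.hashed[name]
    directory.committed = True


def client_lookup(directory, name):
    """Simplified client lookup with lookup-optimize enabled."""
    if directory.location[name] == directory.hashed[name]:
        return directory.hashed[name]
    if directory.committed:
        # The layout looks complete, so other subvols are not searched.
        return None
    # Layout not committed yet: fall back to searching every subvol.
    return directory.location[name]


d = Directory(
    location={"f1": "subvol-0", "f2": "subvol-1"},
    hashed={"f1": "subvol-2", "f2": "subvol-2"},
)
rebalance_process(d, my_files=["f1"])  # first process finishes and commits
# The second process, responsible for f2, is killed before it runs.
print(client_lookup(d, "f1"))  # subvol-2
print(client_lookup(d, "f2"))  # None
```

In this model f2 is still on disk, but the client no longer searches for it
once the commit hash is in place.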
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Create a 2x2 volume spanning 2 nodes. Create some directories and files on
2. Add 2 bricks to convert it to a 3x2 volume.
3. Start a rebalance on the volume and use gdb to break into one rebalance
process before it starts processing the directories.
4. Allow the second rebalance process to complete. Kill the process that is
blocked by gdb.
5. Mount the volume and try to stat the files without listing the directories.
The stat will fail for several files with the error:
stat: cannot stat ‘<filename>’: No such file or directory
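
Step 5 can be scripted along these lines. This is a minimal sketch: the
helper name find_inaccessible is hypothetical, and the list of file names is
assumed to have been recorded before the rebalance, since listing the
directory on the mount would defeat the test.

```python
import os


def find_inaccessible(mount_dir, names):
    """Return the names under mount_dir whose stat fails with ENOENT."""
    missing = []
    for name in names:
        try:
            # stat by name only; the directory itself is never listed
            os.stat(os.path.join(mount_dir, name))
        except FileNotFoundError:
            missing.append(name)
    return missing
```

Calling it with the mount point and the recorded names returns the files that
hit the error shown above.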
--- Additional comment from Nithya Balachandran on 2019-05-20 05:05:30 UTC ---
The easiest solution is to have each node do the file lookups before the call
Cons: This will introduce more lookups, but the number is pretty much the same
as that seen before https://review.gluster.org/#/c/glusterfs/+/17239/
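
The trade-off in this proposal can be sketched with a toy model. It is not
GlusterFS code, and it rests on one labelled assumption: that a lookup on a
file stored away from its hashed subvol leaves a pointer at the hashed subvol
which later client lookups can follow. All class and function names are
hypothetical.

```python
# Toy comparison of "lookup only migrated files" vs "lookup all files".


class Directory:
    def __init__(self, location, hashed):
        self.location = dict(location)  # file name -> subvol holding the data
        self.hashed = dict(hashed)      # file name -> subvol in the new layout
        self.pointers = {}              # assumed side effect of a lookup
        self.committed = False          # True once the commit hash is set
        self.lookups = 0                # lookups issued by rebalance


def lookup(directory, name):
    directory.lookups += 1
    if directory.location[name] != directory.hashed[name]:
        directory.pointers[name] = directory.location[name]


def rebalance_process(directory, listed_files, my_files, lookup_all):
    for name in listed_files:
        if lookup_all or name in my_files:
            lookup(directory, name)
        if name in my_files:
            # Migrate the file; the pointer is no longer needed.
            directory.location[name] = directory.hashed[name]
            directory.pointers.pop(name, None)
    directory.committed = True


def client_lookup(directory, name):
    if directory.location[name] == directory.hashed[name]:
        return directory.hashed[name]
    if name in directory.pointers:
        return directory.pointers[name]
    if directory.committed:
        return None  # layout looks complete, no search of other subvols
    return directory.location[name]


def run(lookup_all):
    """One process completes; the process responsible for f2 never runs."""
    d = Directory(
        location={"f1": "subvol-0", "f2": "subvol-1"},
        hashed={"f1": "subvol-2", "f2": "subvol-2"},
    )
    rebalance_process(d, ["f1", "f2"], {"f1"}, lookup_all)
    return client_lookup(d, "f2"), d.lookups


print(run(lookup_all=False))  # (None, 1)
print(run(lookup_all=True))   # ('subvol-1', 2)
```

Under that assumption, looking up every listed file costs one extra lookup
here but keeps f2 reachable even though its rebalance process never ran.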
--- Additional comment from Worker Ant on 2019-05-20 10:01:20 UTC ---
REVIEW: https://review.gluster.org/22746 (cluster/dht: Lookup all files when
processing directory) posted (#1) for review on master by N Balachandran
[Bug 1711764] Files inaccessible if one rebalance process is killed in a