[Gluster-users] Alert: GlusterFS 3.2.2 Release for GFID Mismatch

John Mark Walker jwalker at gluster.com
Thu Jul 14 17:36:20 UTC 2011


GlusterFS Alert -

Problem: GFID Mismatch

Severity: 7 (out of 10) - Loss of service but ultimately no loss of data

PREVENTION: To *prevent* the issue, please install GlusterFS 3.2.2. If you're using 3.1.x, upgrade to 3.1.5.
Download 3.2.2 here: http://download.gluster.com/pub/gluster/glusterfs/LATEST/
Download 3.1.5 here: http://download.gluster.com/pub/gluster/glusterfs/3.1/LATEST/

FIX:
To check for mismatched GFIDs, please review your client logs and grep for the words:
"gfid different" or "gfid differs"

If you see either of these messages, simply upgrading will not fix the problem. You will need to use our tools here: https://github.com/vikasgorur/gfid
See details below for instructions.
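
As a rough sketch of the log check above: the helper below lists every log file that contains either message. The function name is made up for illustration, and /var/log/glusterfs is only an assumed default location for client logs; pass your own directory if yours differs.

```shell
# List log files that mention a GFID mismatch.
# Usage: find_gfid_mismatch_logs [log_dir]
find_gfid_mismatch_logs() {
    # /var/log/glusterfs is an assumed default, not taken from this alert.
    grep -rlE 'gfid (different|differs)' "${1:-/var/log/glusterfs}"
}

# Example (run on each client):
#   find_gfid_mismatch_logs /var/log/glusterfs
```

If the command prints nothing, none of the logs in that directory contain either message.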


DETAILS:
Over the last 3 weeks we have seen a growing number of GlusterFS implementations experiencing an issue where mismatched GFIDs are appearing within the filesystem.

Each file/directory on a Gluster volume has a unique 128-bit number associated with it called the GFID. This is true regardless of Gluster configuration (distribute or distribute/replicate). One inode, one GFID. The GFID is stored on the backend as the value of the extended attribute "trusted.gfid". Under normal circumstances, the value of this attribute is the same on all the backend bricks. However, certain conditions can cause the value on one or more of the bricks to differ from that on the other bricks. This causes the GlusterFS client to become confused and throw errors.

The problem affects versions 3.1.4 and 3.2.1 and the earlier releases in those series. It can occur whether the volume is accessed through the native GlusterFS client, NFS, or CIFS.
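
To look at the attribute by hand, one option is getfattr from the attr package. This is not part of the gfid tools, and both helper names below are made up for illustration. It is only a sketch: it has to run as root on the brick servers (the trusted.* namespace is not readable otherwise), and the paths in the example are placeholders.

```shell
# Print the hex-encoded trusted.gfid of one backend path.
# Assumes getfattr (attr package) is installed and you are root.
gfid_of() {
    getfattr -n trusted.gfid -e hex --absolute-names "$1" 2>/dev/null |
        sed -n 's/^trusted\.gfid=//p'
}

# Succeed only if every value passed in is identical.
all_same() {
    [ "$#" -gt 0 ] || return 1
    first=$1
    for v in "$@"; do
        [ "$v" = "$first" ] || return 1
    done
}

# Example, with placeholder paths for the same file on two bricks:
#   all_same "$(gfid_of /export/brick1/some/file)" \
#            "$(gfid_of /export/brick2/some/file)" || echo "GFID mismatch"
```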

PREVENTION:
To prevent this issue from occurring, please upgrade immediately to 3.1.5 or 3.2.2. Upgrading will not correct the issue if it is already present in your cluster.

FIX:
***IMPORTANT***
To check for mismatched GFIDs, please review your client logs and grep for the words:
"gfid different" or "gfid differs"

If you see either of these messages, simply upgrading will not fix the problem. You will need to download the tools here: https://github.com/vikasgorur/gfid

Follow the instructions in the README:
https://github.com/vikasgorur/gfid/blob/master/README

Here's the quick-start version:


1. The first step is to construct the master list of all files:

# cd /export/brick1
# find . > brick1.txt
... (do for all bricks)

# cat brick1.txt brick2.txt... | sort -u > master_list.txt
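
If several bricks are reachable from one host, step 1 could be scripted roughly as follows. The function name and the output directory are hypothetical, and the sketch assumes each brick directory has a distinct final path component. It writes the listings to a directory outside the bricks so that no stray .txt files end up on the backend.

```shell
# Build per-brick listings plus a de-duplicated master list.
# Usage: build_master_list OUT_DIR BRICK_DIR...
# OUT_DIR must exist and must not be inside any brick.
build_master_list() {
    out_dir=$1
    shift
    for brick in "$@"; do
        name=$(basename "$brick")
        # Same listing as "cd BRICK; find ." in step 1.
        (cd "$brick" && find .) > "$out_dir/$name.txt"
        cat "$out_dir/$name.txt"
    done | sort -u > "$out_dir/master_list.txt"
}

# Example:
#   build_master_list /root/gfid-work /export/brick1 /export/brick2
```

When the bricks live on different servers, run the find on each server and copy the listings to one place before merging them.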

2. Then we need to get the gfid's of all the inodes from these bricks:

# cd /export/brick1
# gfid-list /path/to/master_list.txt > brick1.gfid
... (do for all bricks)


3. Identify the mismatched inodes:

# gfid-mismatch brick1.gfid brick2.gfid brick3.gfid brick4.gfid

4. Stop the volume, then delete the mismatched gfid's:

# gluster volume stop <affected volume>
# gfid-mismatch brick1.gfid brick2.gfid brick3.gfid brick4.gfid | cut -f1 -d: > mismatched.txt
# cd /export/brick1
# gfid-delete /path/to/mismatched.txt

Repeat for the other bricks.
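
For bricks that sit on the same server, the repetition could be wrapped in a small loop like the one below. The function name is made up; the gfid-delete invocation is the same one shown in step 4. The volume must already be stopped, and the list must be given as an absolute path because the loop changes directory into each brick.

```shell
# Run gfid-delete from inside each brick directory.
# Usage: delete_gfids_on_bricks /abs/path/to/mismatched.txt BRICK_DIR...
delete_gfids_on_bricks() {
    list=$1
    shift
    for brick in "$@"; do
        echo "== $brick =="
        # Stop at the first brick where cd or gfid-delete fails.
        (cd "$brick" && gfid-delete "$list") || return 1
    done
}

# Example:
#   delete_gfids_on_bricks /path/to/mismatched.txt /export/brick1 /export/brick2
```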

5. Check logs

'gfid-delete' will produce a log with one entry for each file, which is either:

usr/bin/factor: removed OK
        OR
usr/bin/vim: No such file or directory
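
On a large brick the log can be long. A minimal sketch for summarizing it, assuming you have captured the gfid-delete output in a file (the name gfid-delete.log below is hypothetical) and that entries look like the two examples above:

```shell
# Read a gfid-delete log on stdin; print every entry that is not
# "removed OK", followed by a one-line count of both kinds.
summarize_gfid_delete_log() {
    awk '
        /: removed OK$/ { ok++; next }
        NF              { failed++; print "not removed: " $0 }
        END             { printf "removed: %d, not removed: %d\n", ok, failed }
    '
}

# Example:
#   summarize_gfid_delete_log < gfid-delete.log
```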

IMPORTANT NOTE: The deletion of gfid's must be done ONLY ON A STOPPED VOLUME.
Deleting the gfid's on a running volume with mounted clients will cause more
problems instead of solving them.

Please feel free to contact me directly with any questions.
