[Gluster-users] Inconsistent md5sum of replicated file

Anthony Delviscio adelviscio at gmail.com
Thu Sep 8 05:23:30 UTC 2011


Pranith, the md5sum mismatch was noticed Tuesday (9/6) morning via an
automated process that verifies file checksum consistency between what is
stored on Gluster and the source.  However, the file was written to Gluster
on 8/31.  We're not sure if something happened between 8/31 and 9/6 that
would make the replicas inconsistent.  It was only after looking at the file
via backend storage on Tuesday that we noticed an inconsistency between the
two replica copies of the file.
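
For reference, a minimal sketch of the kind of check described above: it
computes the md5sum of two copies of a file and, if they differ, reports the
first byte offset at which they diverge.  This is not our actual verification
process; the function names and chunk size are hypothetical.

```python
import hashlib

CHUNK = 1 << 20  # read 1 MiB at a time so large files are not loaded whole


def md5_of(path):
    """Return the hex MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def first_difference(path_a, path_b):
    """Return the byte offset where the files first differ, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a, b = fa.read(CHUNK), fb.read(CHUNK)
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No differing byte in the overlap: one file is a prefix of the other.
                return offset + min(len(a), len(b))
            if not a:
                return None  # both files ended at the same point with no difference
            offset += len(a)
```

Run against the two backend copies of a file, the offset gives a rough idea of
where in the file the replicas diverge, which a bare md5sum mismatch does not.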

What is odd is that the getfattr and stat data of the two replica files are
identical.  However, the copy with the incorrect md5sum also fails to
decompress, which makes that replica unusable.  I'm just curious to know why
Gluster can't see the difference between the two files.  Gzip can't read the
file with the erroneous checksum, yet according to the getfattr and stat data
it is the same file as the good copy.
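
Size, mode, and timestamps from stat describe a file's metadata, not its
bytes, so they can match while the contents differ.  A minimal sketch of the
decompression check, assuming the file is a gzip stream (the function name is
hypothetical):

```python
import gzip
import zlib


def gzip_is_intact(path):
    """Return True if the whole gzip stream decompresses and passes its CRC/length check."""
    try:
        with gzip.open(path, "rb") as fh:
            # Read to the end: the CRC32 and length trailer is only verified at EOF.
            while fh.read(1 << 20):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # OSError covers bad headers and CRC/length mismatches (and unreadable
        # paths); EOFError covers a truncated stream; zlib.error a corrupt one.
        return False
```

Running this on each backend copy shows which replica is damaged without
relying on a known-good md5sum.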

All of our processes read, write, and modify files through the Gluster mount
point, whether via glusterfs or NFS, and never through the backend storage.
The only exception is the recent non-intrusive troubleshooting (getfattr and
stat) of backend storage.
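
A rough read-only sketch of that troubleshooting, for anyone who wants to
repeat it on their own bricks.  It assumes Linux and Python 3; the helper
names are hypothetical, and reading the trusted.* attributes on a brick
requires root.

```python
import os


def to_hex(value):
    """Format raw xattr bytes the way `getfattr -e hex` prints them."""
    return "0x" + value.hex()


def dump_xattrs(path):
    """Return {name: hex value} for every xattr visible to the caller (Linux only)."""
    return {name: to_hex(os.getxattr(path, name))
            for name in sorted(os.listxattr(path))}


def stat_summary(path):
    """Return the stat fields worth comparing between two replica copies."""
    st = os.stat(path)
    return {
        "size": st.st_size,
        "mode": oct(st.st_mode),
        "uid": st.st_uid,
        "gid": st.st_gid,
        "mtime": st.st_mtime,
    }
```

Comparing the two dictionaries from each brick reproduces what we saw: every
field matches even though the file contents do not.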

Thank you

On Wed, Sep 7, 2011 at 10:45 PM, Pranith Kumar K <pranithk at gluster.com> wrote:

> hi Anthony,
>       Thanks for the outputs. Nothing suspicious. When did you notice that
> the md5sums are not matching? Was it as soon as the file was created, or did
> something happen before the file ended up in this situation?
>
> Pranith.
>
> On 09/08/2011 02:10 AM, Anthony Delviscio wrote:
>
> Pranith, thank you for clarifying.
>
> The folder was moved earlier today but the issue of md5sums of the file
> being different still persists.
> The files are compressed and when decompressing the files, the file with
> the correct (or expected) md5sum decompresses without error.  However, the
> file with the incorrect md5sum doesn't decompress properly and cites CRC and
> length errors.
>
> The pastie output of stat and fattr of the directory housing the file can
> be found here:
> http://pastie.org/2499242
>
> Thank you
> Anthony
>
>  On Wed, Sep 7, 2011 at 1:24 PM, Pranith Kumar K <pranithk at gluster.com> wrote:
>
>>  hi Anthony,
>>     The parent directory is the directory that contains the file.
>>
>> Pranith
>>
>>
>> On 09/07/2011 08:18 PM, Anthony Delviscio wrote:
>>
>> Pranith, by parent directory, do you mean the directory that contains the
>> file or the top level directory of the brick?
>>
>>
>>
>> My gluster volume info:
>>
>> http://pastie.org/2493045
>>
>> The hostnames used in the gluster volume info are DNS hostnames that
>> resolve to 10Gb interfaces on the Gluster nodes.
>>
>> The hostname used in the mount options is a RR DNS hostname that resolves
>> to all eight Gluster nodes.
>>
>>
>>
>> Stat/md5sum/getfattr data of the identical files with different md5sums.
>>
>> http://pastie.org/2497461
>>
>>
>>
>> Thank you
>>
>>
>> On Wed, Sep 7, 2011 at 4:43 AM, Pranith Kumar K <pranithk at gluster.com> wrote:
>>
>>>  hi Anthony,
>>>       Could you send the output of getfattr -d -m . -e hex <filepath>
>>> on both the bricks, and also the stat output on both the backends? Give
>>> the outputs for its parent directory also.
>>>
>>> Pranith.
>>>
>>>
>>> On 09/07/2011 04:22 AM, Anthony Delviscio wrote:
>>>
>>>   I was wondering if anyone would be able to shed some light on how a
>>> file could end up with inconsistent md5sums on Gluster backend storage.
>>>
>>>
>>>
>>> Our configuration is running on Gluster v3.1.5 in a distribute-replicate
>>> setup consisting of 8 bricks.
>>>
>>> Our OS is Red Hat 5.6 x86_64.  Backend storage is an ext3 RAID 5.
>>>
>>>
>>>
>>> The 8 bricks are in RR DNS and are mounted for reading/writing via NFS
>>> automounts.
>>>
>>>
>>>
>>> When comparing md5sums of the file from two different NFS clients, they
>>> were different.
>>>
>>>
>>>
>>> The extended attributes of the files on backend storage are identical.  The
>>> file size and permissions are identical.  The stat data (excluding inode
>>> on backend storage file system) is identical.
>>>
>>> However, running md5sum on the two files results in two different
>>> md5sums.
>>>
>>>
>>>
>>> Copying both files to another location/server and running the md5sum also
>>> results in no change – they’re still different.
>>>
>>>
>>>
>>> Gluster logs do not show anything related to the filename in question.  Triggering
>>> a self-healing operation didn’t seem to do anything and it may have to do
>>> with the fact that the extended attributes are identical.
>>>
>>>
>>>
>>> If more information is required, let me know and I will try to
>>> accommodate.
>>>
>>> Thank you
>>>
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>>
>>>
>>>
>>
>>
>
>