[Gluster-users] Bitrot strange behavior

FNU Raghavendra Manjunath rabhat at redhat.com
Wed Apr 18 20:29:00 UTC 2018


Hi Cedric,

The 120 seconds gives a window for things to settle. For example, imagine
the following sequence of operations:

1) open file (fd1 as file descriptor)
2) modify the file via fd1
3) close the file descriptor (fd1)
4) Again open the file (fd2)
5) modify

In the above sequence, by the time the bitrot daemon tries to calculate the
signature after the first file descriptor (fd1) is closed, active I/O could
already be happening on the new file descriptor (fd2), and a signature
calculated while active I/O is in progress might not be correct. So in
gluster the bitrot daemon waits for 120 seconds after all the file
descriptors associated with a file are closed before signing it.

Concretely, here is what happens during those 120 seconds. When all the
file descriptors of a file are closed by the application, the brick
receives an operation called "release". On receiving the release, the
brick process sends a notification to the bitrot daemon that an object
(i.e. the file, with details about it) has been modified. The bitrot
daemon then waits for 120 seconds after receiving the notification. If
someone opens and modifies the file again before it is signed (i.e. within
that 120-second wait), the brick process lets the bitrot daemon know, so
the daemon won't attempt to sign the file while it is actively being
modified.
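The release-notification and wait logic described above can be sketched
roughly like this (a toy model for illustration, not gluster's actual
implementation; the class and method names here are invented):

```python
EXPIRY_TIME = 120  # seconds; gluster's features.expiry-time default


class SigningTracker:
    """Toy model of the brick -> bitrot-daemon signing handshake.

    On "release" (last fd closed) a signing timer starts; if the file
    is reopened before the timer expires, the pending signing is
    cancelled.
    """

    def __init__(self, expiry=EXPIRY_TIME):
        self.expiry = expiry
        self.release_time = {}  # path -> time of last release

    def on_release(self, path, now):
        # Brick saw the last fd close: schedule signing for this file.
        self.release_time[path] = now

    def on_open(self, path):
        # File opened (and possibly modified) again: cancel pending signing.
        self.release_time.pop(path, None)

    def files_to_sign(self, now):
        # Sign only files that have stayed quiet for the full expiry window.
        return [p for p, t in self.release_time.items()
                if now - t >= self.expiry]
```

With this model, a file released at t=0 and reopened at t=60 is never
signed from the first release; only a release followed by 120 quiet
seconds leads to signing.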

This value is configurable and can be changed with the following command:

"gluster volume set <volume name> features.expiry-time <value>"
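For example, to shrink the window to 1 second on a volume named vol1 and
then inspect the stored signature directly on a brick (the volume name and
brick path below are examples, substitute your own):

```shell
# Reduce the signing wait from the default 120s to 1s
gluster volume set vol1 features.expiry-time 1

# Verify the new value
gluster volume get vol1 features.expiry-time

# On a brick, inspect the signature stored as an extended attribute
getfattr -n trusted.bit-rot.signature -e hex /data/brick1/somefile
```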

But as you said, the signature comparison done by the scrubber is currently
local: while scrubbing, it calculates the checksum of the file and compares
it with the stored checksum (kept as an extended attribute) to determine
whether the object is corrupted.
So yes, if the object is corrupted before signing happens, the scrubber
currently has no mechanism to detect that.

Regards,
Raghavendra


On Wed, Apr 18, 2018 at 2:20 PM, Cedric Lemarchand <yipikai7 at gmail.com>
wrote:

> Hi Sweta,
>
> Thanks, this raises a few more questions:
>
> 1. What is the reason of delaying signature creation ?
>
> 2. Since a given file (replicated or dispersed) having different signatures
> across bricks is by definition an error, it would be good to detect that
> during a scrub, or with a different tool. Is something like this planned ?
>
> Cheers
>
>> Cédric Lemarchand
>
> On 18 Apr 2018, at 07:53, Sweta Anandpara <sanandpa at redhat.com> wrote:
>
> Hi Cedric,
>
> Any file is picked up for signing by the bitd process after the
> predetermined wait of 120 seconds. This default value is captured in the
> volume option 'features.expiry-time' and is configurable - in your case, it
> can be set to 0 or 1.
>
> Point 2 is correct. A file corrupted before the bitrot signature is
> generated will not be successfully detected by the scrubber. That would
> require admin/manual intervention to explicitly heal the corrupted file.
>
> -Sweta
>
> On 04/16/2018 10:42 PM, Cedric Lemarchand wrote:
>
> Hello,
>
> I am playing around with the bitrot feature and have some questions:
>
> 1. When a file is created, the "trusted.bit-rot.signature” attribute
> seems to be created only approximately 120 seconds after the file's
> creation (the cluster is idle and there is only one file living on it).
> Why ? Is there a way to make this attribute generated at the same time
> as the file creation ?
>
> 2. Corrupting a file (appending a 0 locally on a brick) before the
> creation of the "trusted.bit-rot.signature” does not produce any
> warning, even though its signature differs from the two other copies on
> the other bricks. Starting a scrub did not show up anything. I would
> have thought that Gluster compares signatures between bricks for this
> particular use case, but it seems the check is only local, so a file
> corrupted before its bitrot signature creation stays corrupted, and
> could thus be served to clients with bad data ?
>
> Gluster 3.12.8 on Debian Stretch, bricks on ext4.
>
> Volume Name: vol1
> Type: Replicate
> Volume ID: 85ccfaf2-5793-46f2-bd20-3f823b0a2232
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: gluster-01:/data/brick1
> Brick2: gluster-02:/data/brick2
> Brick3: gluster-03:/data/brick3
> Options Reconfigured:
> storage.build-pgfid: on
> performance.client-io-threads: off
> nfs.disable: on
> transport.address-family: inet
> features.bitrot: on
> features.scrub: Active
> features.scrub-throttle: aggressive
> features.scrub-freq: hourly
>
> Cheers,
>
> Cédric
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
>

