[Bugs] [Bug 1654370] New: Bitrot: Scrub status say file is corrupted even it was just created AND 'path' in the output is broken

bugzilla at redhat.com bugzilla at redhat.com
Wed Nov 28 15:52:29 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1654370

            Bug ID: 1654370
           Summary: Bitrot: Scrub status say file is corrupted even it was
                    just created AND 'path' in the output is broken
           Product: GlusterFS
           Version: 5
         Component: bitrot
          Severity: urgent
          Assignee: bugs at gluster.org
          Reporter: david.spisla at iternity.com
                CC: bugs at gluster.org
      Docs Contact: bugs at gluster.org



Description of problem:

'gluster vo bitrot <volume> scrub status' shows a file as corrupted even though
it was just created. The path of that file is also broken in the output: the
suffix "rusted.gfid" is appended to the file name. "trusted.gfid" is the name of
an extended attribute. See below for details.
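
To make the broken 'path' concrete, here is a minimal Python sketch that
detects the stray suffix in a reported path and recovers the likely file name.
The helper name strip_xattr_residue and the constant RESIDUE are hypothetical;
the sketch only handles the exact "rusted.gfid" residue seen in this report and
is not part of Gluster.

```python
# Hypothetical helper, not part of Gluster: undo the stray suffix seen in
# the 'path' field of the scrub status output in this report.
RESIDUE = "rusted.gfid"  # "trusted.gfid" without its leading "t"


def strip_xattr_residue(reported_path: str) -> str:
    """Return the path without the trailing residue, if it is present."""
    if reported_path.endswith(RESIDUE):
        return reported_path[: -len(RESIDUE)]
    return reported_path


if __name__ == "__main__":
    print(strip_xattr_residue("/data/file1.txtrusted.gfid"))
```

This is only a display workaround for reading the output; it says nothing
about why the suffix is appended.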


Version-Release number of selected component (if applicable): 
Gluster v5.1


How reproducible:

Write a file in a freshly created FUSE mount of a volume. Wait until it has
been marked, then run "scrub ondemand" and "scrub status". Repeat with a second
file: both files are affected and marked as corrupted, which seems very
unlikely for freshly written files.

Steps to Reproduce:

1. fs-davids-c1-n1:# echo file1 >> /gluster/archives/archive1/data/file1.txt

2. fs-davids-c1-n1:# getfattr -d -m "" /gluster/brick1/glusterbrick/data/file1.txt
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick1/glusterbrick/data/file1.txt
trusted.afr.dirty=0sAAAAAAAAAAAAAAAA
trusted.bit-rot.signature=0sAQIAAAAAAAAA7NxVNvc72uiBbw6kBybvXpuBDZFEkwdZA7uQYj2Xsdg=
trusted.bit-rot.version=0sAgAAAAAAAABb/qsZAAmLDQ==
trusted.gfid=0sXR/JcssRRYaslDCny8BpDw==
trusted.gfid2path.b7b5820f548c9129="f49da84b-d63c-40f5-9ed4-e66a262c671b/file1.txt"
trusted.glusterfs.mdata=0sAQAAAAAAAAAAAAAAAFv+rY0AAAAAJDHeIQAAAABb/q2NAAAAACQx3iEAAAAAW/6tjQAAAAAaza+K
trusted.pgfid.f49da84b-d63c-40f5-9ed4-e66a262c671b=0sAAAAAQ==
trusted.start_time="1543417229"
trusted.worm_file=0sMQA=

3. fs-davids-c1-n1:# gluster vo bitrot archive1 scrub status

Volume name : archive1

State of scrub: Active (Idle)

Scrub impact: lazy

Scrub frequency: daily

Bitrot error log location: /var/log/glusterfs/bitd.log

Scrubber error log location: /var/log/glusterfs/scrub.log


=========================================================

Node: localhost

Number of Scrubbed files: 0

Number of Skipped files: 0

Last completed scrub time: Scrubber pending to complete.

Duration of last scrub (D:M:H:M:S): 0:0:0:0

Error count: 0


=========================================================

Node: fs-davids-c1-n2

Number of Scrubbed files: 0

Number of Skipped files: 0

Last completed scrub time: Scrubber pending to complete.

Duration of last scrub (D:M:H:M:S): 0:0:0:0

Error count: 0


=========================================================

Node: fs-davids-c1-n4

Number of Scrubbed files: 0

Number of Skipped files: 0

Last completed scrub time: Scrubber pending to complete.

Duration of last scrub (D:M:H:M:S): 0:0:0:0

Error count: 0


=========================================================

Node: fs-davids-c1-n3

Number of Scrubbed files: 0

Number of Skipped files: 0

Last completed scrub time: Scrubber pending to complete.

Duration of last scrub (D:M:H:M:S): 0:0:0:0

Error count: 0

=========================================================

4. fs-davids-c1-n1:# gluster vo bitrot archive1 scrub ondemand
volume bitrot: scrubber started ondemand for volume archive1
5. fs-davids-c1-n1:# gluster vo bitrot archive1 scrub status

Volume name : archive1

State of scrub: Active (Idle)

Scrub impact: lazy

Scrub frequency: daily

Bitrot error log location: /var/log/glusterfs/bitd.log

Scrubber error log location: /var/log/glusterfs/scrub.log


=========================================================

Node: localhost

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:07:53

Duration of last scrub (D:M:H:M:S): 0:0:0:2

Error count: 1

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n2

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:07:53

Duration of last scrub (D:M:H:M:S): 0:0:0:2

Error count: 1

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n4

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:07:53

Duration of last scrub (D:M:H:M:S): 0:0:0:2

Error count: 1

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n3

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:07:53

Duration of last scrub (D:M:H:M:S): 0:0:0:2

Error count: 1

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid

=========================================================

6. fs-davids-c1-n1:# echo file2 >> /gluster/archives/archive1/data/file2.txt
7. fs-davids-c1-n1:# getfattr -d -m "" /gluster/brick1/glusterbrick/data/file2.txt
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick1/glusterbrick/data/file2.txt
trusted.afr.dirty=0sAAAAAAAAAAAAAAAA
trusted.bit-rot.signature=0sAQIAAAAAAAAAZ+5UeOqtsDS6WZROuXd5e0nKaqjTV0WH8268vutl9w4=
trusted.bit-rot.version=0sAgAAAAAAAABb/qsZAAmLDQ==
trusted.gfid=0szJ3EL9TvSseLHkZOmdw3HQ==
trusted.gfid2path.d3c3e99ef3917352="f49da84b-d63c-40f5-9ed4-e66a262c671b/file2.txt"
trusted.glusterfs.mdata=0sAQAAAAAAAAAAAAAAAFv+sTIAAAAAN+ztgwAAAABb/rEyAAAAADfs7YMAAAAAW/6xMgAAAAA3uxck
trusted.pgfid.f49da84b-d63c-40f5-9ed4-e66a262c671b=0sAAAAAQ==
trusted.start_time="1543418162"
trusted.worm_file=0sMQA=

8. fs-davids-c1-n1:# gluster vo bitrot archive1 scrub status

Volume name : archive1

State of scrub: Active (Idle)

Scrub impact: lazy

Scrub frequency: daily

Bitrot error log location: /var/log/glusterfs/bitd.log

Scrubber error log location: /var/log/glusterfs/scrub.log


=========================================================

Node: localhost

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:07:53

Duration of last scrub (D:M:H:M:S): 0:0:0:2

Error count: 1

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n2

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:07:53

Duration of last scrub (D:M:H:M:S): 0:0:0:2

Error count: 1

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n4

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:07:53

Duration of last scrub (D:M:H:M:S): 0:0:0:2

Error count: 1

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n3

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:07:53

Duration of last scrub (D:M:H:M:S): 0:0:0:2

Error count: 1

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid

=========================================================

9. fs-davids-c1-n1:# gluster vo bitrot archive1 scrub ondemand
volume bitrot: scrubber started ondemand for volume archive1
10. fs-davids-c1-n1:# gluster vo bitrot archive1 scrub status
Volume name : archive1

State of scrub: Active (Idle)

Scrub impact: lazy

Scrub frequency: daily

Bitrot error log location: /var/log/glusterfs/bitd.log

Scrubber error log location: /var/log/glusterfs/scrub.log


=========================================================

Node: localhost

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:19:06

Duration of last scrub (D:M:H:M:S): 0:0:0:3

Error count: 2

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid

cc9dc42f-d4ef-4ac7-8b1e-464e99dc371d ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file2.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n3

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:19:06

Duration of last scrub (D:M:H:M:S): 0:0:0:3

Error count: 2

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid

cc9dc42f-d4ef-4ac7-8b1e-464e99dc371d ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file2.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n2

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:19:06

Duration of last scrub (D:M:H:M:S): 0:0:0:3

Error count: 2

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid

cc9dc42f-d4ef-4ac7-8b1e-464e99dc371d ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file2.txtrusted.gfid


=========================================================

Node: fs-davids-c1-n4

Number of Scrubbed files: 1

Number of Skipped files: 0

Last completed scrub time: 2018-11-28 15:19:06

Duration of last scrub (D:M:H:M:S): 0:0:0:3

Error count: 2

Corrupted object's [GFID]:

5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid

cc9dc42f-d4ef-4ac7-8b1e-464e99dc371d ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file2.txtrusted.gfid

=========================================================
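
The GFIDs in the status output can be cross-checked against the trusted.gfid
values from steps 2 and 7 (the "0s" prefix in getfattr output marks a
base64-encoded value). The following Python sketch is only an illustration:
parse_corrupted and gfid_from_xattr are hypothetical helper names, and the
regular expression assumes the exact layout printed above.

```python
import base64
import re
import uuid

# Matches one corrupted-object entry: the GFID/brick line followed by the
# indented "path:" line, as printed in the scrub status output above.
ENTRY_RE = re.compile(
    r"^(?P<gfid>[0-9a-f-]{36}) ==> BRICK: (?P<brick>\S+)\n\s*path: (?P<path>.+)$",
    re.MULTILINE,
)


def parse_corrupted(status_text: str) -> list:
    """Extract corrupted-object entries (as dicts) from scrub status text."""
    return [m.groupdict() for m in ENTRY_RE.finditer(status_text)]


def gfid_from_xattr(value: str) -> str:
    """Decode a base64 getfattr value ('0s' prefix) into a UUID string."""
    if not value.startswith("0s"):
        raise ValueError("expected a base64 value with the '0s' prefix")
    return str(uuid.UUID(bytes=base64.b64decode(value[2:])))


# Excerpt copied from the step 10 output for one node.
SAMPLE = """\
5d1fc972-cb11-4586-ac94-30a7cbc0690f ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file1.txtrusted.gfid

cc9dc42f-d4ef-4ac7-8b1e-464e99dc371d ==> BRICK: /gluster/brick1/glusterbrick
 path: /data/file2.txtrusted.gfid
"""

# trusted.gfid values copied from the getfattr output in steps 2 and 7.
XATTR_GFIDS = {
    "file1.txt": "0sXR/JcssRRYaslDCny8BpDw==",
    "file2.txt": "0szJ3EL9TvSseLHkZOmdw3HQ==",
}

if __name__ == "__main__":
    reported = {e["gfid"]: e["path"] for e in parse_corrupted(SAMPLE)}
    for name, value in XATTR_GFIDS.items():
        gfid = gfid_from_xattr(value)
        print(name, gfid, "->", reported.get(gfid, "not reported"))
```

With the values from this report, both decoded GFIDs appear among the reported
entries, so the scrubber is pointing at the two new files and only the 'path'
text is mangled.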


Actual results:
Both files are marked as corrupted, but this seems very unlikely for freshly
created files. The 'path' in the output has the suffix "rusted.gfid", which is
part of "trusted.gfid", the name of an extended attribute.
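
One way to check independently whether a file really is corrupted would be to
compare its content with the digest stored in trusted.bit-rot.signature. The
sketch below rests on two assumptions that this report does not confirm: that
the xattr value consists of a 9-byte header (1-byte signature type plus 8-byte
version) followed by the digest, and that the digest is a plain SHA-256 of the
file data. The function names are hypothetical.

```python
import base64
import hashlib

HEADER_LEN = 9  # ASSUMPTION: 1-byte signature type + 8-byte version


def stored_digest(xattr_value: str) -> bytes:
    """Return the digest part of a base64 ('0s') bit-rot signature value."""
    raw = base64.b64decode(xattr_value[2:])
    return raw[HEADER_LEN:]


def content_matches(xattr_value: str, content: bytes) -> bool:
    """Compare the stored digest with the SHA-256 of the given content.

    ASSUMPTION: the signer stores a plain SHA-256 of the file data.
    """
    return stored_digest(xattr_value) == hashlib.sha256(content).digest()


if __name__ == "__main__":
    # trusted.bit-rot.signature of file1.txt, copied from step 2.
    sig = "0sAQIAAAAAAAAA7NxVNvc72uiBbw6kBybvXpuBDZFEkwdZA7uQYj2Xsdg="
    print(len(stored_digest(sig)))           # digest length in bytes
    print(content_matches(sig, b"file1\n"))  # 'echo file1' writes "file1\n"
```

If the comparison succeeds for a file that scrub status lists as corrupted,
that would point to a false positive in the scrubber rather than to real data
damage.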

Expected results:
Files should not be marked as corrupted, and if a file really is corrupted, the
output should show the correct 'path'.
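
For comparison, the trusted.gfid2path.* attribute shown in step 2 already
records the file name next to the GFID of the parent directory. A minimal
Python sketch, assuming the value always has the form
"<parent-gfid>/<basename>" as it does in this report; split_gfid2path is a
hypothetical helper name.

```python
def split_gfid2path(value: str) -> tuple:
    """Split a gfid2path value into (parent directory GFID, file name)."""
    parent_gfid, _, basename = value.partition("/")
    if not basename:
        raise ValueError("expected '<parent-gfid>/<basename>'")
    return parent_gfid, basename


if __name__ == "__main__":
    # Value of trusted.gfid2path.b7b5820f548c9129 from step 2.
    parent, name = split_gfid2path(
        "f49da84b-d63c-40f5-9ed4-e66a262c671b/file1.txt"
    )
    print(parent, name)
```

The file name obtained this way is "file1.txt", without any xattr residue.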

Additional info:

fs-davids-c1-n1:/home/iternity # gluster vo info archive1

Volume Name: archive1
Type: Replicate
Volume ID: 6f238af7-5fef-4bea-bb59-ac4e9ea5bf3a
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: fs-davids-c1-n1:/gluster/brick1/glusterbrick
Brick2: fs-davids-c1-n2:/gluster/brick1/glusterbrick
Brick3: fs-davids-c1-n3:/gluster/brick1/glusterbrick
Brick4: fs-davids-c1-n4:/gluster/brick1/glusterbrick
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
user.smb: disable
features.read-only: off
features.worm: off
features.worm-file-level: on
features.retention-mode: enterprise
features.default-retention-period: 120
network.ping-timeout: 10
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.nl-cache: on
performance.nl-cache-timeout: 600
client.event-threads: 32
server.event-threads: 32
cluster.lookup-optimize: on
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
performance.cache-samba-metadata: on
performance.cache-ima-xattrs: on
performance.io-thread-count: 64
cluster.use-compound-fops: on
performance.cache-size: 512MB
performance.cache-refresh-timeout: 10
performance.read-ahead: off
performance.write-behind-window-size: 4MB
performance.write-behind: on
storage.build-pgfid: on
auth.ssl-allow: *
client.ssl: on
server.ssl: on
features.utime: on
storage.ctime: on
features.bitrot: on
features.scrub: Active
features.scrub-freq: daily
cluster.enable-shared-storage: enable
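
When comparing this setup with another volume, it may help to pull the
reconfigured options out of the 'gluster vo info' text. The Python sketch below
is an illustration only: parse_options is a hypothetical helper, and it assumes
the "key: value" layout shown above. The sample is a shortened excerpt of that
output.

```python
def parse_options(info_text: str) -> dict:
    """Collect 'key: value' lines that follow 'Options Reconfigured:'."""
    options = {}
    in_options = False
    for line in info_text.splitlines():
        if line.strip() == "Options Reconfigured:":
            in_options = True
            continue
        if in_options and ": " in line:
            key, value = line.split(": ", 1)
            options[key.strip()] = value.strip()
    return options


# Shortened excerpt of the volume info above.
SAMPLE_INFO = """\
Volume Name: archive1
Type: Replicate
Options Reconfigured:
features.worm-file-level: on
features.utime: on
storage.ctime: on
features.bitrot: on
features.scrub: Active
features.scrub-freq: daily
"""

if __name__ == "__main__":
    opts = parse_options(SAMPLE_INFO)
    for key in ("features.bitrot", "features.scrub", "features.worm-file-level",
                "storage.ctime"):
        print(key, "=", opts.get(key))
```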
