[Bugs] [Bug 1357975] New: [Bitrot+Sharding] Scrub status shows incorrect values for 'files scrubbed' and 'files skipped'
bugzilla at redhat.com
Tue Jul 19 17:41:13 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1357975
Bug ID: 1357975
Summary: [Bitrot+Sharding] Scrub status shows incorrect values
for 'files scrubbed' and 'files skipped'
Product: GlusterFS
Version: 3.8.1
Component: bitrot
Keywords: ZStream
Severity: medium
Assignee: bugs at gluster.org
Reporter: khiremat at redhat.com
CC: bugs at gluster.org, khiremat at redhat.com,
mzywusko at redhat.com, rhinduja at redhat.com,
sanandpa at redhat.com
Depends On: 1337450, 1356851, 1357973
Docs Contact: bugs at gluster.org
+++ This bug was initially created as a clone of Bug #1357973 +++
+++ This bug was initially created as a clone of Bug #1356851 +++
+++ This bug was initially created as a clone of Bug #1337450 +++
Description of problem:
========================
In a sharded volume, where a file is split into multiple shard blocks, the
scrubber validates every file and each of its shards, but it increments the
counter once per shard instead of once per file. This is reflected in the
scrub status output for the fields 'files scrubbed' and 'files skipped',
which is misleading: the reported numbers are much larger than the total
number of files created.
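For illustration, assuming the default features.shard-block-size of 64MB
(the block size is not shown in the volume options below), a single 4GB file
is stored on the bricks as one base file plus 63 shard blocks under the
internal .shard directory, so a per-shard counter reports 64 scrubbed entries
for that one user file:

    # 4096MB / 64MB per block = 64 pieces per 4GB file
    # (1 base file + 63 blocks under the hidden .shard directory on the brick)
    ls /bricks/brick1/ozone/.shard | wc -l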
Version-Release number of selected component (if applicable):
===========================================================
How reproducible:
=================
Always
Steps to Reproduce:
=====================
1. Have a distributed-replicate volume with bitrot and sharding enabled.
2. Create 100 1MB files and validate the scrub status output after the scrub run completes.
3. Create 5 4GB files and wait for the next scrub run.
4. Validate the scrub status output after the scrubber has finished running (a reproduction sketch follows below).
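A minimal reproduction sketch for steps 2-4, assuming the volume 'ozone' is
FUSE-mounted at /mnt/ozone (directory names taken from the client logs below):

    mkdir -p /mnt/ozone/1m_files /mnt/ozone/4g_files
    for i in $(seq 1 100); do
        dd if=/dev/urandom of=/mnt/ozone/1m_files/file_$i bs=1M count=1
    done
    for i in $(seq 1 5); do
        dd if=/dev/urandom of=/mnt/ozone/4g_files/file_$i bs=1M count=4096
    done
    # scrub-freq is hourly on this volume, so wait for the next run, then:
    gluster volume bitrot ozone scrub status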
Actual results:
================
'files scrubbed' and 'files skipped' show numbers much larger than the total
number of files created.
Expected results:
=================
Both fields should reflect the number of files actually created, not the
number of shards.
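As a rough cross-check (assuming direct access to the bricks), the per-node
'Number of Scrubbed files' can be compared against the count of regular files
on that node's bricks, excluding GlusterFS-internal directories:

    find /bricks/brick*/ozone \
         -path '*/.shard' -prune -o \
         -path '*/.glusterfs' -prune -o \
         -path '*/.trashcan' -prune -o \
         -type f -print | wc -l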
Additional info:
==================
[root at dhcp35-210 ~]#
[root at dhcp35-210 ~]# rpm -qa | grep gluster
glusterfs-client-xlators-3.7.9-4.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-libs-3.7.9-4.el7rhgs.x86_64
glusterfs-api-3.7.9-4.el7rhgs.x86_64
gluster-nagios-addons-0.2.7-1.el7rhgs.x86_64
python-gluster-3.7.5-19.el7rhgs.noarch
glusterfs-3.7.9-4.el7rhgs.x86_64
glusterfs-cli-3.7.9-4.el7rhgs.x86_64
glusterfs-server-3.7.9-4.el7rhgs.x86_64
glusterfs-fuse-3.7.9-4.el7rhgs.x86_64
[root at dhcp35-210 ~]#
[root at dhcp35-210 ~]#
[root at dhcp35-210 ~]# gluster peer status
Number of Peers: 3
Hostname: 10.70.35.85
Uuid: c9550322-c0ef-45e6-ad20-f38658a5ce54
State: Peer in Cluster (Connected)
Hostname: 10.70.35.137
Uuid: 35426000-dad1-416f-b145-f25049f5036e
State: Peer in Cluster (Connected)
Hostname: 10.70.35.13
Uuid: a756f3da-7896-4970-a77d-4829e603f773
State: Peer in Cluster (Connected)
[root at dhcp35-210 ~]#
[root at dhcp35-210 ~]# gluster v info
Volume Name: ozone
Type: Distributed-Replicate
Volume ID: d79e220b-acde-4d13-b9d5-f37ec741c117
Status: Started
Number of Bricks: 3 x 3 = 9
Transport-type: tcp
Bricks:
Brick1: 10.70.35.210:/bricks/brick1/ozone
Brick2: 10.70.35.85:/bricks/brick1/ozone
Brick3: 10.70.35.137:/bricks/brick1/ozone
Brick4: 10.70.35.210:/bricks/brick2/ozone
Brick5: 10.70.35.85:/bricks/brick2/ozone
Brick6: 10.70.35.137:/bricks/brick2/ozone
Brick7: 10.70.35.210:/bricks/brick3/ozone
Brick8: 10.70.35.85:/bricks/brick3/ozone
Brick9: 10.70.35.137:/bricks/brick3/ozone
Options Reconfigured:
features.shard: on
features.scrub-throttle: normal
features.scrub-freq: hourly
features.scrub: Active
features.bitrot: on
performance.readdir-ahead: on
[root at dhcp35-210 ~]#
[root at dhcp35-210 ~]# gluster v status
Status of volume: ozone
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 10.70.35.210:/bricks/brick1/ozone 49152 0 Y 3255
Brick 10.70.35.85:/bricks/brick1/ozone 49152 0 Y 15549
Brick 10.70.35.137:/bricks/brick1/ozone 49152 0 Y 32158
Brick 10.70.35.210:/bricks/brick2/ozone 49153 0 Y 3261
Brick 10.70.35.85:/bricks/brick2/ozone 49153 0 Y 15557
Brick 10.70.35.137:/bricks/brick2/ozone 49153 0 Y 32164
Brick 10.70.35.210:/bricks/brick3/ozone 49154 0 Y 3270
Brick 10.70.35.85:/bricks/brick3/ozone 49154 0 Y 15564
Brick 10.70.35.137:/bricks/brick3/ozone 49154 0 Y 32171
NFS Server on localhost 2049 0 Y 24614
Self-heal Daemon on localhost N/A N/A Y 3248
Bitrot Daemon on localhost N/A N/A Y 8545
Scrubber Daemon on localhost N/A N/A Y 8551
NFS Server on 10.70.35.13 2049 0 Y 6082
Self-heal Daemon on 10.70.35.13 N/A N/A Y 21680
Bitrot Daemon on 10.70.35.13 N/A N/A N N/A
Scrubber Daemon on 10.70.35.13 N/A N/A N N/A
NFS Server on 10.70.35.85 2049 0 Y 9515
Self-heal Daemon on 10.70.35.85 N/A N/A Y 15542
Bitrot Daemon on 10.70.35.85 N/A N/A Y 18642
Scrubber Daemon on 10.70.35.85 N/A N/A Y 18648
NFS Server on 10.70.35.137 2049 0 Y 26213
Self-heal Daemon on 10.70.35.137 N/A N/A Y 32153
Bitrot Daemon on 10.70.35.137 N/A N/A Y 2919
Scrubber Daemon on 10.70.35.137 N/A N/A Y 2925
Task Status of Volume ozone
------------------------------------------------------------------------------
There are no active volume tasks
[root at dhcp35-210 ~]#
[root at dhcp35-210 ~]# gluster v bitrot ozone scrub status
Volume name : ozone
State of scrub: Active
Scrub impact: normal
Scrub frequency: hourly
Bitrot error log location: /var/log/glusterfs/bitd.log
Scrubber error log location: /var/log/glusterfs/scrub.log
=========================================================
Node: localhost
Number of Scrubbed files: 4930
Number of Skipped files: 0
Last completed scrub time: 2016-05-19 07:40:18
Duration of last scrub (D:M:H:M:S): 0:0:30:35
Error count: 1
Corrupted object's [GFID]:
2be8fc38-db5e-464b-b741-616377994cc8
=========================================================
Node: 10.70.35.85
Number of Scrubbed files: 5139
Number of Skipped files: 0
Last completed scrub time: 2016-05-19 08:49:49
Duration of last scrub (D:M:H:M:S): 0:0:29:39
Error count: 1
Corrupted object's [GFID]:
ce5e7a94-cba6-4e65-a7bb-82b1ec396eef
=========================================================
Node: 10.70.35.137
Number of Scrubbed files: 5138
Number of Skipped files: 0
Last completed scrub time: 2016-05-19 09:02:46
Duration of last scrub (D:M:H:M:S): 0:0:31:57
Error count: 0
=========================================================
[root at dhcp35-210 ~]#
=============
CLIENT LOGS
==============
[root at dhcp35-30 ~]#
[root at dhcp35-30 ~]# cd /mnt/ozone
[root at dhcp35-30 ozone]# df -k .
Filesystem 1K-blocks Used Available Use% Mounted on
10.70.35.137:/ozone 62553600 21098496 41455104 34% /mnt/ozone
[root at dhcp35-30 ozone]#
[root at dhcp35-30 ozone]#
[root at dhcp35-30 ozone]# ls -a
. .. 1m_files 4g_files .trashcan
[root at dhcp35-30 ozone]#
[root at dhcp35-30 ozone]#
[root at dhcp35-30 ozone]# ls -l 1m_files/ | wc -l
21
[root at dhcp35-30 ozone]# ls -l 4g_files/ | wc -l
6
[root at dhcp35-30 ozone]#
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1337450
[Bug 1337450] [Bitrot+Sharding] Scrub status shows incorrect values for
'files scrubbed' and 'files skipped'
https://bugzilla.redhat.com/show_bug.cgi?id=1356851
[Bug 1356851] [Bitrot+Sharding] Scrub status shows incorrect values for
'files scrubbed' and 'files skipped'
https://bugzilla.redhat.com/show_bug.cgi?id=1357973
[Bug 1357973] [Bitrot+Sharding] Scrub status shows incorrect values for
'files scrubbed' and 'files skipped'
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.
You are the Docs Contact for the bug.