[Gluster-users] 3.8.3 Bitrot signature process

Amudhan P amudhan83 at gmail.com
Thu Sep 22 13:27:55 UTC 2016


Hi Kotresh,

I have raised a bug for this:

https://bugzilla.redhat.com/show_bug.cgi?id=1378466

Thanks
Amudhan
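
For anyone repeating the fd check discussed in the thread below, here is a minimal sketch (not part of GlusterFS) of how a trusted.gfid value maps to the hardlink under .glusterfs, and how the fds of a brick pid could be scanned for that hardlink. The layout .glusterfs/<first two hex chars>/<next two>/<uuid> is taken from the paths quoted in this thread. The function names are hypothetical, and reading /proc/<pid>/fd for another user's process normally needs root.

```python
import os
import uuid


def gfid_backend_path(brick_root, gfid_hex):
    """Map a trusted.gfid value (as printed by getfattr -e hex) to the
    hardlink path under the brick's .glusterfs directory."""
    hex_digits = gfid_hex[2:] if gfid_hex.lower().startswith("0x") else gfid_hex
    gfid = str(uuid.UUID(hex=hex_digits))  # canonical 8-4-4-4-12 form
    return os.path.join(brick_root, ".glusterfs", gfid[0:2], gfid[2:4], gfid)


def open_fds_for(pid, target_path):
    """Return the fd numbers of process `pid` whose /proc symlink
    points at target_path (hypothetical helper, Linux only)."""
    fd_dir = "/proc/%d/fd" % pid
    matches = []
    for name in os.listdir(fd_dir):
        try:
            link = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # fd was closed between listdir() and readlink()
        if link == target_path:
            matches.append(int(name))
    return sorted(matches)


if __name__ == "__main__":
    # gfid and brick root as quoted in the thread for test58-bs10M-c100.nul
    print(gfid_backend_path("/media/disk2/brick2",
                            "0x6e7c49e6094e443585bff21f99fd8764"))
```

Calling open_fds_for(<brick pid>, <that path>) repeatedly would show when the brick process opens and closes the file, which is the observation requested in the thread.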

On Thu, Sep 22, 2016 at 2:45 PM, Kotresh Hiremath Ravishankar <
khiremat at redhat.com> wrote:

> Hi Amudhan,
>
> As of now it is hard-coded, based on some testing results. That part is not
> tunable yet; only scrubber throttling is tunable. As I mentioned, because the
> brick process has an open fd, the bitrot signer process is not picking it up
> for scrubbing. Please raise a bug and we will take a look at it.
>
> Thanks and Regards,
> Kotresh H R
>
> ----- Original Message -----
> > From: "Amudhan P" <amudhan83 at gmail.com>
> > To: "Kotresh Hiremath Ravishankar" <khiremat at redhat.com>
> > Cc: "Gluster Users" <gluster-users at gluster.org>
> > Sent: Thursday, September 22, 2016 2:37:25 PM
> > Subject: Re: 3.8.3 Bitrot signature process
> >
> > Hi Kotresh,
> >
> > The behaviour is the same on a replicated volume: the file's fd opens in
> > the brick pid after 120 seconds.
> >
> > Calculating the signature for a 100MB file took 15m57s.
> >
> >
> > How can I increase CPU usage? In your earlier mail you said "To limit
> > the usage of CPU, throttling is done using token bucket algorithm".
> > Is there any possibility of increasing the bitrot hash calculation speed?
> >
> >
> > Thanks,
> > Amudhan
> >
> >
> > On Thu, Sep 22, 2016 at 11:44 AM, Kotresh Hiremath Ravishankar <
> > khiremat at redhat.com> wrote:
> >
> > > Hi Amudhan,
> > >
> > > Thanks for the confirmation. If that's the case, please try with a
> > > dist-rep volume and see if you observe similar behavior.
> > >
> > > In any case, please raise a bug for this with your observations. We
> > > will work on it.
> > >
> > > Thanks and Regards,
> > > Kotresh H R
> > >
> > > ----- Original Message -----
> > > > From: "Amudhan P" <amudhan83 at gmail.com>
> > > > To: "Kotresh Hiremath Ravishankar" <khiremat at redhat.com>
> > > > Cc: "Gluster Users" <gluster-users at gluster.org>
> > > > Sent: Thursday, September 22, 2016 11:25:28 AM
> > > > Subject: Re: 3.8.3 Bitrot signature process
> > > >
> > > > Hi Kotresh,
> > > >
> > > > 2280 is a brick process. I have not tried with a dist-rep volume.
> > > >
> > > > I have not seen any fd in the bitd process on any of the nodes, and the
> > > > bitd process is always at 0% CPU, occasionally rising to 0.3% CPU.
> > > >
> > > >
> > > >
> > > > Thanks,
> > > > Amudhan
> > > >
> > > > On Thursday, September 22, 2016, Kotresh Hiremath Ravishankar <
> > > > khiremat at redhat.com> wrote:
> > > > > Hi Amudhan,
> > > > >
> > > > > No, the bitrot signer is a separate process and is not part of the
> > > > > brick process. I believe process 2280 is a brick process? Did you
> > > > > check with a dist-rep volume? Is the same behavior being observed
> > > > > there as well? We need to figure out why the brick process is
> > > > > holding that fd for such a long time.
> > > > >
> > > > > Thanks and Regards,
> > > > > Kotresh H R
> > > > >
> > > > > ----- Original Message -----
> > > > >> From: "Amudhan P" <amudhan83 at gmail.com>
> > > > >> To: "Kotresh Hiremath Ravishankar" <khiremat at redhat.com>
> > > > >> Sent: Wednesday, September 21, 2016 8:15:33 PM
> > > > >> Subject: Re: [Gluster-users] 3.8.3 Bitrot signature process
> > > > >>
> > > > >> Hi Kotresh,
> > > > >>
> > > > > >> As soon as the fd closes in the brick1 pid, I can see the bitrot
> > > > > >> signature for the file in the brick.
> > > > > >>
> > > > > >> So it looks like the fd is opened by the brick process to calculate
> > > > > >> the signature.
> > > > >>
> > > > >> output of the file:
> > > > >>
> > > > >> -rw-r--r-- 2 root root 250M Sep 21 18:32
> > > > >> /media/disk1/brick1/data/G/test59-bs10M-c100.nul
> > > > >>
> > > > >> getfattr: Removing leading '/' from absolute path names
> > > > >> # file: media/disk1/brick1/data/G/test59-bs10M-c100.nul
> > > > > >> trusted.bit-rot.signature=0x010200000000000000e9474e4cc673c0c227a6e807e04aa4ab1f88d3744243950a290869c53daa65df
> > > > >> trusted.bit-rot.version=0x020000000000000057d6af3200012a13
> > > > >> trusted.ec.config=0x0000080501000200
> > > > >> trusted.ec.size=0x000000003e800000
> > > > >> trusted.ec.version=0x0000000000001f400000000000001f40
> > > > >> trusted.gfid=0x4c091145429448468fffe358482c63e1
> > > > >>
> > > > >> stat /media/disk1/brick1/data/G/test59-bs10M-c100.nul
> > > > >>   File: ‘/media/disk1/brick1/data/G/test59-bs10M-c100.nul’
> > > > > >>   Size: 262144000       Blocks: 512000     IO Block: 4096   regular file
> > > > > >> Device: 811h/2065d      Inode: 402653311   Links: 2
> > > > > >> Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
> > > > >> Access: 2016-09-21 18:34:43.722712751 +0530
> > > > >> Modify: 2016-09-21 18:32:41.650712946 +0530
> > > > >> Change: 2016-09-21 19:14:41.698708914 +0530
> > > > >>  Birth: -
> > > > >>
> > > > >>
> > > > > >> On the other 2 bricks in the same set, the signature is still not
> > > > > >> updated for the same file.
> > > > >>
> > > > >>
> > > > >> On Wed, Sep 21, 2016 at 6:48 PM, Amudhan P <amudhan83 at gmail.com>
> > > wrote:
> > > > >>
> > > > >> > Hi Kotresh,
> > > > >> >
> > > > > >> > I am very sure no read was going on from the mount point.
> > > > > >> >
> > > > > >> > I ran the same test again, but after writing the data to the mount
> > > > > >> > point I unmounted it.
> > > > > >> >
> > > > > >> > After 120 seconds I see this file's fd entry in the brick 1 pid:
> > > > >> >
> > > > >> > getfattr -m. -e hex -d test59-bs10
> > > > >> > # file: test59-bs10M-c100.nul
> > > > >> > trusted.bit-rot.version=0x020000000000000057bed574000ed534
> > > > >> > trusted.ec.config=0x0000080501000200
> > > > >> > trusted.ec.size=0x000000003e800000
> > > > >> > trusted.ec.version=0x0000000000001f400000000000001f40
> > > > >> > trusted.gfid=0x4c091145429448468fffe358482c63e1
> > > > >> >
> > > > >> >
> > > > >> > ls -l /proc/2280/fd
> > > > > >> > lr-x------ 1 root root 64 Sep 21 13:08 19 -> /media/disk1/brick1/.glusterfs/4c/09/4c091145-4294-4846-8fff-e358482c63e1
> > > > >> >
> > > > > >> > The volume is EC 4+1.
> > > > >> >
> > > > >> > On Wed, Sep 21, 2016 at 6:17 PM, Kotresh Hiremath Ravishankar <
> > > > >> > khiremat at redhat.com> wrote:
> > > > >> >
> > > > >> >> Hi Amudhan,
> > > > >> >>
> > > > > >> >> As the ls output shows, some process has an fd open in the backend.
> > > > > >> >> That is the reason bitrot is not considering the file for signing.
> > > > > >> >> Could you please check whether the signing happens 120 secs after
> > > > > >> >> the closure of
> > > > > >> >> "/media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435-85bf-f21f99fd8764"?
> > > > > >> >> If so, we need to figure out who holds this fd for such a long time.
> > > > > >> >> We also need to figure out whether this issue is specific to EC
> > > > > >> >> volumes.
> > > > >> >>
> > > > >> >> Thanks and Regards,
> > > > >> >> Kotresh H R
> > > > >> >>
> > > > >> >> ----- Original Message -----
> > > > >> >> > From: "Amudhan P" <amudhan83 at gmail.com>
> > > > >> >> > To: "Kotresh Hiremath Ravishankar" <khiremat at redhat.com>
> > > > >> >> > Cc: "Gluster Users" <gluster-users at gluster.org>
> > > > >> >> > Sent: Wednesday, September 21, 2016 4:56:40 PM
> > > > >> >> > Subject: Re: [Gluster-users] 3.8.3 Bitrot signature process
> > > > >> >> >
> > > > >> >> > Hi Kotresh,
> > > > >> >> >
> > > > >> >> >
> > > > >> >> > Writing new file.
> > > > >> >> >
> > > > > >> >> > getfattr -m. -e hex -d /media/disk2/brick2/data/G/test58-bs10M-c100.nul
> > > > >> >> > getfattr: Removing leading '/' from absolute path names
> > > > >> >> > # file: media/disk2/brick2/data/G/test58-bs10M-c100.nul
> > > > >> >> > trusted.bit-rot.version=0x020000000000000057da8b23000b120e
> > > > >> >> > trusted.ec.config=0x0000080501000200
> > > > >> >> > trusted.ec.size=0x000000003e800000
> > > > >> >> > trusted.ec.version=0x0000000000001f400000000000001f40
> > > > >> >> > trusted.gfid=0x6e7c49e6094e443585bff21f99fd8764
> > > > >> >> >
> > > > >> >> >
> > > > >> >> > Running ls -l in brick 2 pid
> > > > >> >> >
> > > > >> >> > ls -l /proc/30162/fd
> > > > >> >> >
> > > > > >> >> > lr-x------ 1 root root 64 Sep 21 16:22 59 -> /media/disk2/brick2/.glusterfs/quanrantine
> > > > > >> >> > lrwx------ 1 root root 64 Sep 21 16:22 6 -> /var/lib/glusterd/vols/glsvol1/run/10.1.2.2-media-disk2-brick2.pid
> > > > > >> >> > lr-x------ 1 root root 64 Sep 21 16:25 60 -> /media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435-85bf-f21f99fd8764
> > > > > >> >> > lr-x------ 1 root root 64 Sep 21 16:22 61 -> /media/disk2/brick2/.glusterfs/quanrantine
> > > > > >> >> >
> > > > > >> >> >
> > > > > >> >> > find /media/disk2/ -samefile /media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435-85bf-f21f99fd8764
> > > > > >> >> > /media/disk2/brick2/.glusterfs/6e/7c/6e7c49e6-094e-4435-85bf-f21f99fd8764
> > > > > >> >> > /media/disk2/brick2/data/G/test58-bs10M-c100.nul
> > > > >> >> >
> > > > >> >> >
> > > > >> >> >
> > > > >> >> > On Wed, Sep 21, 2016 at 3:28 PM, Kotresh Hiremath
> Ravishankar <
> > > > >> >> > khiremat at redhat.com> wrote:
> > > > >> >> >
> > > > >> >> > > Hi Amudhan,
> > > > >> >> > >
> > > > > >> >> > > Don't grep for the filename; glusterfs maintains a hardlink in the
> > > > > >> >> > > .glusterfs directory for each file. Just check
> > > > > >> >> > > 'ls -l /proc/<respective brick pid>/fd' for any fds opened on a
> > > > > >> >> > > file in .glusterfs and check whether it's the same file.
> > > > >> >> > >
> > > > >> >> > > Thanks and Regards,
> > > > >> >> > > Kotresh H R
> > > > >> >> > >
> > > > >> >> > > ----- Original Message -----
> > > > >> >> > > > From: "Amudhan P" <amudhan83 at gmail.com>
> > > > >> >> > > > To: "Kotresh Hiremath Ravishankar" <khiremat at redhat.com>
> > > > >> >> > > > Cc: "Gluster Users" <gluster-users at gluster.org>
> > > > >> >> > > > Sent: Wednesday, September 21, 2016 1:33:10 PM
> > > > >> >> > > > Subject: Re: [Gluster-users] 3.8.3 Bitrot signature
> process
> > > > >> >> > > >
> > > > >> >> > > > Hi Kotresh,
> > > > >> >> > > >
> > > > > >> >> > > > I used the command below to check for any open fd on the file:
> > > > > >> >> > > >
> > > > > >> >> > > > "ls -l /proc/*/fd | grep filename".
> > > > > >> >> > > >
> > > > > >> >> > > > As soon as the write completes there are no open fds. If there is
> > > > > >> >> > > > any alternative option, please let me know and I will try that too.
> > > > > >> >> > > >
> > > > > >> >> > > >
> > > > > >> >> > > > Also, below is the scrub status in my test setup. The number of
> > > > > >> >> > > > skipped files is slowly reducing day by day. I think files are
> > > > > >> >> > > > skipped because the bitrot signature process has not completed yet.
> > > > > >> >> > > >
> > > > > >> >> > > > Where can I see the files skipped by scrub?
> > > > >> >> > > >
> > > > >> >> > > >
> > > > >> >> > > > Volume name : glsvol1
> > > > >> >> > > >
> > > > >> >> > > > State of scrub: Active (Idle)
> > > > >> >> > > >
> > > > >> >> > > > Scrub impact: normal
> > > > >> >> > > >
> > > > >> >> > > > Scrub frequency: daily
> > > > >> >> > > >
> > > > >> >> > > > Bitrot error log location: /var/log/glusterfs/bitd.log
> > > > >> >> > > >
> > > > >> >> > > > Scrubber error log location: /var/log/glusterfs/scrub.log
> > > > >> >> > > >
> > > > >> >> > > >
> > > > >> >> > > > ==============================
> ===========================
> > > > >> >> > > >
> > > > >> >> > > > Node: localhost
> > > > >> >> > > >
> > > > >> >> > > > Number of Scrubbed files: 1644
> > > > >> >> > > >
> > > > >> >> > > > Number of Skipped files: 1001
> > > > >> >> > > >
> > > > >> >> > > > Last completed scrub time: 2016-09-20 11:59:58
> > > > >> >> > > >
> > > > >> >> > > > Duration of last scrub (D:M:H:M:S): 0:0:39:26
> > > > >> >> > > >
> > > > >> >> > > > Error count: 0
> > > > >> >> > > >
> > > > >> >> > > >
> > > > >> >> > > > ==============================
> ===========================
> > > > >> >> > > >
> > > > >> >> > > > Node: 10.1.2.3
> > > > >> >> > > >
> > > > >> >> > > > Number of Scrubbed files: 1644
> > > > >> >> > > >
> > > > >> >> > > > Number of Skipped files: 1001
> > > > >> >> > > >
> > > > >> >> > > > Last completed scrub time: 2016-09-20 10:50:00
> > > > >> >> > > >
> > > > >> >> > > > Duration of last scrub (D:M:H:M:S): 0:0:38:17
> > > > >> >> > > >
> > > > >> >> > > > Error count: 0
> > > > >> >> > > >
> > > > >> >> > > >
> > > > >> >> > > > ==============================
> ===========================
> > > > >> >> > > >
> > > > >> >> > > > Node: 10.1.2.4
> > > > >> >> > > >
> > > > >> >> > > > Number of Scrubbed files: 981
> > > > >> >> > > >
> > > > >> >> > > > Number of Skipped files: 1664
> > > > >> >> > > >
> > > > >> >> > > > Last completed scrub time: 2016-09-20 12:38:01
> > > > >> >> > > >
> > > > >> >> > > > Duration of last scrub (D:M:H:M:S): 0:0:35:19
> > > > >> >> > > >
> > > > >> >> > > > Error count: 0
> > > > >> >> > > >
> > > > >> >> > > >
> > > > >> >> > > > ==============================
> ===========================
> > > > >> >> > > >
> > > > >> >> > > > Node: 10.1.2.1
> > > > >> >> > > >
> > > > >> >> > > > Number of Scrubbed files: 1263
> > > > >> >> > > >
> > > > >> >> > > > Number of Skipped files: 1382
> > > > >> >> > > >
> > > > >> >> > > > Last completed scrub time: 2016-09-20 11:57:21
> > > > >> >> > > >
> > > > >> >> > > > Duration of last scrub (D:M:H:M:S): 0:0:37:17
> > > > >> >> > > >
> > > > >> >> > > > Error count: 0
> > > > >> >> > > >
> > > > >> >> > > >
> > > > >> >> > > > ==============================
> ===========================
> > > > >> >> > > >
> > > > >> >> > > > Node: 10.1.2.2
> > > > >> >> > > >
> > > > >> >> > > > Number of Scrubbed files: 1644
> > > > >> >> > > >
> > > > >> >> > > > Number of Skipped files: 1001
> > > > >> >> > > >
> > > > >> >> > > > Last completed scrub time: 2016-09-20 11:59:25
> > > > >> >> > > >
> > > > >> >> > > > Duration of last scrub (D:M:H:M:S): 0:0:39:18
> > > > >> >> > > >
> > > > >> >> > > > Error count: 0
> > > > >> >> > > >
> > > > >> >> > > > ==============================
> ===========================
> > > > >> >> > > >
> > > > >> >> > > >
> > > > >> >> > > >
> > > > >> >> > > >
> > > > >> >> > > > Thanks
> > > > >> >> > > > Amudhan
> > > > >> >> > > >
> > > > >> >> > > >
> > > > >> >> > > > On Wed, Sep 21, 2016 at 11:45 AM, Kotresh Hiremath
> > > Ravishankar <
> > > > >> >> > > > khiremat at redhat.com> wrote:
> > > > >> >> > > >
> > > > >> >> > > > > Hi Amudhan,
> > > > >> >> > > > >
> > > > > >> >> > > > > I don't think it's a limitation in reading data from the brick.
> > > > > >> >> > > > > To limit CPU usage, throttling is done using a token bucket
> > > > > >> >> > > > > algorithm. The log message you showed is related to it. But even
> > > > > >> >> > > > > then, I think it should not take 12 minutes for checksum
> > > > > >> >> > > > > calculation unless there is an fd open (it might be internal).
> > > > > >> >> > > > > Could you please cross-verify whether any fds are open on that
> > > > > >> >> > > > > file by looking into /proc? I will also test it out in the
> > > > > >> >> > > > > meantime and get back to you.
> > > > >> >> > > > >
> > > > >> >> > > > > Thanks and Regards,
> > > > >> >> > > > > Kotresh H R
> > > > >> >> > > > >
> > > > >> >> > > > > ----- Original Message -----
> > > > >> >> > > > > > From: "Amudhan P" <amudhan83 at gmail.com>
> > > > >> >> > > > > > To: "Kotresh Hiremath Ravishankar" <
> khiremat at redhat.com>
> > > > >> >> > > > > > Cc: "Gluster Users" <gluster-users at gluster.org>
> > > > >> >> > > > > > Sent: Tuesday, September 20, 2016 3:19:28 PM
> > > > >> >> > > > > > Subject: Re: [Gluster-users] 3.8.3 Bitrot signature
> > > process
> > > > >> >> > > > > >
> > > > >> >> > > > > > Hi Kotresh,
> > > > >> >> > > > > >
> > > > > >> >> > > > > > Please correct me if I am wrong: once a file write completes and
> > > > > >> >> > > > > > its fds are closed, bitrot waits for 120 seconds, then starts
> > > > > >> >> > > > > > hashing and updates the signature for the file in the brick.
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > But my feeling is that bitrot takes too much time to complete
> > > > > >> >> > > > > > hashing.
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > Below is a test result I would like to share.
> > > > >> >> > > > > >
> > > > >> >> > > > > >
> > > > > >> >> > > > > > Writing data in the path below using dd:
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > /mnt/gluster/data/G (mount point)
> > > > > >> >> > > > > > -rw-r--r-- 1 root root  10M Sep 20 12:19 test53-bs10M-c1.nul
> > > > > >> >> > > > > > -rw-r--r-- 1 root root 100M Sep 20 12:19 test54-bs10M-c10.nul
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > No other write or read process is going on.
> > > > > >> >> > > > > >
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > Checking the file data in one of the bricks:
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > -rw-r--r-- 2 root root 2.5M Sep 20 12:23 test53-bs10M-c1.nul
> > > > > >> >> > > > > > -rw-r--r-- 2 root root  25M Sep 20 12:23 test54-bs10M-c10.nul
> > > > >> >> > > > > >
> > > > > >> >> > > > > > The file's stat and getfattr info from the brick, after the write process completed:
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ stat test53-bs10M-c1.nul
> > > > > >> >> > > > > >   File: ‘test53-bs10M-c1.nul’
> > > > > >> >> > > > > >   Size: 2621440         Blocks: 5120       IO Block: 4096   regular file
> > > > > >> >> > > > > > Device: 821h/2081d      Inode: 536874168   Links: 2
> > > > > >> >> > > > > > Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
> > > > > >> >> > > > > > Access: 2016-09-20 12:23:28.798886647 +0530
> > > > > >> >> > > > > > Modify: 2016-09-20 12:23:28.994886646 +0530
> > > > > >> >> > > > > > Change: 2016-09-20 12:23:28.998886646 +0530
> > > > > >> >> > > > > >  Birth: -
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ stat test54-bs10M-c10.nul
> > > > > >> >> > > > > >   File: ‘test54-bs10M-c10.nul’
> > > > > >> >> > > > > >   Size: 26214400        Blocks: 51200      IO Block: 4096   regular file
> > > > > >> >> > > > > > Device: 821h/2081d      Inode: 536874169   Links: 2
> > > > > >> >> > > > > > Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
> > > > > >> >> > > > > > Access: 2016-09-20 12:23:42.902886624 +0530
> > > > > >> >> > > > > > Modify: 2016-09-20 12:23:44.378886622 +0530
> > > > > >> >> > > > > > Change: 2016-09-20 12:23:44.378886622 +0530
> > > > > >> >> > > > > >  Birth: -
> > > > >> >> > > > > >
> > > > > >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ sudo getfattr -m. -e hex -d test53-bs10M-c1.nul
> > > > > >> >> > > > > > # file: test53-bs10M-c1.nul
> > > > > >> >> > > > > > trusted.bit-rot.version=0x020000000000000057daa7b50002e5b4
> > > > > >> >> > > > > > trusted.ec.config=0x0000080501000200
> > > > > >> >> > > > > > trusted.ec.size=0x0000000000a00000
> > > > > >> >> > > > > > trusted.ec.version=0x00000000000000500000000000000050
> > > > > >> >> > > > > > trusted.gfid=0xe2416bd1aae4403c88f44286273bbe99
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ sudo getfattr -m. -e hex -d test54-bs10M-c10.nul
> > > > > >> >> > > > > > # file: test54-bs10M-c10.nul
> > > > > >> >> > > > > > trusted.bit-rot.version=0x020000000000000057daa7b50002e5b4
> > > > > >> >> > > > > > trusted.ec.config=0x0000080501000200
> > > > > >> >> > > > > > trusted.ec.size=0x0000000006400000
> > > > > >> >> > > > > > trusted.ec.version=0x00000000000003200000000000000320
> > > > > >> >> > > > > > trusted.gfid=0x54e018dd8c5a4bd79e0317729d8a57c5
> > > > >> >> > > > > >
> > > > >> >> > > > > >
> > > > >> >> > > > > >
> > > > > >> >> > > > > > The file's stat and getfattr info from the brick, after the bitrot signature was updated:
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ stat test53-bs10M-c1.nul
> > > > > >> >> > > > > >   File: ‘test53-bs10M-c1.nul’
> > > > > >> >> > > > > >   Size: 2621440         Blocks: 5120       IO Block: 4096   regular file
> > > > > >> >> > > > > > Device: 821h/2081d      Inode: 536874168   Links: 2
> > > > > >> >> > > > > > Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
> > > > > >> >> > > > > > Access: 2016-09-20 12:25:31.494886450 +0530
> > > > > >> >> > > > > > Modify: 2016-09-20 12:23:28.994886646 +0530
> > > > > >> >> > > > > > Change: 2016-09-20 12:27:00.994886307 +0530
> > > > > >> >> > > > > >  Birth: -
> > > > > >> >> > > > > >
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ sudo getfattr -m. -e hex -d test53-bs10M-c1.nul
> > > > > >> >> > > > > > # file: test53-bs10M-c1.nul
> > > > > >> >> > > > > > trusted.bit-rot.signature=0x0102000000000000006de7493c5c90f643357c268fbaaf461c1567e0334e4948023ce17268403aa37a
> > > > > >> >> > > > > > trusted.bit-rot.version=0x020000000000000057daa7b50002e5b4
> > > > > >> >> > > > > > trusted.ec.config=0x0000080501000200
> > > > > >> >> > > > > > trusted.ec.size=0x0000000000a00000
> > > > > >> >> > > > > > trusted.ec.version=0x00000000000000500000000000000050
> > > > > >> >> > > > > > trusted.gfid=0xe2416bd1aae4403c88f44286273bbe99
> > > > >> >> > > > > >
> > > > >> >> > > > > >
> > > > > >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ stat test54-bs10M-c10.nul
> > > > > >> >> > > > > >   File: ‘test54-bs10M-c10.nul’
> > > > > >> >> > > > > >   Size: 26214400        Blocks: 51200      IO Block: 4096   regular file
> > > > > >> >> > > > > > Device: 821h/2081d      Inode: 536874169   Links: 2
> > > > > >> >> > > > > > Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
> > > > > >> >> > > > > > Access: 2016-09-20 12:25:47.510886425 +0530
> > > > > >> >> > > > > > Modify: 2016-09-20 12:23:44.378886622 +0530
> > > > > >> >> > > > > > Change: 2016-09-20 12:38:05.954885243 +0530
> > > > > >> >> > > > > >  Birth: -
> > > > > >> >> > > > > >
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ sudo getfattr -m. -e hex -d test54-bs10M-c10.nul
> > > > > >> >> > > > > > # file: test54-bs10M-c10.nul
> > > > > >> >> > > > > > trusted.bit-rot.signature=0x010200000000000000394c345f0b0c63ee652627a62eed069244d35c4d5134e4f07d4eabb51afda47e
> > > > > >> >> > > > > > trusted.bit-rot.version=0x020000000000000057daa7b50002e5b4
> > > > > >> >> > > > > > trusted.ec.config=0x0000080501000200
> > > > > >> >> > > > > > trusted.ec.size=0x0000000006400000
> > > > > >> >> > > > > > trusted.ec.version=0x00000000000003200000000000000320
> > > > > >> >> > > > > > trusted.gfid=0x54e018dd8c5a4bd79e0317729d8a57c5
> > > > >> >> > > > > >
> > > > >> >> > > > > >
> > > > > >> >> > > > > > (Actual time taken for reading the file from the brick for md5sum)
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ time md5sum test53-bs10M-c1.nul
> > > > > >> >> > > > > > 8354dcaa18a1ecb52d0895bf00888c44  test53-bs10M-c1.nul
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > real    0m0.045s
> > > > > >> >> > > > > > user    0m0.007s
> > > > > >> >> > > > > > sys     0m0.003s
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > gfstst-node5:/media/disk2/brick2/data/G$ time md5sum test54-bs10M-c10.nul
> > > > > >> >> > > > > > bed3c0a4a1407f584989b4009e9ce33f  test54-bs10M-c10.nul
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > real    0m0.166s
> > > > > >> >> > > > > > user    0m0.062s
> > > > > >> >> > > > > > sys     0m0.011s
> > > > >> >> > > > > >
> > > > > >> >> > > > > > As you can see, the 'test54-bs10M-c10.nul' file took around 12
> > > > > >> >> > > > > > minutes to update its bitrot signature (please refer to the stat
> > > > > >> >> > > > > > output for the file).
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > What would be the cause of such a slow read? Is there any
> > > > > >> >> > > > > > limitation in reading data from the brick?
> > > > > >> >> > > > > >
> > > > > >> >> > > > > > Also, I am seeing this line in bitd.log. What does it mean?
> > > > > >> >> > > > > > [bit-rot.c:1784:br_rate_limit_signer] 0-glsvol1-bit-rot-0: [Rate Limit Info] "tokens/sec (rate): 131072, maxlimit: 524288
> > > > >> >> > > > > >
> > > > >> >> > > > > >
> > > > >> >> > > > > > Thanks
> > > > >> >> > > > > > Amudhan P
> > > > >> >> > > > > >
> > > > >> >> > > > > >
> > > > >> >> > > > > >
> > > > >> >> > > > > > On Mon, Sep 19, 2016 at 1:00 PM, Kotresh Hiremath
> > > > Ravishankar <
> > > > >> >> > > > > > khiremat at redhat.com> wrote:
> > > > >> >> > > > > >
> > > > >> >> > > > > > > Hi Amudhan,
> > > > >> >> > > > > > >
> > > > > >> >> > > > > > > Thanks for testing out the bitrot feature, and sorry for the
> > > > > >> >> > > > > > > delayed response. Please find the answers inline.
> > > > >> >> > > > > > >
> > > > >> >> > > > > > > Thanks and Regards,
> > > > >> >> > > > > > > Kotresh H R
> > > > >> >> > > > > > >
> > > > >> >> > > > > > > ----- Original Message -----
> > > > >> >> > > > > > > > From: "Amudhan P" <amudhan83 at gmail.com>
> > > > >> >> > > > > > > > To: "Gluster Users" <gluster-users at gluster.org>
> > > > >> >> > > > > > > > Sent: Friday, September 16, 2016 4:14:10 PM
> > > > >> >> > > > > > > > Subject: Re: [Gluster-users] 3.8.3 Bitrot
> signature
> > > > process
> > > > >> >> > > > > > > >
> > > > >> >> > > > > > > > Hi,
> > > > >> >> > > > > > > >
> > > > >> >> > > > > > > > Can anyone reply to this mail.
> > > > >> >> > > > > > > >
> > > > >> >> > > > > > > > On Tue, Sep 13, 2016 at 12:49 PM, Amudhan P <
> > > > >> >> > > amudhan83 at gmail.com >
> > > > >> >> > > > > > > wrote:
> > > > >> >> > > > > > > >
> > > > >> >> > > > > > > >
> > > > >> >> > > > > > > >
> > > > >> >> > > > > > > > Hi,
> > > > >> >> > > > > > > >
> > > > > >> >> > > > > > > > I am testing the bitrot feature in Gluster 3.8.3 with a disperse
> > > > > >> >> > > > > > > > (EC) 4+1 volume.
> > > > > >> >> > > > > > > >
> > > > > >> >> > > > > > > > When I write a single small file (< 10MB), I can see the bitrot
> > > > > >> >> > > > > > > > signature in the bricks for the file after 2 seconds, but when I
> > > > > >> >> > > > > > > > write multiple files of different sizes (> 10MB), it takes a long
> > > > > >> >> > > > > > > > time (> 24hrs) to see the bitrot signature on all the files.
> > > > >> >> > > > > > >
> > > > > >> >> > > > > > >    The default timeout for signing to happen is 120 seconds, so
> > > > > >> >> > > > > > >    signing will happen 120 secs after the last fd gets closed on
> > > > > >> >> > > > > > >    that file. If the file is being written continuously, it will
> > > > > >> >> > > > > > >    not be signed until 120 secs after its last fd is closed.
> > > > >> >> > > > > > > >
> > > > >> >> > > > > > > > My questions are.
> > > > > >> >> > > > > > > > 1. I have enabled the scrub schedule as hourly and throttle as
> > > > > >> >> > > > > > > > normal. Does this have any impact in delaying the bitrot signature?
> > > > > >> >> > > > > > >       No.
> > > > > >> >> > > > > > > > 2. Other than "bitd.log", where else can I watch the current
> > > > > >> >> > > > > > > > status of bitrot, like the number of files added for signing and
> > > > > >> >> > > > > > > > the file status?
> > > > > >> >> > > > > > >      Signing will happen 120 sec after the last fd closure, as said
> > > > > >> >> > > > > > >      above. There is no status command which tracks the signing of
> > > > > >> >> > > > > > >      the files, but there is a bitrot status command which tracks
> > > > > >> >> > > > > > >      the number of files scrubbed.
> > > > >> >> > > > > > >
> > > > >> >> > > > > > >      #gluster vol bitrot <volname> scrub status
> > > > >> >> > > > > > >
> > > > >> >> > > > > > >
> > > > > >> >> > > > > > > > 3. Where can I confirm that all the files in the brick are bitrot
> > > > > >> >> > > > > > > > signed?
> > > > > >> >> > > > > > >
> > > > > >> >> > > > > > >      As said, signing information for all the files is not tracked.
> > > > > >> >> > > > > > >
> > > > > >> >> > > > > > > > 4. Is there any file read size limit in bitrot?
> > > > > >> >> > > > > > >
> > > > > >> >> > > > > > >      I didn't get this. Could you please elaborate?
> > > > > >> >> > > > > > >
> > > > > >> >> > > > > > > > 5. Are there options for tuning bitrot for faster signing of files?
> > > > > >> >> > > > > > >
> > > > > >> >> > > > > > >      The bitrot feature is mainly meant to detect silent corruption
> > > > > >> >> > > > > > >      (bitflips) of files due to long-term storage. Hence, by default,
> > > > > >> >> > > > > > >      signing happens 120 sec after the last fd closure. There is a
> > > > > >> >> > > > > > >      tunable which can change the default 120 sec, but that's only
> > > > > >> >> > > > > > >      for testing purposes and we don't recommend it.
> > > > > >> >> > > > > > >
> > > > > >> >> > > > > > >       gluster vol get master features.expiry-time
> > > > > >> >> > > > > > >
> > > > > >> >> > > > > > >      For testing purposes, you can change this default and test.
> > > and
> > > > >> >> test.
> > > > >> >> > > > > > > >
> > > > >> >> > > > > > > > Thanks
> > > > >> >> > > > > > > > Amudhan
> > > > >> >> > > > > > > >
> > > > >> >> > > > > > > >
> > > > >> >> > > > > > > > _______________________________________________
> > > > >> >> > > > > > > > Gluster-users mailing list
> > > > >> >> > > > > > > > Gluster-users at gluster.org
> > > > > >> >> > > > > > > > http://www.gluster.org/mailman/listinfo/gluster-users
> > > > >> >> > > > > > >
> > > > >> >> > > > > >
> > > > >> >> > > > >
> > > > >> >> > > >
> > > > >> >> > >
> > > > >> >> >
> > > > >> >>
> > > > >> >
> > > > >> >
> > > > >>
> > > > >
> > > >
> > >
> >
>
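
The token bucket throttling mentioned in the thread above can be illustrated with a small sketch. This is not the GlusterFS implementation; it is a generic token bucket on a simulated clock, fed with the rate (131072 tokens/sec) and maxlimit (524288) values from the quoted bitd.log line. Treating one token as one byte of data to hash, and the 131072-byte chunk size, are assumptions made purely for illustration. The class and function names are hypothetical.

```python
class TokenBucket:
    """Generic token bucket: refills at `rate` tokens/sec up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.now = 0.0          # simulated clock in seconds (no real sleeping)

    def consume(self, amount):
        """Take `amount` tokens, advancing the simulated clock while
        waiting for the bucket to refill if it is short."""
        if amount > self.capacity:
            raise ValueError("request larger than bucket capacity")
        if self.tokens < amount:
            wait = (amount - self.tokens) / self.rate
            self.now += wait
            self.tokens = amount  # refilled exactly enough during the wait
        self.tokens -= amount


def throttled_seconds(total_bytes, chunk, rate, capacity):
    """Simulated time to push total_bytes through the bucket, assuming
    one token per byte (an assumption, not confirmed by the thread)."""
    bucket = TokenBucket(rate, capacity)
    remaining = total_bytes
    while remaining > 0:
        n = min(chunk, remaining)
        bucket.consume(n)
        remaining -= n
    return bucket.now


if __name__ == "__main__":
    # 26214400 bytes is the brick-side size of test54-bs10M-c10.nul
    print(throttled_seconds(26214400, 131072, 131072, 524288))
```

Under these assumptions the sketch reports roughly 196 seconds for the 26214400-byte fragment. Whether that reflects what bitd actually does depends on how tokens map to data, which the thread does not state.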