[Gluster-users] Fwd: disperse heal speed up

Pranith Kumar Karampuri pkarampu at redhat.com
Fri Aug 12 13:07:56 UTC 2016


On Fri, Aug 12, 2016 at 6:36 PM, Serkan Çoban <cobanserkan at gmail.com> wrote:

> I simulated a disk failure. There are 5 top-level directories in
> gluster, and I executed the find command 5 times in parallel from the
> same client.
> 24 hours have passed and 500GB of the 700GB of data has healed, so I
> think it will complete in 36 hours. Before, it took 144 hours.
> What I would like to ask is: can I further increase the parallelism by
> giving sub-folders to separate find commands?
>

Yes, you can. Let me know if you see something you don't like, and we
can work to address it.
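
As an illustration, a minimal sketch of spawning the finds in parallel
from one client (assuming a bash shell and the /mnt/gluster layout you
describe below):

  # one background find per sub-folder; walking the tree and requesting
  # the trusted.ec.heal xattr is what triggers the heals
  for d in /mnt/gluster/*/x*; do
      find "$d" -d -exec getfattr -h -n trusted.ec.heal {} \; &
  done
  wait   # return once every background find has finished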


> Assume below directory structure:
> /mnt/gluster/a/x1
> /mnt/gluster/a/x2
> /mnt/gluster/b/x1
> /mnt/gluster/b/x2
> /mnt/gluster/c/x1
> /mnt/gluster/c/x2
> /mnt/gluster/d/x1
> /mnt/gluster/d/x2
> /mnt/gluster/e/x1
> /mnt/gluster/e/x2
>
> Can I run 10 different find commands from 10 different clients to
> speed up heal performance?
>
> From Client1:
> find /mnt/gluster/a/x1 -d -exec getfattr -h -n trusted.ec.heal {} \;
>
> From Client2:
> find /mnt/gluster/a/x2 -d -exec getfattr -h -n trusted.ec.heal {} \;
> ...
> ...
>
> From Client10:
> find /mnt/gluster/e/x2 -d -exec getfattr -h -n trusted.ec.heal {} \;
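>
> As a rough sketch of what I have in mind (client1..client10 are
> placeholder hostnames here, and /mnt/gluster must be mounted on each
> of them):
>
> # dispatch one sub-folder to each client over ssh
> i=1
> for d in /mnt/gluster/{a,b,c,d,e}/{x1,x2}; do
>     ssh "client$i" "find $d -d -exec getfattr -h -n trusted.ec.heal {} \\;" &
>     i=$((i+1))
> done
> wait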
>
> Serkan
>
> On Thu, Aug 11, 2016 at 11:49 AM, Serkan Çoban <cobanserkan at gmail.com>
> wrote:
> > The heal completed, but I will try this by simulating a disk failure
> > in the cluster and reply to you. Thanks for the help.
> >
> > On Thu, Aug 11, 2016 at 9:52 AM, Pranith Kumar Karampuri
> > <pkarampu at redhat.com> wrote:
> >>
> >>
> >> On Fri, Aug 5, 2016 at 8:37 PM, Serkan Çoban <cobanserkan at gmail.com>
> wrote:
> >>>
> >>> Hi again,
> >>>
> >>> I am seeing the above situation in a production environment now.
> >>> One disk on one of my servers broke. I killed the brick process,
> >>> replaced the disk, mounted it, and then ran gluster v start force.
> >>>
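> >>> In case the exact steps matter, this is roughly what I ran (a
> >>> sketch; the device name /dev/sdX and brick path /bricks/b1 are
> >>> placeholders for my layout):
> >>>
> >>> gluster v status v0          # note the Pid column of the failed brick
> >>> kill <brick-pid>
> >>> mkfs.xfs -f /dev/sdX         # reformat the replacement disk
> >>> mount /dev/sdX /bricks/b1
> >>> gluster v start v0 force     # respawn the killed brick process
> >>>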
> >>> For a 24-hour period after replacing the disk, the entry count shown
> >>> by gluster v heal info (below) kept increasing, up to about 200,000:
> >>>
> >>> gluster v heal v0 info | grep "Number of entries" | grep -v "Number of
> >>> entries: 0"
> >>> Number of entries: 205117
> >>> Number of entries: 205231
> >>> ...
> >>> ...
> >>> ...
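> >>>
> >>> A minimal sketch for watching the overall count over time (it relies
> >>> only on the same heal info command above):
> >>>
> >>> # print the summed entry count every 10 minutes
> >>> while true; do
> >>>     gluster v heal v0 info | awk '/Number of entries:/ {s+=$4} END {print s}'
> >>>     sleep 600
> >>> done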
> >>>
> >>> Over about 72 hours it decreased to 40K, and it is going very
> >>> slowly right now.
> >>> What I am observing is a very, very slow heal speed. There are no
> >>> errors in the brick logs.
> >>> There was 900GB of data on the broken disk, and I see only 200GB
> >>> healed 96 hours after replacing the disk.
> >>> The warnings below appear in glustershd.log, but I think they are
> >>> harmless:
> >>>
> >>> W [ec_combine.c:866:ec_combine_check] 0-v0-disperse-56: Mismatching
> >>> xdata in answers of LOOKUP
> >>> W [ec_common.c:116:ec_check_status] 0-v0-disperse-56: Operation failed
> >>> on some subvolumes (up=FFFFF, mask=FFFFF, remaining=0, good=FFFF7,
> >>> bad=8)
> >>> W [ec_common.c:71:ec_heal_report] 0-v0-disperse-56: Heal failed
> >>> [invalid argument]
> >>>
> >>> I tried turning on performance.client-io-threads, but it did not
> >>> change anything.
> >>> At this rate the 900GB of data will take nearly 8 days to heal. What
> >>> can I do?
> >>
> >>
> >> Sorry for the delay in responding. Do you still have this problem?
> >> You can trigger heals using the following command:
> >>
> >> find <dir-you-are-interested> -d -exec getfattr -h -n trusted.ec.heal {} \;
> >>
> >> If you have 10 top-level directories, maybe you can spawn 10 such
> >> processes.
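> >>
> >> If you want more parallelism inside a single directory tree, the
> >> same trigger can also be fanned out with xargs (a sketch, assuming
> >> GNU find/xargs; -P sets the number of parallel getfattr processes,
> >> one file per process):
> >>
> >> find <dir-you-are-interested> -print0 | xargs -0 -n1 -P8 getfattr -h -n trusted.ec.heal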
> >>
> >>
> >>>
> >>>
> >>> Serkan
> >>>
> >>>
> >>>
> >>> On Fri, Apr 15, 2016 at 1:28 PM, Serkan Çoban <cobanserkan at gmail.com>
> >>> wrote:
> >>> > The 100TB is files newly created while the brick was down. I
> >>> > rethought the situation and realized that in case 1 I reformatted
> >>> > all the bricks, so the write speed limit was 26 disks * 100MB/s.
> >>> > In case 2 I reformatted just one brick, so the write speed was
> >>> > limited to 100MB/s for that disk... I will repeat the tests using
> >>> > one brick in both cases, once with a reformat and once with just
> >>> > killing the brick process...
> >>> > Thanks for the reply.
> >>> >
> >>> > On Fri, Apr 15, 2016 at 9:27 AM, Xavier Hernandez
> >>> > <xhernandez at datalab.es> wrote:
> >>> >> Hi Serkan,
> >>> >>
> >>> >> Sorry for the delay; I've been a bit busy lately.
> >>> >>
> >>> >> On 13/04/16 13:59, Serkan Çoban wrote:
> >>> >>>
> >>> >>> Hi Xavier,
> >>> >>>
> >>> >>> Can you help me with the issue below? How can I increase the
> >>> >>> disperse heal speed?
> >>> >>
> >>> >>
> >>> >> It seems weird. Are there any related messages in the logs?
> >>> >>
> >>> >> In this particular test, does the 100TB consist of modified files,
> >>> >> or of files newly created while the brick was down?
> >>> >>
> >>> >> How many files have been modified?
> >>> >>
> >>> >>> Also, I would be grateful for any detailed documentation about
> >>> >>> disperse heal: why heal happens on a disperse volume, how it is
> >>> >>> triggered, which nodes participate in the heal process, and
> >>> >>> whether there is any client interaction.
> >>> >>
> >>> >>
> >>> >> The heal process is basically the same as the one used for
> >>> >> replicate. There are two ways to trigger a self-heal:
> >>> >>
> >>> >> * when an inconsistency is detected, the client initiates a
> >>> >> background self-heal of the inode
> >>> >>
> >>> >> * the self-heal daemon scans the lists of modified files created
> >>> >> by the index xlator when a modification is made while some node
> >>> >> is down; all these files are self-healed
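> >>> >>
> >>> >> If you want to see how much work the self-heal daemon still has
> >>> >> queued, a minimal sketch (assuming a brick at /bricks/b1; adjust
> >>> >> the path to your layout) is to count the entries the index xlator
> >>> >> keeps on the brick:
> >>> >>
> >>> >> # each entry (except the xattrop-* base file) is a gfid waiting to be healed
> >>> >> ls /bricks/b1/.glusterfs/indices/xattrop | grep -v '^xattrop-' | wc -l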
> >>> >>
> >>> >> Xavi
> >>> >>
> >>> >>
> >>> >>>
> >>> >>> Serkan
> >>> >>>
> >>> >>>
> >>> >>> ---------- Forwarded message ----------
> >>> >>> From: Serkan Çoban <cobanserkan at gmail.com>
> >>> >>> Date: Fri, Apr 8, 2016 at 5:46 PM
> >>> >>> Subject: disperse heal speed up
> >>> >>> To: Gluster Users <gluster-users at gluster.org>
> >>> >>>
> >>> >>>
> >>> >>> Hi,
> >>> >>>
> >>> >>> I am testing the heal speed of a disperse volume, and what I see
> >>> >>> is 5-10MB/s per node.
> >>> >>> I increased disperse.background-heals to 32 and
> >>> >>> disperse.heal-wait-qlength to 256, but still see no difference.
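> >>> >>>
> >>> >>> (For completeness, the commands I used to set these, assuming
> >>> >>> the volume is named v0:
> >>> >>>
> >>> >>> gluster volume set v0 disperse.background-heals 32
> >>> >>> gluster volume set v0 disperse.heal-wait-qlength 256
> >>> >>>
> >>> >>> gluster volume info v0 then lists them under "Options
> >>> >>> Reconfigured".)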
> >>> >>> One thing I noticed is that when I kill a brick process,
> >>> >>> reformat it, and restart it, the heal speed is nearly 20x
> >>> >>> (200MB/s per node).
> >>> >>>
> >>> >>> But when I kill the brick, then write 100TB of data, and start
> >>> >>> the brick afterwards, the heal is slow (5-10MB/s per node).
> >>> >>>
> >>> >>> What is the difference between the two scenarios? Why is one
> >>> >>> heal slow and the other fast? How can I increase the disperse
> >>> >>> heal speed? Should I increase the thread count to 128 or 256? I
> >>> >>> am on a 78x(16+4) disperse volume, and my servers are pretty
> >>> >>> strong (2x14 cores with 512GB RAM; each node has 26x8TB disks).
> >>> >>>
> >>> >>> Gluster version is 3.7.10.
> >>> >>>
> >>> >>> Thanks,
> >>> >>> Serkan
> >>> >>>
> >>> >>
> >>
> >>
> >>
> >>
> >> --
> >> Pranith
>



-- 
Pranith