[Gluster-devel] Rebalance improvement design

Benjamin Turner bennyturns at gmail.com
Mon May 4 15:28:13 UTC 2015


I see:

#define GF_DECIDE_DEFRAG_THROTTLE_COUNT(throttle_count, conf) {        \
                                                                        \
                throttle_count = MAX ((get_nprocs() - 4), 4);           \
                                                                        \
                if (!strcmp (conf->dthrottle, "lazy"))                  \
                        conf->defrag->rthcount = 1;                     \
                                                                        \
                if (!strcmp (conf->dthrottle, "normal"))                \
                        conf->defrag->rthcount = (throttle_count / 2);  \
                                                                        \
                if (!strcmp (conf->dthrottle, "aggressive"))            \
                        conf->defrag->rthcount = throttle_count;        \
        }

So aggressive will give us the default of (20 + 16), normal is that divided
by 2, and lazy is 1, is that correct?  If so, that is what I was looking to
see.  The only other thing I can think of here is making the tunable a
number like event threads, but I like this.  I don't know if I saw it
documented, but if it's not we should note this in the help output.
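
Something along these lines is what I had in mind for a numeric setting
(just a sketch; the helper name, bounds, and fallback behaviour are made up
and not from the patch):

#include <stdlib.h>
#include <string.h>

/* Sketch only: accept "lazy"/"normal"/"aggressive" or a plain thread
 * count (capped at the computed maximum), the way event-threads takes a
 * number. */
static int
defrag_parse_throttle (const char *value, int max_threads)
{
        char *end = NULL;
        long  n   = 0;

        if (!strcmp (value, "lazy"))
                return 1;
        if (!strcmp (value, "normal"))
                return max_threads / 2;
        if (!strcmp (value, "aggressive"))
                return max_threads;

        /* Otherwise treat the value as an explicit thread count. */
        n = strtol (value, &end, 10);
        if (end == value || *end != '\0' || n < 1)
                return -1;              /* invalid setting */

        return (n > (long)max_threads) ? max_threads : (int)n;
}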

Also to note, the old run time was 98500.00 seconds and the new one is
55088.00 seconds; (98500 - 55088) / 98500 works out to roughly a 44% improvement!
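
Just to check that I follow the crawl -> global queue -> migrator-thread
model you describe below: is it roughly the pattern in this sketch? (Rough
sketch only; every name here is invented for illustration and this is not
the actual dht rebalance code.)

#include <pthread.h>
#include <stdlib.h>

struct file_entry {
        char              *path;
        struct file_entry *next;
};

static struct file_entry *queue_head;
static pthread_mutex_t    queue_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t     queue_cond = PTHREAD_COND_INITIALIZER;

/* Crawler side: pushes each discovered file onto the global queue. */
static void
enqueue_file (struct file_entry *e)
{
        pthread_mutex_lock (&queue_lock);
        e->next = queue_head;
        queue_head = e;
        pthread_cond_signal (&queue_cond);
        pthread_mutex_unlock (&queue_lock);
}

/* Migrator thread: drains the queue; the throttle setting only decides
 * how many of these are spawned. */
static void *
migrator_thread (void *arg)
{
        (void) arg;

        for (;;) {
                struct file_entry *e = NULL;

                pthread_mutex_lock (&queue_lock);
                while (queue_head == NULL)
                        pthread_cond_wait (&queue_cond, &queue_lock);
                e = queue_head;
                queue_head = e->next;
                pthread_mutex_unlock (&queue_lock);

                /* migrate_file (e);  placeholder for the actual migration */
                free (e->path);
                free (e);
        }

        return NULL;
}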

-b


On Mon, May 4, 2015 at 9:06 AM, Susant Palai <spalai at redhat.com> wrote:

> Ben,
>     On no. of threads:
>      Sent the throttle patch here: http://review.gluster.org/#/c/10526/ to
> limit thread numbers [not yet merged]. The rebalance process in the current
> model spawns 20 threads, and in addition to that there will be a maximum of
> 16 syncop threads.
>
>     Crash:
>      The crash should be fixed by this:
> http://review.gluster.org/#/c/10459/.
>
>      The rebalance time taken is a function of the number of files and
> their size. The higher the frequency at which files get added to the global
> queue [on which the migrator threads act], the faster the rebalance will
> be. I guess here we are mostly seeing the effect of the local crawl, as
> only 81GB is migrated out of 500GB.
>
> Thanks,
> Susant
>
> ----- Original Message -----
> > From: "Benjamin Turner" <bennyturns at gmail.com>
> > To: "Vijay Bellur" <vbellur at redhat.com>
> > Cc: "Gluster Devel" <gluster-devel at gluster.org>
> > Sent: Monday, May 4, 2015 5:18:13 PM
> > Subject: Re: [Gluster-devel] Rebalance improvement design
> >
> > Thanks Vijay! I forgot to upgrade the kernel (thinp 6.6 perf bug, gah)
> > before I created this data set, so it's a bit smaller:
> >
> > total threads = 16
> > total files = 7,060,700 (64 kb files, 100 files per dir)
> > total data = 430.951 GB
> > 88.26% of requested files processed, minimum is 70.00
> > 10101.355737 sec elapsed time
> > 698.985382 files/sec
> > 698.985382 IOPS
> > 43.686586 MB/sec
> >
> > I updated everything and ran the rebalance on
> > glusterfs-3.8dev-0.107.git275f724.el6.x86_64.:
> >
> > [root at gqas001 ~]# gluster v rebalance testvol status
> >                               Node  Rebalanced-files        size     scanned  failures  skipped      status  run time in secs
> >                          ---------       -----------  ----------  ----------  --------  -------  ----------  ----------------
> >                          localhost           1327346      81.0GB     3999140         0        0   completed          55088.00
> > gqas013.sbu.lab.eng.bos.redhat.com                 0      0Bytes           1         0        0   completed          26070.00
> > gqas011.sbu.lab.eng.bos.redhat.com                 0      0Bytes           0         0        0      failed              0.00
> > gqas014.sbu.lab.eng.bos.redhat.com                 0      0Bytes           0         0        0      failed              0.00
> > gqas016.sbu.lab.eng.bos.redhat.com           1325857      80.9GB     4000865         0        0   completed          55088.00
> > gqas015.sbu.lab.eng.bos.redhat.com                 0      0Bytes           0         0        0      failed              0.00
> > volume rebalance: testvol: success:
> >
> >
> > A couple observations:
> >
> > I am seeing lots of threads / processes running:
> >
> > [root at gqas001 ~]# ps -eLf | grep glu | wc -l
> > 96 <- 96 gluster threads
> > [root at gqas001 ~]# ps -eLf | grep rebal | wc -l
> > 36 <- 36 rebal threads.
> >
> > Is this tunable? Is there a use case where we would need to limit this?
> > Just curious, how did we arrive at 36 rebal threads?
> >
> > # cat /var/log/glusterfs/testvol-rebalance.log | wc -l
> > 4,577,583
> > [root at gqas001 ~]# ll /var/log/glusterfs/testvol-rebalance.log -h
> > -rw------- 1 root root 1.6G May 3 12:29
> > /var/log/glusterfs/testvol-rebalance.log
> >
> > :) How big is this going to get when I do the 10-20 TB? I'll keep tabs on
> > this; my default test setup only has:
> >
> > [root at gqas001 ~]# df -h
> > Filesystem                        Size  Used  Avail  Use%  Mounted on
> > /dev/mapper/vg_gqas001-lv_root     50G  4.8G    42G   11%  /
> > tmpfs                              24G     0    24G    0%  /dev/shm
> > /dev/sda1                         477M   65M   387M   15%  /boot
> > /dev/mapper/vg_gqas001-lv_home    385G   71M   366G    1%  /home
> > /dev/mapper/gluster_vg-lv_bricks  9.5T  219G   9.3T    3%  /bricks
> >
> > Next run I want to fill up a 10TB cluster and double the # of bricks to
> > simulate running out of space and then doubling capacity. Any other fixes
> > or changes that need to go in before I try a larger data set? Before that
> > I may run my performance regression suite against a system while a rebal
> > is in progress and check how it affects performance. I'll turn both these
> > cases into perf regression tests that I run with iozone, smallfile, and
> > such; any other use cases I should add? Should I add hard / soft links /
> > whatever else to the data set?
> >
> > -b
> >
> >
> > On Sun, May 3, 2015 at 11:48 AM, Vijay Bellur <vbellur at redhat.com> wrote:
> >
> >
> > On 05/01/2015 10:23 AM, Benjamin Turner wrote:
> >
> >
> > Ok, I have all my data created and I just started the rebalance. One
> > thing to note: in the client log I see the following spamming:
> >
> > [root at gqac006 ~]# cat /var/log/glusterfs/gluster-mount-.log | wc -l
> > 394042
> >
> > [2015-05-01 00:47:55.591150] I [MSGID: 109036]
> > [dht-common.c:6478:dht_log_new_layout_for_dir_selfheal] 0-testvol-dht:
> > Setting layout of
> > /file_dstdir/gqac006.sbu.lab.eng.bos.redhat.com/thrd_05/d_001/d_000/d_004/d_006
> > with [Subvol_name: testvol-replicate-0, Err: -1 , Start: 0 , Stop:
> > 2141429669 ], [Subvol_name: testvol-replicate-1, Err: -1 , Start:
> > 2141429670 , Stop: 4294967295 ],
> > [2015-05-01 00:47:55.596147] I
> > [dht-selfheal.c:1587:dht_selfheal_layout_new_directory] 0-testvol-dht:
> > chunk size = 0xffffffff / 19920276 = 0xd7
> > [2015-05-01 00:47:55.596177] I
> > [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-testvol-dht:
> > assigning range size 0x7fa39fa6 to testvol-replicate-1
> >
> >
> > I also noticed the same set of excessive logs in my tests. I have sent
> > across a patch [1] to address this problem.
> >
> > -Vijay
> >
> > [1] http://review.gluster.org/10281
> >
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel at gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
> >
>