[Gluster-devel] Rebalance improvement design
Benjamin Turner
bennyturns at gmail.com
Sun May 3 15:43:30 UTC 2015
Current run is segfault-free (yay!) so far:
[root at gqas001 ~]# gluster v rebalance testvol status
Node                                Rebalanced-files        size      scanned  failures  skipped       status  run time in secs
---------                                -----------  ----------  -----------  --------  -------  -----------  ----------------
localhost                                     893493      54.5GB      2692286         0        0  in progress          38643.00
gqas013.sbu.lab.eng.bos.redhat.com                 0      0Bytes            1         0        0    completed          26070.00
gqas011.sbu.lab.eng.bos.redhat.com                 0      0Bytes            0         0        0       failed              0.00
gqas014.sbu.lab.eng.bos.redhat.com                 0      0Bytes            0         0        0       failed              0.00
gqas016.sbu.lab.eng.bos.redhat.com            892110      54.4GB      2692295         0        0  in progress          38643.00
gqas015.sbu.lab.eng.bos.redhat.com                 0      0Bytes            0         0        0       failed              0.00
volume rebalance: testvol: success:
The baseline ran for 98,500.00 seconds. This one is at 38,643.00 (about
1/3 the number of seconds) with 54 GB transferred so far. The same data
set transferred 81 GB last run, so at 54 GB we are 66% of the way there.
By my estimate we should run for ~10,000-20,000 more seconds, which
would give us a 40-50% improvement! Let's see how it finishes out :)
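
Back-of-envelope on that estimate, assuming the transfer rate stays
constant (inputs are just the numbers above):

#include <stdio.h>

int main(void)
{
    double baseline = 98500.0; /* seconds, previous run            */
    double elapsed  = 38643.0; /* seconds so far, this run         */
    double done_gb  = 54.0;    /* transferred so far               */
    double total_gb = 81.0;    /* what the last run moved in total */

    double rate      = done_gb / elapsed;           /* ~0.0014 GB/s */
    double remaining = (total_gb - done_gb) / rate; /* ~19,300 s    */
    double projected = elapsed + remaining;         /* ~58,000 s    */

    printf("remaining ~%.0fs, projected ~%.0fs, improvement ~%.0f%%\n",
           remaining, projected, 100.0 * (1.0 - projected / baseline));
    return 0;
}
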
Any idea why I am getting the "failed" status for three of them? It has
been consistent across each run I have tried.
-b
On Fri, May 1, 2015 at 3:05 AM, Ravishankar N <ravishankar at redhat.com>
wrote:
> I sent a fix <http://review.gluster.org/#/c/10478/> but abandoned it
> since Susant (CC'ed) has already sent one
> http://review.gluster.org/#/c/10459/
> I think it needs re-submission, but more review-eyes are welcome.
> -Ravi
>
>
> On 05/01/2015 12:18 PM, Benjamin Turner wrote:
>
> There was a segfault on gqas001, have a look when you get a sec:
>
> Core was generated by `/usr/sbin/glusterfs -s localhost --volfile-id rebalance/testvol --xlator-option'.
> Program terminated with signal 11, Segmentation fault.
> #0  gf_defrag_get_entry (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbbfd0, migrate_data=0x7f2707874be8) at dht-rebalance.c:2032
> 2032        GF_FREE (tmp_container->parent_loc);
> (gdb) bt
> #0  gf_defrag_get_entry (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbbfd0, migrate_data=0x7f2707874be8) at dht-rebalance.c:2032
> #1  gf_defrag_process_dir (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbbfd0, migrate_data=0x7f2707874be8) at dht-rebalance.c:2207
> #2  0x00007f26fdae1eb8 in gf_defrag_fix_layout (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbbfd0, fix_layout=0x7f2707874b5c, migrate_data=0x7f2707874be8) at dht-rebalance.c:2299
> #3  0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbc200, fix_layout=0x7f2707874b5c, migrate_data=0x7f2707874be8) at dht-rebalance.c:2416
> #4  0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbc430, fix_layout=0x7f2707874b5c, migrate_data=0x7f2707874be8) at dht-rebalance.c:2416
> #5  0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbc660, fix_layout=0x7f2707874b5c, migrate_data=0x7f2707874be8) at dht-rebalance.c:2416
> #6  0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbc890, fix_layout=0x7f2707874b5c, migrate_data=0x7f2707874be8) at dht-rebalance.c:2416
> #7  0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbcac0, fix_layout=0x7f2707874b5c, migrate_data=0x7f2707874be8) at dht-rebalance.c:2416
> #8  0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbccf0, fix_layout=0x7f2707874b5c, migrate_data=0x7f2707874be8) at dht-rebalance.c:2416
> #9  0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbcf60, fix_layout=0x7f2707874b5c, migrate_data=0x7f2707874be8) at dht-rebalance.c:2416
> #10 0x00007f26fdae2524 in gf_defrag_start_crawl (data=0x7f26f8011180) at dht-rebalance.c:2599
> #11 0x00007f2709024f62 in synctask_wrap (old_task=<value optimized out>) at syncop.c:375
> #12 0x0000003648c438f0 in ?? () from /lib64/libc-2.12.so
> #13 0x0000000000000000 in ?? ()
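>
> For what it's worth, here is a minimal plain-C sketch of the failure
> pattern I suspect behind a crash on that GF_FREE line (an assumption on
> my part, not the actual dht-rebalance.c code; names are illustrative):
> the cleanup path dereferences a container that was never allocated, or
> frees a parent_loc that was never set.
>
> #include <stdlib.h>
>
> struct container {
>     void *parent_loc;
> };
>
> int get_entry_sketch(void)
> {
>     struct container *tmp_container = NULL; /* NULL keeps cleanup safe */
>     int ret = -1;
>
>     tmp_container = calloc(1, sizeof(*tmp_container)); /* zeroes parent_loc */
>     if (!tmp_container)
>         goto out;                           /* nothing to free yet */
>
>     /* ... populate tmp_container->parent_loc, hand the entry off ... */
>     ret = 0;
> out:
>     if (ret && tmp_container) {             /* guard the dereference */
>         free(tmp_container->parent_loc);    /* free(NULL) is a no-op */
>         free(tmp_container);
>     }
>     return ret;
> }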
>
>
> On Fri, May 1, 2015 at 12:53 AM, Benjamin Turner <bennyturns at gmail.com>
> wrote:
>
>> Ok, I have all my data created and I just started the rebalance. One
>> thing to note: in the client log I see the following spamming:
>>
>> [root at gqac006 ~]# cat /var/log/glusterfs/gluster-mount-.log | wc -l
>> 394042
>>
>> [2015-05-01 00:47:55.591150] I [MSGID: 109036] [dht-common.c:6478:dht_log_new_layout_for_dir_selfheal] 0-testvol-dht: Setting layout of /file_dstdir/gqac006.sbu.lab.eng.bos.redhat.com/thrd_05/d_001/d_000/d_004/d_006 with [Subvol_name: testvol-replicate-0, Err: -1 , Start: 0 , Stop: 2141429669 ], [Subvol_name: testvol-replicate-1, Err: -1 , Start: 2141429670 , Stop: 4294967295 ],
>> [2015-05-01 00:47:55.596147] I [dht-selfheal.c:1587:dht_selfheal_layout_new_directory] 0-testvol-dht: chunk size = 0xffffffff / 19920276 = 0xd7
>> [2015-05-01 00:47:55.596177] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-testvol-dht: assigning range size 0x7fa39fa6 to testvol-replicate-1
>> [2015-05-01 00:47:55.596189] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-testvol-dht: assigning range size 0x7fa39fa6 to testvol-replicate-0
>> [2015-05-01 00:47:55.597081] I [MSGID: 109036] [dht-common.c:6478:dht_log_new_layout_for_dir_selfheal] 0-testvol-dht: Setting layout of /file_dstdir/gqac006.sbu.lab.eng.bos.redhat.com/thrd_05/d_001/d_000/d_004/d_005 with [Subvol_name: testvol-replicate-0, Err: -1 , Start: 2141429670 , Stop: 4294967295 ], [Subvol_name: testvol-replicate-1, Err: -1 , Start: 0 , Stop: 2141429669 ],
>> [2015-05-01 00:47:55.601853] I [dht-selfheal.c:1587:dht_selfheal_layout_new_directory] 0-testvol-dht: chunk size = 0xffffffff / 19920276 = 0xd7
>> [2015-05-01 00:47:55.601882] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-testvol-dht: assigning range size 0x7fa39fa6 to testvol-replicate-1
>> [2015-05-01 00:47:55.601895] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-testvol-dht: assigning range size 0x7fa39fa6 to testvol-replicate-0
>>
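>> My reading of the arithmetic in those lines, as a stand-alone C sketch
>> (the split into two equal subvol weights is my assumption; the real
>> logic lives in dht_selfheal_layout_new_directory):
>>
>> #include <stdio.h>
>>
>> int main(void)
>> {
>>     /* values taken straight from the log above */
>>     unsigned long long total_weight  = 19920276ULL;
>>     unsigned long long subvol_weight = total_weight / 2; /* assumed: 2 equal replica sets */
>>
>>     unsigned long long chunk = 0xffffffffULL / total_weight; /* 0xd7 = 215 */
>>     unsigned long long range = chunk * subvol_weight;        /* 0x7fa39fa6 */
>>
>>     printf("chunk = 0x%llx, range = 0x%llx\n", chunk, range);
>>     return 0;
>> }
>>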
>> Just to confirm: the patch is in
>> glusterfs-3.8dev-0.71.gita7f8482.el6.x86_64, correct?
>>
>> Here is the info on the data set:
>>
>> hosts in test : ['gqac006.sbu.lab.eng.bos.redhat.com', 'gqas003.sbu.lab.eng.bos.redhat.com']
>> top test directory(s) : ['/gluster-mount']
>> operation : create
>> files/thread : 500000
>> threads : 8
>> record size (KB, 0 = maximum) : 0
>> file size (KB) : 64
>> file size distribution : fixed
>> files per dir : 100
>> dirs per dir : 10
>> total threads = 16
>> total files = 7222600
>> total data = 440.833 GB
>> 90.28% of requested files processed, minimum is 70.00
>> 8107.852862 sec elapsed time
>> 890.815377 files/sec
>> 890.815377 IOPS
>> 55.675961 MB/sec
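>>
>> (Side note: the 90.28% line checks out if the denominator is
>> threads x files/thread; a quick sanity check under that assumption:)
>>
>> #include <stdio.h>
>>
>> int main(void)
>> {
>>     long requested = 16L * 500000L; /* total threads * files/thread */
>>     long processed = 7222600L;      /* "total files" reported above */
>>     printf("%.2f%%\n", 100.0 * processed / requested); /* 90.28% */
>>     return 0;
>> }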
>>
>> Here is the rebalance run after about 5 or so minutes:
>>
>> [root at gqas001 ~]# gluster v rebalance testvol status
>> Node                                Rebalanced-files        size      scanned  failures  skipped       status  run time in secs
>> ---------                                -----------  ----------  -----------  --------  -------  -----------  ----------------
>> localhost                                      32203       2.0GB       120858         0     5184  in progress           1294.00
>> gqas011.sbu.lab.eng.bos.redhat.com                 0      0Bytes            0         0        0       failed              0.00
>> gqas016.sbu.lab.eng.bos.redhat.com              9364     585.2MB        53121         0        0  in progress           1294.00
>> gqas013.sbu.lab.eng.bos.redhat.com                 0      0Bytes        14750         0        0  in progress           1294.00
>> gqas014.sbu.lab.eng.bos.redhat.com                 0      0Bytes            0         0        0       failed              0.00
>> gqas015.sbu.lab.eng.bos.redhat.com                 0      0Bytes       196382         0        0  in progress           1294.00
>> volume rebalance: testvol: success:
>>
>> The hostnames are there if you want to poke around. I had a problem
>> with one of the added systems being on a different version of glusterfs,
>> so I had to update everything to glusterfs-3.8dev-0.99.git7d7b80e.el6.x86_64,
>> remove the bricks I just added, and add them back. Something may have gone
>> wrong in that process, but I thought I did everything correctly. I'll start
>> fresh tomorrow. I figured I'd let this run overnight.
>>
>> -b
>>
>>
>>
>>
>> On Wed, Apr 29, 2015 at 9:48 PM, Benjamin Turner <bennyturns at gmail.com>
>> wrote:
>>
>>> Sweet! Here is the baseline:
>>>
>>> [root at gqas001 ~]# gluster v rebalance testvol status
>>> Node                                Rebalanced-files        size      scanned  failures  skipped     status  run time in secs
>>> ---------                                -----------  ----------  -----------  --------  -------  ---------  ----------------
>>> localhost                                    1328575      81.1GB      9402953         0        0  completed          98500.00
>>> gqas012.sbu.lab.eng.bos.redhat.com                 0      0Bytes      8000011         0        0  completed          51982.00
>>> gqas003.sbu.lab.eng.bos.redhat.com                 0      0Bytes      8000011         0        0  completed          51982.00
>>> gqas004.sbu.lab.eng.bos.redhat.com           1326290      81.0GB      9708625         0        0  completed          98500.00
>>> gqas013.sbu.lab.eng.bos.redhat.com                 0      0Bytes      8000011         0        0  completed          51982.00
>>> gqas014.sbu.lab.eng.bos.redhat.com                 0      0Bytes      8000011         0        0  completed          51982.00
>>> volume rebalance: testvol: success:
>>>
>>> I'll have a run on the patch started tomorrow.
>>>
>>> -b
>>>
>>> On Wed, Apr 29, 2015 at 12:51 PM, Nithya Balachandran
>>> <nbalacha at redhat.com> wrote:
>>>
>>>>
>>>> Doh my mistake, I thought it was merged. I was just running with the
>>>> upstream 3.7 daily. Can I use this run as my baseline and then I can
>>>> run next time on the patch to show the % improvement? I'll wipe
>>>> everything and try on the patch, any idea when it will be merged?
>>>>
>>>> Yes, it would be very useful to have this run as the baseline. The
>>>> patch has just been merged in master. It should be backported to 3.7 in a
>>>> day or so.
>>>>
>>>> Regards,
>>>> Nithya
>>>>
>>>>
>>>> > > > >
>>>> > > > > >
>>>> > > > > > On Wed, Apr 22, 2015 at 1:10 AM, Nithya Balachandran
>>>> > > > > > <nbalacha at redhat.com>
>>>> > > > > > wrote:
>>>> > > > > >
>>>> > > > > > > That sounds great. Thanks.
>>>> > > > > > >
>>>> > > > > > > Regards,
>>>> > > > > > > Nithya
>>>> > > > > > >
>>>> > > > > > > ----- Original Message -----
>>>> > > > > > > From: "Benjamin Turner" <bennyturns at gmail.com>
>>>> > > > > > > To: "Nithya Balachandran" <nbalacha at redhat.com>
>>>> > > > > > > Cc: "Susant Palai" <spalai at redhat.com>, "Gluster Devel" <
>>>> > > > > > > gluster-devel at gluster.org>
>>>> > > > > > > Sent: Wednesday, 22 April, 2015 12:14:14 AM
>>>> > > > > > > Subject: Re: [Gluster-devel] Rebalance improvement design
>>>> > > > > > >
>>>> > > > > > > I am setting up a test env now, I'll have some feedback for you
>>>> > > > > > > this week.
>>>> > > > > > >
>>>> > > > > > > -b
>>>> > > > > > >
>>>> > > > > > On Tue, Apr 21, 2015 at 11:36 AM, Nithya Balachandran
>>>> > > > > > <nbalacha at redhat.com> wrote:
>>>> > > > > > >
>>>> > > > > > > > Hi Ben,
>>>> > > > > > > >
>>>> > > > > > > > Did you get a chance to try this out?
>>>> > > > > > > >
>>>> > > > > > > > Regards,
>>>> > > > > > > > Nithya
>>>> > > > > > > >
>>>> > > > > > > > ----- Original Message -----
>>>> > > > > > > > From: "Susant Palai" <spalai at redhat.com>
>>>> > > > > > > > To: "Benjamin Turner" <bennyturns at gmail.com>
>>>> > > > > > > > Cc: "Gluster Devel" <gluster-devel at gluster.org>
>>>> > > > > > > > Sent: Monday, April 13, 2015 9:55:07 AM
>>>> > > > > > > > Subject: Re: [Gluster-devel] Rebalance improvement design
>>>> > > > > > > >
>>>> > > > > > > > Hi Ben,
>>>> > > > > > > > Uploaded a new patch here: http://review.gluster.org/#/c/9657/.
>>>> > > > > > > > We can start perf test on it. :)
>>>> > > > > > > >
>>>> > > > > > > > Susant
>>>> > > > > > > >
>>>> > > > > > > > ----- Original Message -----
>>>> > > > > > > > From: "Susant Palai" <spalai at redhat.com>
>>>> > > > > > > > To: "Benjamin Turner" <bennyturns at gmail.com>
>>>> > > > > > > > Cc: "Gluster Devel" <gluster-devel at gluster.org>
>>>> > > > > > > > Sent: Thursday, 9 April, 2015 3:40:09 PM
>>>> > > > > > > > Subject: Re: [Gluster-devel] Rebalance improvement design
>>>> > > > > > > >
>>>> > > > > > > > Thanks Ben. RPM is not available and I am planning to refresh
>>>> > > > > > > > the patch in two days with some more regression fixes. I think
>>>> > > > > > > > we can run the tests post that. Any larger data-set will be
>>>> > > > > > > > good (say 3 to 5 TB).
>>>> > > > > > > >
>>>> > > > > > > > Thanks,
>>>> > > > > > > > Susant
>>>> > > > > > > >
>>>> > > > > > > > ----- Original Message -----
>>>> > > > > > > > From: "Benjamin Turner" <bennyturns at gmail.com>
>>>> > > > > > > > To: "Vijay Bellur" <vbellur at redhat.com>
>>>> > > > > > > > Cc: "Susant Palai" <spalai at redhat.com>, "Gluster Devel" <
>>>> > > > > > > > gluster-devel at gluster.org>
>>>> > > > > > > > Sent: Thursday, 9 April, 2015 2:10:30 AM
>>>> > > > > > > > Subject: Re: [Gluster-devel] Rebalance improvement design
>>>> > > > > > > >
>>>> > > > > > > >
>>>> > > > > > > > I have some rebalance perf regression stuff I have been working
>>>> > > > > > > > on. Is there an RPM with these patches anywhere so that I can
>>>> > > > > > > > try it on my systems? If not I'll just build from:
>>>> > > > > > > >
>>>> > > > > > > >
>>>> > > > > > > > git fetch git://review.gluster.org/glusterfs refs/changes/57/9657/8 &&
>>>> > > > > > > > git cherry-pick FETCH_HEAD
>>>> > > > > > > >
>>>> > > > > > > >
>>>> > > > > > > >
>>>> > > > > > > > I will have _at_least_ 10TB of storage; how many TBs of data
>>>> > > > > > > > should I run with?
>>>> > > > > > > >
>>>> > > > > > > >
>>>> > > > > > > > -b
>>>> > > > > > > >
>>>> > > > > > > >
>>>> > > > > > > > On Tue, Apr 7, 2015 at 9:07 AM, Vijay Bellur
>>>> > > > > > > > <vbellur at redhat.com> wrote:
>>>> > > > > > > >
>>>> > > > > > > >
>>>> > > > > > > >
>>>> > > > > > > >
>>>> > > > > > > > On 04/07/2015 03:08 PM, Susant Palai wrote:
>>>> > > > > > > >
>>>> > > > > > > >
>>>> > > > > > > > Here is one test performed on a 300GB data set; around a 100%
>>>> > > > > > > > improvement (half the run time) was seen.
>>>> > > > > > > >
>>>> > > > > > > > [root at gprfs031 ~]# gluster v i
>>>> > > > > > > >
>>>> > > > > > > > Volume Name: rbperf
>>>> > > > > > > > Type: Distribute
>>>> > > > > > > > Volume ID: 35562662-337e-4923-b862-d0bbb0748003
>>>> > > > > > > > Status: Started
>>>> > > > > > > > Number of Bricks: 4
>>>> > > > > > > > Transport-type: tcp
>>>> > > > > > > > Bricks:
>>>> > > > > > > > Brick1: gprfs029-10ge:/bricks/gprfs029/brick1
>>>> > > > > > > > Brick2: gprfs030-10ge:/bricks/gprfs030/brick1
>>>> > > > > > > > Brick3: gprfs031-10ge:/bricks/gprfs031/brick1
>>>> > > > > > > > Brick4: gprfs032-10ge:/bricks/gprfs032/brick1
>>>> > > > > > > >
>>>> > > > > > > >
>>>> > > > > > > > Added server 32 and started rebalance force.
>>>> > > > > > > >
>>>> > > > > > > > Rebalance stat for new changes:
>>>> > > > > > > > [root at gprfs031 ~]# gluster v rebalance rbperf status
>>>> > > > > > > > Node           Rebalanced-files    size  scanned  failures  skipped     status  run time in secs
>>>> > > > > > > > localhost                 74639  36.1GB   297319         0        0  completed           1743.00
>>>> > > > > > > > 172.17.40.30              67512  33.5GB   269187         0        0  completed           1395.00
>>>> > > > > > > > gprfs029-10ge             79095  38.8GB   284105         0        0  completed           1559.00
>>>> > > > > > > > gprfs032-10ge                 0  0Bytes        0         0        0  completed            402.00
>>>> > > > > > > > volume rebalance: rbperf: success:
>>>> > > > > > > >
>>>> > > > > > > > Rebalance stat for old model:
>>>> > > > > > > > [root at gprfs031 ~]# gluster v rebalance rbperf status
>>>> > > > > > > > Node           Rebalanced-files    size  scanned  failures  skipped     status  run time in secs
>>>> > > > > > > > localhost                 86493  42.0GB   634302         0        0  completed           3329.00
>>>> > > > > > > > gprfs029-10ge             94115  46.2GB   687852         0        0  completed           3328.00
>>>> > > > > > > > gprfs030-10ge             74314  35.9GB   651943         0        0  completed           3072.00
>>>> > > > > > > > gprfs032-10ge                 0  0Bytes   594166         0        0  completed           1943.00
>>>> > > > > > > > volume rebalance: rbperf: success:
>>>> > > > > > > >
>>>> > > > > > > >
>>>> > > > > > > > This is interesting. Thanks for sharing & well done! Maybe we
>>>> > > > > > > > should attempt a much larger data set and see how we fare
>>>> > > > > > > > there :).
>>>> > > > > > > >
>>>> > > > > > > > Regards,
>>>> > > > > > > >
>>>> > > > > > > >
>>>> > > > > > > > Vijay
>>>> > > > > > > >