[Gluster-devel] Rebalance improvement design

Ravishankar N ravishankar at redhat.com
Fri May 1 07:05:47 UTC 2015


I sent a fix <http://review.gluster.org/#/c/10478/> but abandoned it
since Susant (CC'ed) had already sent one:
http://review.gluster.org/#/c/10459/
I think it needs to be resubmitted; more eyes on the review are welcome.
-Ravi

On 05/01/2015 12:18 PM, Benjamin Turner wrote:
> There was a segfault on gqas001, have a look when you get a sec:
>
> Core was generated by `/usr/sbin/glusterfs -s localhost --volfile-id 
> rebalance/testvol --xlator-option'.
> Program terminated with signal 11, Segmentation fault.
> #0  gf_defrag_get_entry (this=0x7f26f8011180, defrag=0x7f26f8031ef0, 
> loc=0x7f26f4dbbfd0, migrate_data=0x7f2707874be8) at dht-rebalance.c:2032
> 2032           GF_FREE (tmp_container->parent_loc);
> (gdb) bt
> #0  gf_defrag_get_entry (this=0x7f26f8011180, defrag=0x7f26f8031ef0, 
> loc=0x7f26f4dbbfd0, migrate_data=0x7f2707874be8) at dht-rebalance.c:2032
> #1  gf_defrag_process_dir (this=0x7f26f8011180, defrag=0x7f26f8031ef0, 
> loc=0x7f26f4dbbfd0, migrate_data=0x7f2707874be8) at dht-rebalance.c:2207
> #2  0x00007f26fdae1eb8 in gf_defrag_fix_layout (this=0x7f26f8011180, 
> defrag=0x7f26f8031ef0, loc=0x7f26f4dbbfd0, fix_layout=0x7f2707874b5c, 
> migrate_data=0x7f2707874be8)
>     at dht-rebalance.c:2299
> #3  0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, 
> defrag=0x7f26f8031ef0, loc=0x7f26f4dbc200, fix_layout=0x7f2707874b5c, 
> migrate_data=0x7f2707874be8)
>     at dht-rebalance.c:2416
> #4  0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, 
> defrag=0x7f26f8031ef0, loc=0x7f26f4dbc430, fix_layout=0x7f2707874b5c, 
> migrate_data=0x7f2707874be8)
>     at dht-rebalance.c:2416
> #5  0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, 
> defrag=0x7f26f8031ef0, loc=0x7f26f4dbc660, fix_layout=0x7f2707874b5c, 
> migrate_data=0x7f2707874be8)
>     at dht-rebalance.c:2416
> #6  0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, 
> defrag=0x7f26f8031ef0, loc=0x7f26f4dbc890, fix_layout=0x7f2707874b5c, 
> migrate_data=0x7f2707874be8)
>     at dht-rebalance.c:2416
> #7  0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, 
> defrag=0x7f26f8031ef0, loc=0x7f26f4dbcac0, fix_layout=0x7f2707874b5c, 
> migrate_data=0x7f2707874be8)
>     at dht-rebalance.c:2416
> #8  0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, 
> defrag=0x7f26f8031ef0, loc=0x7f26f4dbccf0, fix_layout=0x7f2707874b5c, 
> migrate_data=0x7f2707874be8)
>     at dht-rebalance.c:2416
> #9  0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, 
> defrag=0x7f26f8031ef0, loc=0x7f26f4dbcf60, fix_layout=0x7f2707874b5c, 
> migrate_data=0x7f2707874be8)
>     at dht-rebalance.c:2416
> #10 0x00007f26fdae2524 in gf_defrag_start_crawl (data=0x7f26f8011180) 
> at dht-rebalance.c:2599
> #11 0x00007f2709024f62 in synctask_wrap (old_task=<value optimized 
> out>) at syncop.c:375
> #12 0x0000003648c438f0 in ?? () from /lib64/libc-2.12.so
> #13 0x0000000000000000 in ?? ()
>
>
> On Fri, May 1, 2015 at 12:53 AM, Benjamin Turner <bennyturns at gmail.com>
> wrote:
>
>     Ok I have all my data created and I just started the rebalance.
>     One thing to note: in the client log I see the following spamming:
>
>     [root at gqac006 ~]# cat /var/log/glusterfs/gluster-mount-.log | wc -l
>     394042
>
>     [2015-05-01 00:47:55.591150] I [MSGID: 109036]
>     [dht-common.c:6478:dht_log_new_layout_for_dir_selfheal]
>     0-testvol-dht: Setting layout of
>     /file_dstdir/gqac006.sbu.lab.eng.bos.redhat.com/thrd_05/d_001/d_000/d_004/d_006
>     with [Subvol_name: testvol-replicate-0, Err: -1 , Start: 0 , Stop:
>     2141429669 ], [Subvol_name: testvol-replicate-1, Err: -1 , Start:
>     2141429670 , Stop: 4294967295 ],
>     [2015-05-01 00:47:55.596147] I
>     [dht-selfheal.c:1587:dht_selfheal_layout_new_directory]
>     0-testvol-dht: chunk size = 0xffffffff / 19920276 = 0xd7
>     [2015-05-01 00:47:55.596177] I
>     [dht-selfheal.c:1626:dht_selfheal_layout_new_directory]
>     0-testvol-dht: assigning range size 0x7fa39fa6 to testvol-replicate-1
>     [2015-05-01 00:47:55.596189] I
>     [dht-selfheal.c:1626:dht_selfheal_layout_new_directory]
>     0-testvol-dht: assigning range size 0x7fa39fa6 to testvol-replicate-0
>     [2015-05-01 00:47:55.597081] I [MSGID: 109036]
>     [dht-common.c:6478:dht_log_new_layout_for_dir_selfheal]
>     0-testvol-dht: Setting layout of
>     /file_dstdir/gqac006.sbu.lab.eng.bos.redhat.com/thrd_05/d_001/d_000/d_004/d_005
>     with [Subvol_name: testvol-replicate-0, Err: -1 , Start:
>     2141429670 , Stop: 4294967295 ], [Subvol_name:
>     testvol-replicate-1, Err: -1 , Start: 0 , Stop: 2141429669 ],
>     [2015-05-01 00:47:55.601853] I
>     [dht-selfheal.c:1587:dht_selfheal_layout_new_directory]
>     0-testvol-dht: chunk size = 0xffffffff / 19920276 = 0xd7
>     [2015-05-01 00:47:55.601882] I
>     [dht-selfheal.c:1626:dht_selfheal_layout_new_directory]
>     0-testvol-dht: assigning range size 0x7fa39fa6 to testvol-replicate-1
>     [2015-05-01 00:47:55.601895] I
>     [dht-selfheal.c:1626:dht_selfheal_layout_new_directory]
>     0-testvol-dht: assigning range size 0x7fa39fa6 to testvol-replicate-0
>
>     Just to confirm, the patch is in
>     glusterfs-3.8dev-0.71.gita7f8482.el6.x86_64, correct?
>
>     Here is the info on the data set:
>
>     hosts in test : ['gqac006.sbu.lab.eng.bos.redhat.com',
>     'gqas003.sbu.lab.eng.bos.redhat.com']
>     top test directory(s) : ['/gluster-mount']
>     operation : create
>     files/thread : 500000
>     threads : 8
>     record size (KB, 0 = maximum) : 0
>     file size (KB) : 64
>     file size distribution : fixed
>     files per dir : 100
>     dirs per dir : 10
>     total threads = 16
>     total files = 7222600
>     total data =   440.833 GB
>      90.28% of requested files processed, minimum is  70.00
>     8107.852862 sec elapsed time
>     890.815377 files/sec
>     890.815377 IOPS
>     55.675961 MB/sec
>
>     Here is the rebalance run after about 5 or so minutes:
>
>     [root at gqas001 ~]# gluster v rebalance testvol status
>     Node                                Rebalanced-files      size   scanned  failures  skipped       status  run time in secs
>     ----------------------------------  ----------------  --------  --------  --------  -------  -----------  ----------------
>     localhost                                      32203     2.0GB    120858         0     5184  in progress           1294.00
>     gqas011.sbu.lab.eng.bos.redhat.com                 0    0Bytes         0         0        0       failed              0.00
>     gqas016.sbu.lab.eng.bos.redhat.com              9364   585.2MB     53121         0        0  in progress           1294.00
>     gqas013.sbu.lab.eng.bos.redhat.com                 0    0Bytes     14750         0        0  in progress           1294.00
>     gqas014.sbu.lab.eng.bos.redhat.com                 0    0Bytes         0         0        0       failed              0.00
>     gqas015.sbu.lab.eng.bos.redhat.com                 0    0Bytes    196382         0        0  in progress           1294.00
>     volume rebalance: testvol: success:
>
>     The hostnames are there if you want to poke around. I had a
>     problem with one of the added systems being on a different version
>     of glusterfs so I had to update everything to
>     glusterfs-3.8dev-0.99.git7d7b80e.el6.x86_64, remove the bricks I
>     just added, and add them back.  Something may have gone wrong in
>     that process but I thought I did everything correctly.  I'll start
>     fresh tomorrow.  I figured I'd let this run overnight.
>
>     -b
>
>
>
>
>     On Wed, Apr 29, 2015 at 9:48 PM, Benjamin Turner
>     <bennyturns at gmail.com> wrote:
>
>         Sweet!  Here is the baseline:
>
>         [root at gqas001 ~]# gluster v rebalance testvol status
>         Node                                Rebalanced-files      size   scanned  failures  skipped       status  run time in secs
>         ----------------------------------  ----------------  --------  --------  --------  -------  -----------  ----------------
>         localhost                                    1328575    81.1GB   9402953         0        0    completed          98500.00
>         gqas012.sbu.lab.eng.bos.redhat.com                 0    0Bytes   8000011         0        0    completed          51982.00
>         gqas003.sbu.lab.eng.bos.redhat.com                 0    0Bytes   8000011         0        0    completed          51982.00
>         gqas004.sbu.lab.eng.bos.redhat.com           1326290    81.0GB   9708625         0        0    completed          98500.00
>         gqas013.sbu.lab.eng.bos.redhat.com                 0    0Bytes   8000011         0        0    completed          51982.00
>         gqas014.sbu.lab.eng.bos.redhat.com                 0    0Bytes   8000011         0        0    completed          51982.00
>         volume rebalance: testvol: success:
>
>         I'll have a run on the patch started tomorrow.
>
>         -b
>
>         On Wed, Apr 29, 2015 at 12:51 PM, Nithya Balachandran
>         <nbalacha at redhat.com> wrote:
>
>
>             Doh my mistake, I thought it was merged.  I was just
>             running with the
>             upstream 3.7 daily.  Can I use this run as my baseline and
>             then I can run
>             next time on the patch to show the % improvement?  I'll
>             wipe everything and
>             try on the patch, any idea when it will be merged?
>
>             Yes, it would be very useful to have this run as the
>             baseline. The patch has just been merged in master. It
>             should be backported to 3.7 in a day or so.
>
>             Regards,
>             Nithya
>
>
>             > > > >
>             > > > > >
>             > > > > > On Wed, Apr 22, 2015 at 1:10 AM, Nithya Balachandran
>             > > > > > <nbalacha at redhat.com> wrote:
>             > > > > >
>             > > > > > > That sounds great. Thanks.
>             > > > > > >
>             > > > > > > Regards,
>             > > > > > > Nithya
>             > > > > > >
>             > > > > > > ----- Original Message -----
>             > > > > > > From: "Benjamin Turner" <bennyturns at gmail.com>
>             > > > > > > To: "Nithya Balachandran" <nbalacha at redhat.com>
>             > > > > > > Cc: "Susant Palai" <spalai at redhat.com>, "Gluster Devel"
>             > > > > > > <gluster-devel at gluster.org>
>             > > > > > > Sent: Wednesday, 22 April, 2015 12:14:14 AM
>             > > > > > > Subject: Re: [Gluster-devel] Rebalance improvement design
>             > > > > > >
>             > > > > > > I am setting up a test env now, I'll have some feedback for you
>             > > > > > > this week.
>             > > > > > >
>             > > > > > > -b
>             > > > > > >
>             > > > > > > On Tue, Apr 21, 2015 at 11:36 AM, Nithya Balachandran
>             > > > > > > <nbalacha at redhat.com> wrote:
>             > > > > > >
>             > > > > > > > Hi Ben,
>             > > > > > > >
>             > > > > > > > Did you get a chance to try this out?
>             > > > > > > >
>             > > > > > > > Regards,
>             > > > > > > > Nithya
>             > > > > > > >
>             > > > > > > > ----- Original Message -----
>             > > > > > > > From: "Susant Palai" <spalai at redhat.com>
>             > > > > > > > To: "Benjamin Turner" <bennyturns at gmail.com>
>             > > > > > > > Cc: "Gluster Devel" <gluster-devel at gluster.org>
>             > > > > > > > Sent: Monday, April 13, 2015 9:55:07 AM
>             > > > > > > > Subject: Re: [Gluster-devel] Rebalance improvement design
>             > > > > > > >
>             > > > > > > > Hi Ben,
>             > > > > > > > Uploaded a new patch here: http://review.gluster.org/#/c/9657/.
>             > > > > > > > We can start perf tests on it. :)
>             > > > > > > >
>             > > > > > > > Susant
>             > > > > > > >
>             > > > > > > > ----- Original Message -----
>             > > > > > > > From: "Susant Palai" <spalai at redhat.com>
>             > > > > > > > To: "Benjamin Turner" <bennyturns at gmail.com>
>             > > > > > > > Cc: "Gluster Devel" <gluster-devel at gluster.org>
>             > > > > > > > Sent: Thursday, 9 April, 2015 3:40:09 PM
>             > > > > > > > Subject: Re: [Gluster-devel] Rebalance improvement design
>             > > > > > > >
>             > > > > > > > Thanks Ben. An RPM is not available, and I am planning to refresh
>             > > > > > > > the patch in two days with some more regression fixes. I think we
>             > > > > > > > can run the tests after that. Any larger data set will be good
>             > > > > > > > (say 3 to 5 TB).
>             > > > > > > >
>             > > > > > > > Thanks,
>             > > > > > > > Susant
>             > > > > > > >
>             > > > > > > > ----- Original Message -----
>             > > > > > > > From: "Benjamin Turner" <bennyturns at gmail.com>
>             > > > > > > > To: "Vijay Bellur" <vbellur at redhat.com>
>             > > > > > > > Cc: "Susant Palai" <spalai at redhat.com>, "Gluster Devel"
>             > > > > > > > <gluster-devel at gluster.org>
>             > > > > > > > Sent: Thursday, 9 April, 2015 2:10:30 AM
>             > > > > > > > Subject: Re: [Gluster-devel] Rebalance improvement design
>             > > > > > > >
>             > > > > > > >
>             > > > > > > > I have some rebalance perf regression stuff I have been working
>             > > > > > > > on. Is there an RPM with these patches anywhere so that I can try
>             > > > > > > > it on my systems? If not I'll just build from:
>             > > > > > > >
>             > > > > > > >
>             > > > > > > > git fetch git://review.gluster.org/glusterfs refs/changes/57/9657/8 &&
>             > > > > > > > git cherry-pick FETCH_HEAD
>             > > > > > > >
>             > > > > > > >
>             > > > > > > >
>             > > > > > > > I will have _at_least_ 10TB of storage, how many TBs of data
>             > > > > > > > should I run with?
>             > > > > > > >
>             > > > > > > >
>             > > > > > > > -b
>             > > > > > > >
>             > > > > > > >
>             > > > > > > > On Tue, Apr 7, 2015 at 9:07 AM, Vijay Bellur <vbellur at redhat.com>
>             > > > > > > > wrote:
>             > > > > > > >
>             > > > > > > >
>             > > > > > > >
>             > > > > > > >
>             > > > > > > > On 04/07/2015 03:08 PM, Susant Palai wrote:
>             > > > > > > >
>             > > > > > > >
>             > > > > > > > Here is one test performed on a 300GB data set; an improvement of
>             > > > > > > > around 100% (1/2 the time) was seen.
>             > > > > > > >
>             > > > > > > > [root at gprfs031 ~]# gluster v i
>             > > > > > > >
>             > > > > > > > Volume Name: rbperf
>             > > > > > > > Type: Distribute
>             > > > > > > > Volume ID: 35562662-337e-4923-b862-d0bbb0748003
>             > > > > > > > Status: Started
>             > > > > > > > Number of Bricks: 4
>             > > > > > > > Transport-type: tcp
>             > > > > > > > Bricks:
>             > > > > > > > Brick1: gprfs029-10ge:/bricks/gprfs029/brick1
>             > > > > > > > Brick2: gprfs030-10ge:/bricks/gprfs030/brick1
>             > > > > > > > Brick3: gprfs031-10ge:/bricks/gprfs031/brick1
>             > > > > > > > Brick4: gprfs032-10ge:/bricks/gprfs032/brick1
>             > > > > > > >
>             > > > > > > >
>             > > > > > > > Added server 32 and started rebalance force.
>             > > > > > > >
>             > > > > > > > Rebalance stat for new changes:
>             > > > > > > > [root at gprfs031 ~]# gluster v rebalance rbperf status
>             > > > > > > > Node           Rebalanced-files    size  scanned  failures  skipped     status  run time in secs
>             > > > > > > > -------------  ----------------  ------  -------  --------  -------  ---------  ----------------
>             > > > > > > > localhost                 74639  36.1GB   297319         0        0  completed           1743.00
>             > > > > > > > 172.17.40.30              67512  33.5GB   269187         0        0  completed           1395.00
>             > > > > > > > gprfs029-10ge             79095  38.8GB   284105         0        0  completed           1559.00
>             > > > > > > > gprfs032-10ge                 0  0Bytes        0         0        0  completed            402.00
>             > > > > > > > volume rebalance: rbperf: success:
>             > > > > > > >
>             > > > > > > > Rebalance stat for old model:
>             > > > > > > > [root at gprfs031 ~]# gluster v rebalance rbperf status
>             > > > > > > > Node           Rebalanced-files    size  scanned  failures  skipped     status  run time in secs
>             > > > > > > > -------------  ----------------  ------  -------  --------  -------  ---------  ----------------
>             > > > > > > > localhost                 86493  42.0GB   634302         0        0  completed           3329.00
>             > > > > > > > gprfs029-10ge             94115  46.2GB   687852         0        0  completed           3328.00
>             > > > > > > > gprfs030-10ge             74314  35.9GB   651943         0        0  completed           3072.00
>             > > > > > > > gprfs032-10ge                 0  0Bytes   594166         0        0  completed           1943.00
>             > > > > > > > volume rebalance: rbperf: success:
>             > > > > > > >
>             > > > > > > >
>             > > > > > > > This is interesting. Thanks for sharing & well done! Maybe we
>             > > > > > > > should attempt a much larger data set and see how we fare there :).
>             > > > > > > >
>             > > > > > > > Regards,
>             > > > > > > >
>             > > > > > > >
>             > > > > > > > Vijay
>             > > > > > > >
>             > > > > > > >
>             > > > > > > > _______________________________________________
>             > > > > > > > Gluster-devel mailing list
>             > > > > > > > Gluster-devel at gluster.org
>             > > > > > > > http://www.gluster.org/mailman/listinfo/gluster-devel
>             > > > > > > >
>             > > > > > > >
>             > > > > > >
>             > > > > >
>             > > > >
>             > > >
>             > >
>             >
>
>
>
>
>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel

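As a sanity check on the dht-selfheal log lines quoted above ("chunk size = 0xffffffff / 19920276 = 0xd7" and "assigning range size 0x7fa39fa6"), the arithmetic can be replayed in a few lines. This is only a sketch: the constants are copied from the log, and the equal split of chunks between the two subvolumes is my assumption from the two identical range sizes, not something taken from the DHT source.

```python
# Values copied from the quoted dht-selfheal log lines.
HASH_SPACE = 0xffffffff      # top of the 32-bit hash range
TOTAL_CHUNKS = 19920276      # divisor printed in the "chunk size" line
LOGGED_RANGE = 0x7fa39fa6    # "assigning range size" for each subvolume

# Integer division reproduces the logged chunk size.
chunk_size = HASH_SPACE // TOTAL_CHUNKS

# Assumption (not from the source code): both subvolumes get half the chunks.
chunks_per_subvol = TOTAL_CHUNKS // 2
range_size = chunk_size * chunks_per_subvol

# Boundaries as they appear in the "Setting layout of ..." message.
first = (0, range_size - 1)
second = (range_size, HASH_SPACE)

print(hex(chunk_size), hex(range_size), first, second)
```

In the logged layout the second range runs all the way to 4294967295, so it is slightly larger than the nominal range size; the sketch only mirrors that observation.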

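The data-set summary quoted above (7222600 files, 440.833 GB, 890.8 files/sec) is internally consistent, which is easy to confirm. A minimal sketch, assuming the tool counts 1 GB as 1024*1024 KB and that "total threads = 16" times "files/thread : 500000" is the requested file count; the variable names are mine.

```python
# Figures copied from the quoted data-set summary.
total_threads = 16
files_per_thread = 500000
file_size_kb = 64
files_done = 7222600
elapsed_sec = 8107.852862

# Assumption: requested files = threads * files per thread.
requested = total_threads * files_per_thread
pct_done = 100.0 * files_done / requested

# Assumption: 1 GB = 1024 * 1024 KB and 1 MB = 1024 KB.
total_gb = files_done * file_size_kb / (1024 * 1024)
files_per_sec = files_done / elapsed_sec
mb_per_sec = files_per_sec * file_size_kb / 1024

print(round(pct_done, 2), round(total_gb, 3),
      round(files_per_sec, 2), round(mb_per_sec, 2))
```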

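For the 300GB test quoted at the bottom of the thread, the "around 100% (1/2 the time)" figure can be checked against the two status tables. The sketch below is hedged: it treats the slowest node's run time as the wall-clock time of the whole rebalance, which is my assumption, and it uses the node names exactly as printed in each table.

```python
# (migrated files, run time in secs) per node, copied from the quoted tables.
new_run = {
    "localhost": (74639, 1743.00),
    "172.17.40.30": (67512, 1395.00),
    "gprfs029-10ge": (79095, 1559.00),
    "gprfs032-10ge": (0, 402.00),
}
old_run = {
    "localhost": (86493, 3329.00),
    "gprfs029-10ge": (94115, 3328.00),
    "gprfs030-10ge": (74314, 3072.00),
    "gprfs032-10ge": (0, 1943.00),
}

def summarize(run):
    """Return (files migrated, wall-clock secs, files per sec) for one run."""
    # Assumption: the rebalance ends when the slowest node finishes.
    wall = max(secs for _, secs in run.values())
    files = sum(count for count, _ in run.values())
    return files, wall, files / wall

new_files, new_wall, new_rate = summarize(new_run)
old_files, old_wall, old_rate = summarize(old_run)

speedup = old_wall / new_wall     # wall-clock ratio, old over new
rate_gain = new_rate / old_rate   # migrated files per second, new over old

print(new_files, old_files, round(speedup, 2), round(rate_gain, 2))
```

On wall-clock time this comes out at roughly 1.9x. The two runs did not migrate the same number of files, so the per-file rate gain is somewhat lower than the wall-clock ratio.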