[Gluster-users] Pausing rebalance
Franco Broi
franco.broi at iongeo.com
Tue Dec 10 06:32:46 UTC 2013
Thanks for clearing that up. I had to wait about 30 minutes for all
rebalancing activity to cease, then I was able to add a new brick.
What does it use to migrate the files? The copy rate was pretty slow
considering both bricks were on the same server; I only saw about
200MB/s. Each brick is a 16-disk ZFS raidz2, and copying with dd I can
get well over 500MB/s.
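
For what it's worth, the dd comparison I have in mind is roughly the
following; the file paths here are only examples, not the exact
commands I ran:

    # sequential write onto one brick, flushing to disk before timing ends
    dd if=/dev/zero of=/data9/gvol/ddtest bs=1M count=10240 conv=fdatasync
    # sequential read of a large existing file back off another brick
    dd if=/data12/gvol/somebigfile of=/dev/null bs=1M
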
On Tue, 2013-12-10 at 11:30 +0530, Kaushal M wrote:
> On Tue, Dec 10, 2013 at 11:09 AM, Franco Broi <franco.broi at iongeo.com> wrote:
> > On Tue, 2013-12-10 at 10:56 +0530, shishir gowda wrote:
> >> Hi Franco,
> >>
> >>
> >> If a file is under migration when a rebalance stop is issued, the
> >> rebalance process exits only after that file's migration has
> >> completed.
> >>
> >> That might be one of the reasons why you saw the "rebalance is in
> >> progress" message while trying to add the brick.
> >
> > The status said it was stopped. I didn't do a top on the machine but are
> > you saying that it was still rebalancing despite saying it had stopped?
> >
>
> The 'stopped' status is a little misleading. The rebalance process
> could have been migrating a large file when the stop command was
> issued, in which case it continues migrating that file and only quits
> once it has finished. During this period the status says 'stopped',
> but the rebalance process is actually still running, which prevents
> other operations from happening. Ideally we would have a 'stopping'
> status to convey the correct meaning, but for now the only way to
> verify that a rebalance has actually stopped is to monitor the
> rebalance process itself: it is a 'glusterfs' process with 'rebalance'
> somewhere in its arguments.
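>
> (As a quick illustrative check, not an official CLI command, you can
> look for that process on each server, e.g.:
>
>     ps aux | grep '[g]lusterfs' | grep rebalance
>
> Once it is gone on all nodes, the stop has fully taken effect and the
> add-brick should go through.)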
>
> >>
> >> Could you please share the average file size in your setup?
> >>
> >
> > Bit hard to say; I just copied some data from our main processing
> > system. The sizes range from very small to tens of gigabytes.
> >
> >>
> >> You could always check the rebalance status command to ensure the
> >> rebalance has indeed completed/stopped before proceeding with the
> >> add-brick. Using add-brick force while a rebalance is ongoing is not
> >> recommended in normal scenarios. I do see that in your case the
> >> statuses show stopped/completed; the glusterd logs would help in
> >> triaging the issue.
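> >>
> >> For example, with the names from your setup, the check before the
> >> add-brick would just be (a sketch of the sequence, not new syntax):
> >>
> >>     gluster volume rebalance test-volume status
> >>     # only once every node reports completed/stopped and no rebalance
> >>     # process is left running:
> >>     gluster volume add-brick test-volume nas4-10g:/data14/gvol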
> >
> > See attached.
> >
> >>
> >>
> >> Rebalance re-writes layouts and migrates data. If an add-brick is
> >> done while this is happening, the cluster might end up in an
> >> imbalanced state; hence the check for an in-progress rebalance when
> >> doing an add-brick.
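> >>
> >> Once the add-brick has succeeded, the usual follow-up (sketched here
> >> with your volume name) is either a layout fix alone, or a full
> >> rebalance, which rewrites layouts and also migrates existing data
> >> onto the new brick:
> >>
> >>     gluster volume rebalance test-volume fix-layout start
> >>     gluster volume rebalance test-volume start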
> >
> > I can see that but as far as I could tell, the rebalance had stopped
> > according to the status.
> >
> > Just to be clear, what command restarts the rebalancing?
> >
> >>
> >>
> >> With regards,
> >> Shishir
> >>
> >>
> >>
> >> On 10 December 2013 10:39, Franco Broi <franco.broi at iongeo.com> wrote:
> >>
> >> Before attempting a rebalance on my existing distributed Gluster
> >> volume I thought I'd do some testing with my new storage. I created
> >> a volume consisting of 4 bricks on the same server and wrote some
> >> data to it. I then added a new brick from another server. I ran the
> >> fix-layout, wrote some new files and could see them on the new
> >> brick. All good so far, so I started the data rebalance. After it
> >> had been running for a while I wanted to add another brick, which I
> >> obviously couldn't do while it was running, so I stopped it. Even
> >> with it stopped it wouldn't let me add a brick, so I tried
> >> restarting it, but it wouldn't let me do that either. I presume you
> >> just reissue the start command as there's no restart?
> >>
> >> [root@nas3 ~]# gluster vol rebalance test-volume status
> >>      Node  Rebalanced-files     size  scanned  failures  skipped     status  run time in secs
> >> ---------  ----------------  -------  -------  --------  -------  ---------  ----------------
> >> localhost                 7  611.7GB     1358         0       10    stopped           4929.00
> >> localhost                 7  611.7GB     1358         0       10    stopped           4929.00
> >>  nas4-10g                 0   0Bytes     1506         0        0  completed              8.00
> >> volume rebalance: test-volume: success:
> >> [root@nas3 ~]# gluster vol add-brick test-volume nas4-10g:/data14/gvol
> >> volume add-brick: failed: Volume name test-volume rebalance is in progress. Please retry after completion
> >> [root@nas3 ~]# gluster vol rebalance test-volume start
> >> volume rebalance: test-volume: failed: Rebalance on test-volume is already started
> >>
> >> In the end I used the force option to make it start, but was that
> >> the right thing to do?
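> >>
> >> (For the record, that was essentially the plain start command with
> >> force appended:
> >>
> >>     gluster vol rebalance test-volume start force
> >>
> >> though the exact invocation isn't shown in the output above.)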
> >>
> >> glusterfs 3.4.1 built on Oct 28 2013 11:01:59
> >> Volume Name: test-volume
> >> Type: Distribute
> >> Volume ID: 56ee0173-aed1-4be6-a809-ee0544f9e066
> >> Status: Started
> >> Number of Bricks: 5
> >> Transport-type: tcp
> >> Bricks:
> >> Brick1: nas3-10g:/data9/gvol
> >> Brick2: nas3-10g:/data10/gvol
> >> Brick3: nas3-10g:/data11/gvol
> >> Brick4: nas3-10g:/data12/gvol
> >> Brick5: nas4-10g:/data13/gvol
> >>
> >>