[Gluster-users] How to trigger a resync of a newly replaced empty brick in replicate config ?
Serkan Çoban
cobanserkan at gmail.com
Fri Feb 2 10:38:25 UTC 2018
If I were you, I would follow these steps. Stop the rebalance and fix
the cluster health first: bring up the down server, replace
server4:brick4 with a new disk, format and mount it, and make sure the
brick is started, then start a full heal. Without all bricks up, the
full heal will not start. Then you can continue with the rebalance.
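In command form, this is roughly what I mean (untested as written; the
device name and mount point below are only placeholders, adjust them to
your layout):

gluster volume rebalance home stop       # pause the rebalance first

# on server4, after swapping the failed disk behind brick4
mkfs.xfs -f /dev/sdX
mount /dev/sdX /data/glusterfs/home/brick4

gluster volume start home force          # restart the missing brick process
gluster volume status home               # every brick should now show Online "Y"

gluster volume heal home full            # only starts once all bricks are up
gluster volume rebalance home start      # start the rebalance again afterwards
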
On Fri, Feb 2, 2018 at 1:27 PM, Alessandro Ipe <Alessandro.Ipe at meteo.be> wrote:
> Hi,
>
>
> I simplified the config in my first email, but I actually have 2x4 servers in a distributed-replicate setup, six of them with 4 bricks each and the remaining two with 2 bricks each. A full heal will just take ages... for just a single brick to resync!
>
>> gluster v status home
> volume status home
> Status of volume: home
> Gluster process                              TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick server1:/data/glusterfs/home/brick1    49157     0          Y       5003
> Brick server1:/data/glusterfs/home/brick2    49153     0          Y       5023
> Brick server1:/data/glusterfs/home/brick3    49154     0          Y       5004
> Brick server1:/data/glusterfs/home/brick4    49155     0          Y       5011
> Brick server3:/data/glusterfs/home/brick1    49152     0          Y       5422
> Brick server4:/data/glusterfs/home/brick1    49152     0          Y       5019
> Brick server3:/data/glusterfs/home/brick2    49153     0          Y       5429
> Brick server4:/data/glusterfs/home/brick2    49153     0          Y       5033
> Brick server3:/data/glusterfs/home/brick3    49154     0          Y       5437
> Brick server4:/data/glusterfs/home/brick3    49154     0          Y       5026
> Brick server3:/data/glusterfs/home/brick4    49155     0          Y       5444
> Brick server4:/data/glusterfs/home/brick4    N/A       N/A        N       N/A
> Brick server5:/data/glusterfs/home/brick1    49152     0          Y       5275
> Brick server6:/data/glusterfs/home/brick1    49152     0          Y       5786
> Brick server5:/data/glusterfs/home/brick2    49153     0          Y       5276
> Brick server6:/data/glusterfs/home/brick2    49153     0          Y       5792
> Brick server5:/data/glusterfs/home/brick3    49154     0          Y       5282
> Brick server6:/data/glusterfs/home/brick3    49154     0          Y       5794
> Brick server5:/data/glusterfs/home/brick4    49155     0          Y       5293
> Brick server6:/data/glusterfs/home/brick4    49155     0          Y       5806
> Brick server7:/data/glusterfs/home/brick1    49156     0          Y       22339
> Brick server8:/data/glusterfs/home/brick1    49153     0          Y       17992
> Brick server7:/data/glusterfs/home/brick2    49157     0          Y       22347
> Brick server8:/data/glusterfs/home/brick2    49154     0          Y       18546
> NFS Server on localhost                      2049      0          Y       683
> Self-heal Daemon on localhost                N/A       N/A        Y       693
> NFS Server on server8                        2049      0          Y       18553
> Self-heal Daemon on server8                  N/A       N/A        Y       18566
> NFS Server on server5                        2049      0          Y       23115
> Self-heal Daemon on server5                  N/A       N/A        Y       23121
> NFS Server on server7                        2049      0          Y       4201
> Self-heal Daemon on server7                  N/A       N/A        Y       4210
> NFS Server on server3                        2049      0          Y       5460
> Self-heal Daemon on server3                  N/A       N/A        Y       5469
> NFS Server on server6                        2049      0          Y       22709
> Self-heal Daemon on server6                  N/A       N/A        Y       22718
> NFS Server on server4                        2049      0          Y       6044
> Self-heal Daemon on server4                  N/A       N/A        Y       6243
>
> server2 is currently powered off, as we are waiting for a replacement RAID
> controller, and we are also still waiting for a replacement for
> server4:/data/glusterfs/home/brick4.
>
> And as I said, there is a rebalance in progress
>> gluster rebalance home status
> Node       Rebalanced-files  size     scanned  failures  skipped  status       run time in h:m:s
> ---------  ----------------  -------  -------  --------  -------  -----------  -----------------
> localhost  42083             23.3GB   1568065  1359      303734   in progress  16:49:31
> server5    35698             23.8GB   1027934  0         240748   in progress  16:49:23
> server4    35096             23.4GB   899491   0         229064   in progress  16:49:18
> server3    27031             18.0GB   701759   8         182592   in progress  16:49:27
> server8    0                 0Bytes   327602   0         805      in progress  16:49:18
> server6    35672             23.9GB   1028469  0         240810   in progress  16:49:17
> server7    1                 45Bytes  53       0         0        completed    0:03:53
> Estimated time left for rebalance to complete : 359739:51:24
> volume rebalance: home: success
>
>
> Thanks,
>
>
> A.
>
>
>
> On Thursday, 1 February 2018 18:57:17 CET Serkan Çoban wrote:
>> What is server4? You just mentioned server1 and server2 previously.
>> Can you post the output of "gluster v status volname"?
>>
>> On Thu, Feb 1, 2018 at 8:13 PM, Alessandro Ipe <Alessandro.Ipe at meteo.be> wrote:
>> > Hi,
>> >
>> >
>> > Thanks. However, "gluster v heal volname full" returned the following error
>> > message:
>> > Commit failed on server4. Please check log file for details.
>> >
>> > I have checked the log files in /var/log/glusterfs on server4 (by grepping
>> > for "heal"), but did not get any match. What should I be looking for, and in
>> > which log file, please?
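>> > (For reference, the grep was roughly the following, from memory rather
>> > than the exact command line:)
>> >
>> >> grep -ri heal /var/log/glusterfs/   # from memory, not verbatim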
>> >
>> > Note that there is currently a rebalance process running on the volume.
>> >
>> >
>> > Many thanks,
>> >
>> >
>> > A.
>> >
>> > On Thursday, 1 February 2018 17:32:19 CET Serkan Çoban wrote:
>> >> You do not need to reset the brick if the brick path does not change.
>> >> Replace the brick, format and mount it, then run gluster v start volname
>> >> force. To start a self-heal, just run gluster v heal volname full.
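>> >> Roughly, on server2 (the device name below is only a placeholder):
>> >>
>> >> mkfs.xfs -f /dev/sdX                  # new filesystem for the brick
>> >> mount /dev/sdX /data/gluster/brick1   # same brick path as before
>> >> gluster v start home force            # brings the replaced brick back online
>> >> gluster v heal home full              # full heal from the healthy replica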
>> >>
>> >> On Thu, Feb 1, 2018 at 6:39 PM, Alessandro Ipe <Alessandro.Ipe at meteo.be>
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> >
>> >> > My volume home is configured in replicate mode (version 3.12.4) with the
>> >> > bricks server1:/data/gluster/brick1 and server2:/data/gluster/brick1.
>> >> >
>> >> > server2:/data/gluster/brick1 was corrupted, so I killed the gluster daemon
>> >> > for that brick on server2, unmounted it, reformatted it, remounted it and
>> >> > did a
>> >> >
>> >> >> gluster volume reset-brick home server2:/data/gluster/brick1
>> >> >> server2:/data/gluster/brick1 commit force
>> >> >
>> >> > I was expecting that the self-heal daemon would start copying data from
>> >> > server1:/data/gluster/brick1 (about 7.4 TB) to the empty
>> >> > server2:/data/gluster/brick1, which it only did for directories, but not
>> >> > for files.
>> >> >
>> >> > For the moment, I launched the following on the FUSE mount point:
>> >> >
>> >> >> find . | xargs stat
>> >> >
>> >> > but crawling the whole volume (100 TB) to trigger self-healing of a single
>> >> > brick of 7.4 TB is inefficient.
>> >> >
>> >> > Is there any trick to self-heal only a single brick, for example by setting
>> >> > some attributes on its top-level directory?
>> >> >
>> >> >
>> >> > Many thanks,
>> >> >
>> >> >
>> >> > Alessandro
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > Gluster-users mailing list
>> >> > Gluster-users at gluster.org
>> >> > http://lists.gluster.org/mailman/listinfo/gluster-users
>> >
>> > --
>> >
>> > Dr. Ir. Alessandro Ipe
>> > Department of Observations Tel. +32 2 373 06 31
>> > Remote Sensing from Space
>> > Royal Meteorological Institute
>> > Avenue Circulaire 3 Email:
>> > B-1180 Brussels Belgium Alessandro.Ipe at meteo.be
>> > Web: http://gerb.oma.be
>
>
> --
>
> Dr. Ir. Alessandro Ipe
> Department of Observations Tel. +32 2 373 06 31
> Remote Sensing from Space
> Royal Meteorological Institute
> Avenue Circulaire 3 Email:
> B-1180 Brussels Belgium Alessandro.Ipe at meteo.be
> Web: http://gerb.oma.be
>
>
>