[Gluster-users] poor performance during healing

Kingsley gluster at gluster.dogwind.com
Tue Feb 24 11:16:38 UTC 2015


When testing gluster, I found similar issues when I simulated a brick
failure on a replicated volume - while it was rebuilding the newly
replaced brick, the volume was very unresponsive.

Our bricks are on SATA drives and the server LAN runs at 1 Gbps. The
disks couldn't cope with the IOPS that the network was throwing at them.

I solved that particular issue by using traffic shaping to limit the
network bandwidth that the servers could use between each other (without
limiting traffic to anywhere else). Rebuilding the replaced brick took
longer, but the volume stayed responsive to clients during the rebuild.
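
A minimal sketch of this kind of shaping with Linux tc and an HTB qdisc is
below. The interface name, peer address, and rate cap are placeholders, not
the values from our setup, and the function name and RUN variable are
hypothetical. By default the function only prints the tc commands; set
RUN=sudo (or similar) to actually apply them.

```shell
# Sketch only: cap outbound traffic to one peer gluster server with an HTB
# class, leaving all other destinations in an uncapped default class.
# Arguments (all illustrative): interface, peer IP address, rate cap.
shape_peer_traffic() {
    dev=$1; peer=$2; rate=$3
    run=${RUN:-echo}   # default: print the commands instead of running them

    # Root HTB qdisc; traffic that matches no filter falls into class 1:10.
    $run tc qdisc add dev "$dev" root handle 1: htb default 10
    # Default class at line rate (1 Gbps LAN) for clients and everything else.
    $run tc class add dev "$dev" parent 1: classid 1:10 htb rate 1000mbit
    # Capped class for server-to-server traffic.
    $run tc class add dev "$dev" parent 1: classid 1:20 htb rate "$rate" ceil "$rate"
    # Steer packets destined for the peer server into the capped class.
    $run tc filter add dev "$dev" parent 1: protocol ip prio 1 u32 \
        match ip dst "$peer/32" flowid 1:20
}
```

For example, `shape_peer_traffic eth0 192.0.2.12 400mbit` prints the four
commands for review. HTB shapes outbound traffic only, so something like this
would need to be applied on each server, pointing at the other one.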

Please let me know if what we tried is a bad idea ...

Cheers,
Kingsley.

On Tue, 2015-02-24 at 07:11 +0530, Ravishankar N wrote:
> On 02/24/2015 05:00 AM, Craig Yoshioka wrote:
> > I’m using Gluster 3.6 to host a volume with some KVM images.  I’d seen before that other people were having terrible performance while Gluster was auto-healing but that a rewrite in 3.6 had potentially solved this problem.
> >
> > Well, it hasn’t (for me).  If my gluster volume starts to auto-heal, performance can get so bad that some of the VMs essentially lock up.  In top I can see the glusterfsd process sometimes hitting 700% of the CPU.  Is there anything I can do to prevent this by throttling the healing process?
> For VM workloads, you could set the 'cluster.data-self-heal-algorithm'
> option to 'full'. The checksum computation in the 'diff' algorithm can
> be CPU-intensive, especially since VM images are big files.
> 
> [root@tuxpad glusterfs]# gluster v set help | grep algorithm
> Option: cluster.data-self-heal-algorithm
> Description: Select between "full", "diff". The "full" algorithm copies 
> the entire file from source to sink. The "diff" algorithm copies to sink 
> only those blocks whose checksums don't match with those of source. If 
> no option is configured the option is chosen dynamically as follows: If 
> the file does not exist on one of the sinks or empty file exists or if 
> the source file size is about the same as page size the entire file will 
> be read and written i.e "full" algo, otherwise "diff" algo is chosen.
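
Applied to the vm-images volume from this thread, that suggestion would look
roughly like the sketch below. `gluster volume set` and `gluster volume info`
are standard CLI subcommands; the wrapper function and the RUN variable are
hypothetical names used only for illustration. By default the function just
prints the commands; set RUN=sudo (or similar) to run them.

```shell
# Sketch only: switch a volume's data self-heal algorithm, then show the
# volume info so the change can be checked under "Options Reconfigured".
set_heal_algorithm() {
    vol=$1; algo=$2
    run=${RUN:-echo}   # default: print the commands instead of running them

    # The option's help text lists only these two values.
    case $algo in
        full|diff) ;;
        *) echo "algorithm must be 'full' or 'diff'" >&2; return 1 ;;
    esac

    $run gluster volume set "$vol" cluster.data-self-heal-algorithm "$algo"
    $run gluster volume info "$vol"
}
```

For example, `set_heal_algorithm vm-images full` prints the two commands for
review. Note that 'full' trades the checksum CPU cost for copying whole files,
so it moves more data over the network during a heal.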
> 
> Hope this helps.
> Ravi
> 
> > Here are my volume options:
> >
> > Volume Name: vm-images
> > Type: Replicate
> > Volume ID: 5b38ddbe-a1ae-4e10-b0ad-dcd785a44493
> > Status: Started
> > Number of Bricks: 1 x 2 = 2
> > Transport-type: tcp
> > Bricks:
> > Brick1: vmhost-1:/gfs/brick-0
> > Brick2: vmhost-2:/gfs/brick-0
> > Options Reconfigured:
> > nfs.disable: on
> > cluster.quorum-count: 1
> > network.frame-timeout: 1800
> > network.ping-timeout: 15
> > server.allow-insecure: on
> > storage.owner-gid: 36
> > storage.owner-uid: 107
> > performance.quick-read: off
> > performance.read-ahead: off
> > performance.io-cache: off
> > performance.stat-prefetch: off
> > cluster.eager-lock: enable
> > network.remote-dio: enable
> > cluster.quorum-type: fixed
> > cluster.server-quorum-type: server
> > cluster.server-quorum-ratio: 51%
> >
> > Thanks!
> > -Craig
> > _______________________________________________
> > Gluster-users mailing list
> > Gluster-users at gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-users
> 


