[Gluster-devel] Slow write speed with replica vol when link has latency

Jan Engelhardt jengelh at inai.de
Sat Sep 29 20:30:42 UTC 2012


Hi,

in a 2-brick "replica 2" volume (naturally spanning 2 hosts via TCP),
write speed hovers at a measly 1 Mbyte/s, even though the link between
the hosts is demonstrably faster (tested by sending /dev/zero to the
remote host with socat). This occurs with gluster-3.3.1qa3, and it
also occurred with earlier versions.

I have seen this "magic limit" of ~1-2 Mbyte/s before -- in at least
two other network filesystems -- when they use synchronous
operations, which I suspect glusterfs is doing as well.

# ping -fqc1000 bach
[...]
rtt min/avg/max/mdev = 0.412/0.578/1.012/0.052 ms, ipg/ewma 0.607/0.573 ms

# gluster volume status d0
Status of volume: d0
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick mozart:/sync/.gluster-store                       49152   Y       28353
Brick bach:/sync/.gluster-store                         49152   Y       13094
NFS Server on localhost                                 38467   Y       28373
Self-heal Daemon on localhost                           N/A     Y       28383
NFS Server on bach                                      38467   Y       13103
Self-heal Daemon on bach                                N/A     Y       13110
 
# gluster volume info d0
Volume Name: d0
Type: Replicate
Volume ID: 09386acc-7149-4c9c-b8f2-e6ed4104435b
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: mozart:/sync/.gluster-store
Brick2: bach:/sync/.gluster-store

# wget -c ftp://ftp5.gwdg.de/pub/opensuse/distribution/12.2/iso/openSUSE-12.2-DVD-x86_64.iso
--2012-09-29 21:08:06--  ftp://ftp5.gwdg.de/pub/opensuse/distribution/12.2/iso/openSUSE-12.2-DVD-x86_64.iso
[...]
100%[+===================================>] 4,669,308,928  974K/s   in 64m 27s
2012-09-29 22:12:35 (1.10 MB/s) - `openSUSE-12.2-DVD-x86_64.iso' saved [4669308928]


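Synchronous operation, combined with the typical latency of a 100 or
1000 Mbit/s Ethernet link, would explain these rates: if every write
has to wait for a round trip before the next one is sent, throughput
is bounded by latency rather than by bandwidth.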
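
As a rough sanity check of that reasoning, here is a back-of-envelope
sketch. The RTT values are the averages from the ping output in this
mail; the block sizes and the number of round trips per write are
purely illustrative assumptions, not measured or documented glusterfs
behavior, and the function name is made up for this example.

```python
# Back-of-envelope model: if each write must be acknowledged before the
# next one is sent, throughput = bytes per write / time spent waiting.

def sync_write_throughput(block_bytes, rtt_seconds, round_trips=1):
    """Upper bound in bytes/s when each block waits `round_trips` RTTs."""
    return block_bytes / (rtt_seconds * round_trips)

RTT_ETHERNET = 0.578e-3  # avg rtt to bach, from the ping output in this mail
RTT_LOOPBACK = 0.003e-3  # avg rtt to localhost, from the ping output in this mail

# ASSUMPTION: block sizes and round-trip counts below are hypothetical,
# chosen only to show how the bound moves; they are not measured values.
for block in (4096, 65536, 131072):
    for trips in (1, 5):
        eth = sync_write_throughput(block, RTT_ETHERNET, trips)
        lo = sync_write_throughput(block, RTT_LOOPBACK, trips)
        print(f"{block:>7} B/write, {trips} RTT/write: "
              f"{eth / 1e6:9.2f} MB/s (Ethernet) "
              f"{lo / 1e6:11.2f} MB/s (loopback)")
```

Under these assumptions, small writes that each wait several round
trips land in the low single-digit MB/s range on the Ethernet link,
which is the same order of magnitude as the 1.10 MB/s seen above. The
loopback numbers are only an upper bound; there, something other than
latency becomes the limit.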

In comparison, a replica over loopback is much faster thanks
to the much lower link latency.

# ping -fqc1000 localhost
rtt min/avg/max/mdev = 0.003/0.003/0.024/0.001 ms, ipg/ewma 0.005/0.004 ms

# gluster volume create d1 replica 2 transport tcp mozart:/dev/shm/store1 mozart:/dev/shm/store2
Multiple bricks of a replicate volume are present on the same server. This setup is not optimal.
Do you still want to continue creating the volume?  (y/n) y
Creation of volume d1 has been successful. Please start the volume to access data.
# gluster volume start d1
Starting volume d1 has been successful

# wget -c ftp://ftp5.gwdg.de/pub/opensuse/distribution/12.2/iso/openSUSE-12.2-DVD-x86_64.iso
 3% [>              ] 179,011,840 45.4M/s  eta 2m 6s   ^C


How can I boost the write speed of volumes connected over an
Ethernet link with this kind of latency? Are there any plans to
implement an asynchronous write mode in glusterfs?



