[Gluster-users] Sudden, dramatic performance drops with Glusterfs
Michael Rightmire
Michael.Rightmire at KIT.edu
Fri Nov 8 08:32:44 UTC 2019
Hi Strahil,
Thanks for the reply. See below.
Also, as an aside, I tested by installing a single CentOS 7 machine with
the JBOD, installed Gluster and ZFS on Linux as recommended at
https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Administrator%20Guide/Gluster%20On%20ZFS/
then created a Gluster volume consisting of one brick made up of a local
ZFS raidz2, copied about 4 TB of data to it, and am seeing the same issue.
The biggest part of the issue is with metadata-heavy operations like
"ls" and "find". If I read or write a single file, it works great. But if
I run rsync (which does a lot of listing, writing, renaming, etc.) it is
painfully slow. For example, a find command that finishes in about 30
seconds when run directly on the underlying ZFS directory takes about an
hour through the Gluster mount.
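For reference, the comparison I'm making is roughly the following (same
directory tree, once on the brick and once through the FUSE mount; paths
as in the setup further down):

# directly on the underlying ZFS dataset (the brick)
time find /zpool-homes/homes -type f | wc -l
# the same tree through the GlusterFS mount
time find /glusterfs/homes -type f | wc -l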
Strahil wrote on 08-Nov-19 05:39:
>
> Hi Michael,
>
> What is your 'gluster volume info <VOL>' showing?
>
I've been playing with the install (since it's a fresh machine) so I
can't give you verbatim output. However, it was showing two bricks, one
on each server, started, and apparently healthy.
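From memory it looked roughly like the following (not verbatim, volume ID
omitted; brick paths as in the create command quoted further down):

Volume Name: homes
Type: Replicate
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: server1:/zpool-homes/homes
Brick2: server2:/zpool-homes/homes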
>
> How much is your zpool full? Usually when it gets too full, ZFS
> performance drops seriously.
>
The zpool is only at about 30% usage; it's a new server setup.
We have about 10TB of data on a 30TB volume (made up of two 30TB ZFS
raidz2 bricks, each residing on a different server, connected via a
dedicated 10Gb Ethernet link).
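The ~30% figure comes from checking each server with the usual commands,
e.g.:

zpool list
zfs list -o name,used,avail,mountpoint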
>
> Try to rsync a file directly to one of the bricks, then to the other
> brick (don't forget to remove the files after that, as gluster will
> not know about them).
>
If I rsync or scp a file directly to the zpool bricks (outside of
Gluster) I get 30-100 MBytes/s (depending on what I'm copying).
If I rsync THROUGH Gluster (via the glusterfs mounts) I get 1-5 MBytes/s.
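Concretely, the two tests look something like this ("bigfile.tar" is just
a placeholder; the direct-to-brick copy is removed afterwards so Gluster
never sees it):

# directly onto the brick, bypassing Gluster
rsync -a --progress bigfile.tar server1:/zpool-homes/homes/
# through the GlusterFS mount
rsync -a --progress bigfile.tar /glusterfs/homes/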
>
> What are your mounting options? Usually 'noatime,nodiratime' are a
> good start.
>
I'll try these. Currently I'm using the following fstab entry (mounting
on serverA itself):
serverA:/homes /glusterfs/homes glusterfs defaults,_netdev 0 0
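If I understand the suggestion, that entry would become something like
the following, plus turning atime off on the ZFS pools/datasets that back
the bricks (pool name assumed from the brick path):

serverA:/homes /glusterfs/homes glusterfs defaults,_netdev,noatime,nodiratime 0 0
zfs set atime=off zpool-homes

I assume the atime settings matter most on the brick filesystems rather
than on the FUSE mount itself.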
>
> Are you using ZFS provided by the Ubuntu packages or directly from the
> ZoL project?
>
ZFS provided by the Ubuntu 18.04 (bionic) repos:
libzfs2linux/bionic-updates,now 0.7.5-1ubuntu16.6 amd64 [installed,automatic]
zfs-dkms/bionic-updates,bionic-updates,now 0.7.5-1ubuntu16.6 all [installed]
zfs-zed/bionic-updates,now 0.7.5-1ubuntu16.6 amd64 [installed,automatic]
zfsutils-linux/bionic-updates,now 0.7.5-1ubuntu16.6 amd64 [installed]
Gluster provided by "add-apt-repository ppa:gluster/glusterfs-5":
glusterfs 5.10
Repository revision: git://git.gluster.org/glusterfs.git
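(The above is essentially the output of "apt list --installed | grep -i zfs"
and "glusterfs --version".)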
> Best Regards,
> Strahil Nikolov
>
> On Nov 6, 2019 12:50, Michael Rightmire <Michael.Rightmire at KIT.edu> wrote:
>
> Hello list!
>
> I'm new to Glusterfs in general. We have chosen to use it as our
> distributed file system on a new set of HA file servers.
>
> The setup is:
>     2 SUPERMICRO SuperStorage Server 6049PE1CR36L with 24 x 4TB spinning
>     disks and NVMe for cache and slog.
>     HBA, not a RAID card
>     Ubuntu 18.04 server (on both systems)
>     ZFS file storage
> Glusterfs 5.10
>
> Step one was to install Ubuntu, ZFS, and gluster. This all went
> without issue.
>     We have 3 identical ZFS raidz2 pools on each server, and three
>     replicated (mirrored) GlusterFS volumes, one attached to each
>     raidz2 on each server.
>
>     We mounted the gluster volumes as (for example) "/glusterfs/homes
>     -> /zpool/homes", i.e.:
>     gluster volume create homes replica 2 transport tcp server1:/zpool-homes/homes server2:/zpool-homes/homes force
>     (on server1) server1:/homes 44729413504 16032705152 28696708352 36% /glusterfs/homes
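>     (The mount itself is a standard GlusterFS FUSE mount, i.e. something
>     like "mount -t glusterfs server1:/homes /glusterfs/homes" or the
>     equivalent fstab entry.)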
>
>     The problem is, the performance has deteriorated terribly.
>     We needed to copy all of our data from the old server to the new
>     glusterfs volumes (approx. 60TB).
>     We decided to do this with multiple rsync commands (roughly 400
>     simultaneous rsyncs, along the lines of the sketch below).
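>     For illustration only ("/old-server/homes" is a placeholder for the
>     old file server's export, not the real path), launching that many
>     parallel rsyncs looks something like:
>     # one rsync per top-level directory, up to 400 at a time
>     ls /old-server/homes | xargs -P 400 -I{} rsync -a /old-server/homes/{} /glusterfs/homes/{}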
>     The copy went well for the first 4 days, with an average across
>     all rsyncs of 150-200 MBytes per second.
>     Then, suddenly, on the fourth day, it dropped to about 50 MBytes/s,
>     and by the end of the day it was down to ~5 MBytes/s (five).
>     I've stopped the rsyncs, and I can still copy an individual file
>     across to the glusterfs shared directory at 100 MB/s.
>     But actions such as "ls -la" or "find" take forever!
>
> Are there obvious flaws in my setup to correct?
> How can I better troubleshoot this?
>
> Thanks!
> --
>
> Mike
>
--
Mike
Karlsruher Institut für Technologie (KIT)
Institut für Anthropomatik und Robotik (IAR)
Hochperformante Humanoide Technologien (H2T)
Michael Rightmire
B.Sci, HPUXCA, MCSE, MCP, VDB, ISCB
Systems IT/Development
Adenauerring 2, Building 50.20, Room 022
76131 Karlsruhe
Phone: +49 721 608-45032
Fax: +49 721 608-44077
E-Mail: Michael.Rightmire at kit.edu
http://www.humanoids.kit.edu/
http://h2t.anthropomatik.kit.edu/
KIT – The Research University in the Helmholtz Association
KIT has been certified as a family-friendly university since 2010