[Gluster-devel] Slow volume, gluster volume status bug

Emmanuel Dreyfus manu at netbsd.org
Mon Nov 13 16:40:00 UTC 2017


I am looking for hints about how to debug this:

I have a 4x2 Distributed-Replicate volume which exhibits extremely slow
operations. Example:
# time stat /gfs/dl
51969 10143657874486987692 drwxr-xr-x 4 _httpd wheel 172912968 4096 "Nov 13 17:22:12 2017" "Sep 22 11:53:35 2017" "Sep 22 11:53:35 2017" "Jan  1 01:00:00 1970" 131072 8 0 /gfs/dl
    8.72s real     0.00s user     0.01s system

But the problem is not 100% reproducible. Sometimes I get an instant
(normal) response.
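
To quantify how often the slow case occurs, a loop along these lines
could be used. This is only a rough sketch: probe_stat is a made-up
helper name, and date +%s gives whole-second resolution, which is
enough to spot multi-second stalls but not small delays.

```shell
#!/bin/sh
# Hypothetical helper (not part of gluster): stat a path repeatedly and
# print the iteration number and the elapsed whole seconds for each call.
probe_stat() {
    path=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        start=$(date +%s)
        stat "$path" > /dev/null 2>&1
        end=$(date +%s)
        echo "$i $((end - start))"
        i=$((i + 1))
    done
}

# Example (path taken from the report above, count chosen arbitrarily):
# probe_stat /gfs/dl 20
```

Lines with a non-zero second column would mark the slow calls, which
can then be matched against tcpdump timestamps.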

gluster volume status also exhibits trouble: each server lists only
its own bricks, not those of its peer. I suspect it could just
be a timeout caused by a slow answer from the peer.

tcpdump tells me that the server can take several seconds to answer.
Brick logs show nothing special.

Any ideas?

Emmanuel Dreyfus
manu at netbsd.org

