[Gluster-users] files disappearing and re-appearing

Riccardo Murri rmurri at uzh.ch
Thu Nov 17 13:52:18 UTC 2016


We are trying out GlusterFS as the working filesystem for a compute cluster.
The cluster consists of 57 compute nodes (55 cores each), acting as
GlusterFS clients, and 25 data server nodes (8 cores each), each serving
1 large GlusterFS brick.

So far we have noticed two issues:

1) When compute jobs run, the `glusterfs` client process on the compute nodes
goes up to 100% CPU, and filesystem operations start to slow down a lot.  
Since there are many CPUs available, is it possible to make it use, e.g., 
4 CPUs instead of one to make it more responsive?

2) In addition (and possibly related to 1), files disappear and re-appear:
from a compute process we test for the existence of a file, and e.g.
`test -e /glusterfs/file.txt` fails.  Then we test from a different process
or shell and the file is there.  As far as I can see, the servers are
basically idle, and none of the peers is disconnected.
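
To make issue 2 easier to reproduce, a check along these lines could be left running from inside a compute job. This is only a minimal sketch: the function name `probe_exists` and its arguments are made up for illustration and are not part of GlusterFS.

```shell
#!/bin/sh
# Hypothetical helper: test for a file repeatedly and report every miss.
# Usage: probe_exists PATH [TRIES] [DELAY_SECONDS]
# Prints each miss (with a UTC timestamp) on stderr and the total number
# of misses on stdout.
probe_exists() {
    path=$1
    tries=${2:-10}
    delay=${3:-1}
    misses=0
    i=1
    while [ "$i" -le "$tries" ]; do
        if ! test -e "$path"; then
            misses=$((misses + 1))
            echo "$(date -u +%H:%M:%S) try $i: $path not found" >&2
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "$misses"
}
```

For example, `probe_exists /glusterfs/file.txt 60 1` would test once per second for a minute; running the same command from a second shell at the same time shows whether both processes miss the file at the same moments.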

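Regarding issue 1, one way to see whether a single thread of the client process is the one pegged at 100% is to look at per-thread CPU time in `/proc`. The sketch below is Linux-only, and the helper name `thread_ticks` is hypothetical; it just reads the standard `/proc/PID/task/TID/stat` files.

```shell
#!/bin/sh
# Hypothetical helper: print "TID CPU_TICKS" for every thread of a process.
# CPU_TICKS is utime + stime in clock ticks, read from /proc (Linux only).
# Usage: thread_ticks PID
thread_ticks() {
    for task in /proc/"$1"/task/*; do
        [ -r "$task/stat" ] || continue
        stat=$(cat "$task/stat") || continue
        # Drop "pid (comm) " so a command name with spaces cannot shift fields.
        rest=${stat##*) }
        # After the strip, field 1 is the state; utime and stime are 12 and 13.
        set -- $rest
        echo "${task##*/} $(( ${12} + ${13} ))"
    done
}
```

Running it twice a few seconds apart against the PID of the `glusterfs` client and comparing the two outputs shows whether nearly all of the growth comes from one thread.
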
We are running GlusterFS 3.7.17 on Ubuntu 16.04, installed from the Launchpad PPA.
(Details below for the interested.)

Can you give any hint about what's going on?


Installation details:

ubuntu@master001:~$ pdsh -a 'glusterfs --version | fgrep built' | dshbak -c
glusterfs 3.7.17 built on Nov  4 2016 13:39:51
ubuntu@master001:~$ dpkg -S $(which glusterfs)
glusterfs-client: /usr/sbin/glusterfs
ubuntu@master001:~$ apt-cache policy glusterfs-client
  Installed: 3.7.17-ubuntu1~xenial5
  Candidate: 3.7.17-ubuntu1~xenial5
  Version table:
 *** 3.7.17-ubuntu1~xenial5 500
        500 http://ppa.launchpad.net/gluster/glusterfs-3.7/ubuntu xenial/main amd64 Packages
        100 /var/lib/dpkg/status
     3.7.6-1ubuntu1 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu xenial/universe amd64 Packages
        500 http://archive.ubuntu.com/ubuntu xenial/universe amd64 Packages

