[Gluster-users] Problems since 3.12.7: invisible files, strange rebalance size, setxattr failed during rebalance and broken unix rights

Frank Ruehlemann ruehlemann at itsc.uni-luebeck.de
Mon Apr 23 13:22:35 UTC 2018


Hi,

After two years of running GlusterFS without major problems, we have
recently been facing some strange errors.

After updating to 3.12.7, some users reported at least 4 broken
directories containing invisible files. The files exist on the bricks and
don't start with a dot, but they don't show up in "ls". Clients can still
interact with them by using the explicit path.
More information: https://bugzilla.redhat.com/show_bug.cgi?id=1564071
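
One way to enumerate such entries is to compare what is stored in a
directory on a brick with what a directory listing on a client mount
returns. The following is only a minimal sketch: the function name and
both paths are placeholders and not taken from our setup, and file names
containing newlines are not handled.

```shell
# Print names that exist in a brick directory but are absent from the
# directory listing of the same directory on a client mount.
# Both arguments are placeholder paths supplied by the caller.
missing_on_mount() {
    brick_dir=$1
    mount_dir=$2
    # Take the listing once. Testing "$mount_dir/$name" directly would not
    # help, because the affected files are still reachable by explicit path.
    listing=$(ls -A "$mount_dir")
    for entry in "$brick_dir"/* "$brick_dir"/.[!.]*; do
        # Skip glob patterns that matched nothing.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        name=${entry##*/}
        # .glusterfs is GlusterFS's internal directory at the brick root.
        [ "$name" = ".glusterfs" ] && continue
        if ! printf '%s\n' "$listing" | grep -Fqx -- "$name"; then
            printf '%s\n' "$name"
        fi
    done
}
```

On a distributed volume each brick only holds part of a directory, so a
check like this would have to be repeated for the same directory on every
brick.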

Also since this update, „gluster volume rebalance $myvolume status“
reports that >16900 PB (petabytes!) of data were rebalanced for one of
our 2 servers. The run time looks right, but the size of the transferred
files is absurd. That rebalance was run with 3.12.6 in March 2018. Its
log file listed no errors and showed a realistic size at the end.

We started a new rebalance today during a downtime of the corresponding
compute cluster, since these errors had started to spread and a rebalance
might help. So far the output of „gluster volume rebalance $myvolume status“
doesn't list any errors, and the numbers look realistic.
However, every few minutes we see strange errors like this one reported in
journald:
„[2018-04-23 12:31:24.942377] E [MSGID: 113001]
[posix.c:5983:_posix_handle_xattr_keyvalue_pair] 0-$myvolume-posix:
setxattr failed
on /srv/glusterfs/bricks/DATA112/data/.glusterfs/e6/a8/e6a8ce50-fda5-4bad-8d4d-acd25dafcaa2 while doing xattrop: key=trusted.glusterfs.quota.1ce02d3b-b7ae-4485-903c-2991de5350b6.contri.1 [No such file or directory]“
The rebalance log file lists no errors.

Has anybody seen similar error messages during a rebalance?
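
The path in the error message follows the layout
.glusterfs/<first two characters of the GFID>/<next two characters>/<GFID>
below the brick root. To see which bricks actually hold that entry,
something along these lines could be used. It is a sketch only: the
function names are made up for illustration, and the brick root and GFID
are passed in by the caller.

```shell
# Build the backend path for a GFID below a brick root, following the
# .glusterfs/<2 chars>/<2 chars>/<gfid> layout visible in the log line.
gfid_backend_path() {
    brick_root=$1
    gfid=$2
    first=$(printf '%s' "$gfid" | cut -c1-2)
    second=$(printf '%s' "$gfid" | cut -c3-4)
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick_root" "$first" "$second" "$gfid"
}

# Report whether that entry exists on the given brick.
# -L is checked as well so that a dangling symlink still counts as present.
gfid_entry_status() {
    path=$(gfid_backend_path "$1" "$2")
    if [ -e "$path" ] || [ -L "$path" ]; then
        printf 'present: %s\n' "$path"
    else
        printf 'missing: %s\n' "$path"
    fi
}
```

Called with the brick root and GFID from the message above, for example
gfid_entry_status /srv/glusterfs/bricks/DATA112/data e6a8ce50-fda5-4bad-8d4d-acd25dafcaa2
this would show whether the entry named in the error still exists.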

We also see some duplicated files: there are two copies on different
bricks (we're running a distributed volume).
One copy looks like this:
$ ls -lah
-rwxr--r--  2 $user $group  293 May 11  2017 config

The other one looks rather strange:
$ ls -lah
---------T  2 root    $group    0 May 11  2017 config

Has anybody seen similar broken files?
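
For what it's worth, an empty file with mode ---------T (only the sticky
bit set) has the same shape as the pointer ("link-to") files that the
distribute translator places on a brick when the data lives on another
brick. Whether that applies here is an assumption that would have to be
verified. A minimal sketch for listing such candidates on a brick (the
function name is made up, and the brick root is whatever path is passed
in):

```shell
# List regular files below a brick root that are empty and whose mode is
# exactly 1000 (shown by "ls -l" as ---------T). The internal .glusterfs
# directory is skipped.
find_linkfile_candidates() {
    brick_root=$1
    find "$brick_root" \( -type d -name .glusterfs \) -prune -o \
        -type f -perm 1000 -size 0 -print
}

# Possible follow-up for each hit, run as root on the brick (needs the
# attr tools); a link-to file is expected to carry this xattr:
#   getfattr -n trusted.glusterfs.dht.linkto -e text FILE
```

A hit without that xattr, or one where the copy with the data is missing
on the other brick, would be worth a closer look.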

We're using gluster 3.12 from the gluster.org repositories on a standard
Debian 9 with XFS-formatted bricks.

Hopefully somebody has an idea how to fix this.

At least somebody in the future might find this, since we didn't find
anything while searching for these errors. If you're from the future:
Good luck! (^_^)

So far,

-- 
Frank Rühlemann
   IT-Systemtechnik

UNIVERSITÄT ZU LÜBECK
    IT-Service-Center
    
    Ratzeburger Allee 160
    23562 Lübeck
    Tel +49 451 3101 2034
    Fax +49 451 3101 2004
    ruehlemann at itsc.uni-luebeck.de
    www.itsc.uni-luebeck.de





