[Gluster-users] Problems since 3.12.7: invisible files, strange rebalance size, setxattr failed during rebalance and broken unix rights

Mon Apr 23 16:21:58 UTC 2018

Hi,

On 23 April 2018 at 18:52, Frank Ruehlemann <ruehlemann at itsc.uni-luebeck.de>
wrote:

> Hi,
>
> After 2 years of running GlusterFS without major problems, we have
> been facing some strange errors lately.
>
> After updating to 3.12.7, some users reported at least 4 broken
> directories containing invisible files. The files exist on the bricks
> and don't start with a dot, but aren't visible in "ls". Clients can
> still interact with them by using the explicit path.
> More information: https://bugzilla.redhat.com/show_bug.cgi?id=1564071

I will continue the analysis for this issue in the bug.

>
>
> And since this update, "gluster volume rebalance $myvolume status"
> reports that >16900 PB (petabytes!) of data were rebalanced on one of
> our 2 servers. The time looks right, but the size of the transferred
> files is absurd. That rebalance was run with 3.12.6 in March 2018. Its
> log file listed no errors and showed a realistic size at the end.
>

This has been seen a few times and is because an incorrect value is stored
in the node_state.info file. However, I don't know what causes this
incorrect value to be stored. It is harmless and can be ignored.

>
> We started a new rebalance today during a downtime of our corresponding
> compute cluster, since these errors started to spread and this might
> help. The output of "gluster volume rebalance $myvolume status" doesn't
> list any errors so far and the numbers look realistic.
> But we're seeing some strange error reports (every few minutes) in
> journald:
> [2018-04-23 12:31:24.942377] E [MSGID: 113001]
> [posix.c:5983:_posix_handle_xattr_keyvalue_pair] 0-$myvolume-posix:
> setxattr failed on
> /srv/glusterfs/bricks/DATA112/data/.glusterfs/e6/a8/e6a8ce50-fda5-4bad-8d4d-acd25dafcaa2
> while doing xattrop:
> key=trusted.glusterfs.quota.1ce02d3b-b7ae-4485-903c-2991de5350b6.contri.1
> [No such file or directory]
> The rebalance log file lists no errors.
>
> Has anybody seen similar error messages during a rebalance?
>

Are any directories being deleted or renamed during the rebalance? If so,
this message could be expected, since the path would no longer exist.
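
To check on a brick whether the path from the log is really gone, the
handle location can be derived from the GFID. The log line suggests the
layout .glusterfs/<first two hex characters>/<next two>/<full GFID>; below
is a minimal sketch under that assumption. The function name
gfid_handle_path is hypothetical and not part of any gluster tool.

```python
import os
import uuid


def gfid_handle_path(brick_root: str, gfid: str) -> str:
    """Build the .glusterfs handle path for a GFID.

    The directory layout is an assumption inferred from the log line.
    """
    # A GFID is formatted like a UUID; this validates and lowercases it.
    canonical = str(uuid.UUID(gfid))
    return os.path.join(
        brick_root, ".glusterfs", canonical[0:2], canonical[2:4], canonical
    )


if __name__ == "__main__":
    path = gfid_handle_path(
        "/srv/glusterfs/bricks/DATA112/data",
        "e6a8ce50-fda5-4bad-8d4d-acd25dafcaa2",
    )
    # lexists() does not follow symlinks, so a dangling link counts as present.
    print(path, "exists" if os.path.lexists(path) else "missing")
```

Running this on the brick server shortly after the error appears shows
whether the handle is missing at that moment.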

>
> And we see some duplicated files. There are two copies on different
> bricks (we're running a distributed volume).
> One copy looks like this:
> $ ls -lah
> -rwxr--r--  2 $user $group  293 May 11  2017 config
>
> The other one looks rather strange:
> $ ls -lah
> ---------T  2 root    $group    0 May 11  2017 config
>
> Has anybody seen similar broken files?
>

This is fine as long as you only see a single file from the mount point.
The 'T' files are internal gluster files (called linkto files) and should
be invisible from the mount point.
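
When scanning a brick, such entries can be spotted by looking for regular
files whose mode is exactly the sticky bit and whose size is zero, as in
the second listing. This is only a heuristic sketch and the function name
is hypothetical; the definitive check is the trusted.glusterfs.dht.linkto
extended attribute on the brick copy, which needs root to read.

```python
import stat


def is_linkto_candidate(mode: int, size: int) -> bool:
    """Heuristic: regular file, permission bits exactly ---------T, zero length."""
    return (
        stat.S_ISREG(mode)
        and stat.S_IMODE(mode) == stat.S_ISVTX
        and size == 0
    )


if __name__ == "__main__":
    # The two "config" copies from the listings, expressed as (mode, size).
    samples = [
        (stat.S_IFREG | 0o744, 293),
        (stat.S_IFREG | stat.S_ISVTX, 0),
    ]
    for mode, size in samples:
        print(stat.filemode(mode), size, is_linkto_candidate(mode, size))
```

On a real brick, pass in st_mode and st_size from os.lstat() for each
file.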

Regards,
Nithya

>
> We're using gluster 3.12 from the gluster.org-repositories on a standard
> Debian 9 with XFS formatted bricks.
>
> Hopefully somebody has an answer on how to fix this.
>
> At least somebody in the future might find this, since we didn't find
> anything when searching for these errors. If you're from the future:
> Good luck! (^_^)
>
> So far,
>
> --
> Frank Rühlemann
>    IT-Systemtechnik
>
> UNIVERSITÄT ZU LÜBECK
>     IT-Service-Center
>
>     Ratzeburger Allee 160
>     23562 Lübeck
>     Tel +49 451 3101 2034
>     Fax +49 451 3101 2004
>     ruehlemann at itsc.uni-luebeck.de
>     www.itsc.uni-luebeck.de