[Gluster-users] Problems with Gluster NFS export (unable to stat files)
fredrik ronnvall
fredrik at realisestudio.com
Tue May 29 16:58:21 UTC 2012
Hi,
We're frequently seeing random errors on one of our volumes when it is
mounted via NFS. A client will intermittently fail to access certain
files/directories, which then show up like this:
$ ls -l
ls: cannot access xxx: No such file or directory
ls: cannot access yyy: No such file or directory
l????????? ? ? ? ? ? xxx
l????????? ? ? ? ? ? yyy
drwxrwxrwx 2 user group 95 2012-05-08 18:11 zzz
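For anyone trying to reproduce this, here is a minimal sketch of how the symptom could be spotted from a client without reading ls output by eye. It lists a directory and reports every entry that appears in the listing but cannot be stat'ed, which is what the "?????????" rows indicate. The function name find_unstatable and the injectable lstat parameter are my own illustrative choices, not part of any Gluster tooling.

```python
import errno
import os


def find_unstatable(directory, lstat=os.lstat):
    """Return (name, errno_name) pairs for entries that are listed
    in `directory` but fail lstat(), e.g. with ENOENT.

    `lstat` is injectable only so the check can be exercised without
    a broken mount; by default the real os.lstat is used.
    """
    broken = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            # lstat (not stat) so a symlink is checked itself,
            # not the target it points to.
            lstat(path)
        except OSError as exc:
            code = errno.errorcode.get(exc.errno, str(exc.errno))
            broken.append((name, code))
    return broken
```

Running find_unstatable() periodically against a directory on the NFS mount and logging any non-empty result would at least give timestamps to correlate with nfs.log on the server side.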
Tracing the NFS mount back to one of the gluster servers, I found this
in nfs.log:
[2012-05-09 14:47:32.807853] E
[client3_1-fops.c:411:client3_1_stat_cbk] 0-glustervol1-client-2:
remote operation failed: No such file or directory
[2012-05-09 14:47:32.808430] E
[client3_1-fops.c:411:client3_1_stat_cbk] 0-glustervol1-client-3:
remote operation failed: No such file or directory
[2012-05-09 14:47:32.841125] E
[client3_1-fops.c:411:client3_1_stat_cbk] 0-glustervol1-client-3:
remote operation failed: No such file or directory
[2012-05-09 14:47:32.841762] E
[client3_1-fops.c:411:client3_1_stat_cbk] 0-glustervol1-client-2:
remote operation failed: No such file or directory
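To see which subvolumes the errors cluster on, the log lines can be tallied with a short script. The sketch below assumes the entry layout visible in the excerpt above (timestamp, level letter, [file:line:function], subvolume name, message); that layout is inferred from these lines only, not taken from Gluster documentation. It also re-joins entries that have been wrapped across several lines, as they are in this mail. The names join_wrapped, count_errors and SAMPLE are illustrative.

```python
import re
from collections import Counter

# A new entry starts with a bracketed date; anything else is a wrapped
# continuation of the previous entry.
ENTRY_START = re.compile(r"^\[\d{4}-\d{2}-\d{2} ")

# Layout assumed from the excerpt:
# [timestamp] LEVEL [file:line:function] subvolume: message
ENTRY_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"\[(?P<src>[^\]]+)\]\s+"
    r"(?P<subvol>[^\s:]+):\s+"
    r"(?P<msg>.*)"
)


def join_wrapped(lines):
    """Merge continuation lines back into one string per log entry."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def count_errors(lines):
    """Count level-E entries per (subvolume, function) pair."""
    counts = Counter()
    for entry in join_wrapped(lines):
        match = ENTRY_RE.match(entry)
        if match and match.group("level") == "E":
            function = match.group("src").rsplit(":", 1)[-1]
            counts[(match.group("subvol"), function)] += 1
    return counts


# First two entries from the excerpt above, wrapped as in the mail.
SAMPLE = """\
[2012-05-09 14:47:32.807853] E
[client3_1-fops.c:411:client3_1_stat_cbk] 0-glustervol1-client-2:
remote operation failed: No such file or directory
[2012-05-09 14:47:32.808430] E
[client3_1-fops.c:411:client3_1_stat_cbk] 0-glustervol1-client-3:
remote operation failed: No such file or directory
"""
```

Calling count_errors(SAMPLE.splitlines()), or count_errors(open(path)) on a real log file, gives a Counter keyed by subvolume and callback name, which makes it easier to tell whether the failures are confined to particular client subvolumes.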
Restarting the gluster server seems to fix the issue, though I am
unhappy with this solution.
Today, following the same symptoms, this showed up in the logs:
[2012-05-29 10:19:04.332031] E
[afr-self-heal-metadata.c:561:afr_sh_metadata_post_nonblocking_inodelk_cbk]
0-glustervol1-replicate-3: Non Blocking metadata inodelks failed for
<path>.
[2012-05-29 10:19:04.332059] E
[afr-self-heal-metadata.c:563:afr_sh_metadata_post_nonblocking_inodelk_cbk]
0-glustervol1-replicate-3: Metadata self-heal failed for <path>.
[2012-05-29 10:19:04.332503] E
[afr-self-heal-metadata.c:561:afr_sh_metadata_post_nonblocking_inodelk_cbk]
0-glustervol1-replicate-2: Non Blocking metadata inodelks failed for
<path>.
[2012-05-29 10:19:04.332534] E
[afr-self-heal-metadata.c:563:afr_sh_metadata_post_nonblocking_inodelk_cbk]
0-glustervol1-replicate-2: Metadata self-heal failed for <path>.
A restart of gluster on the server the client was connected to
solved the issue.
This seems to happen several times a day and is becoming a serious
issue. The problem frequently affects symlinks, but regular files
are affected as well.
The volume in question is configured across 4 servers (OpenSUSE 11.3)
with 2 bricks per server as distributed-replicate. Gluster version is
3.2.5.
Has anyone experienced similar issues? Is there a sanity check of
sorts that I could carry out?
Fredrik