[Gluster-users] Fuse client dying after "gfid different on subvolume" ?

Marc Seeger marc.seeger at acquia.com
Wed Jun 5 15:51:23 UTC 2013


And another one:
[2013-06-05 09:39:23.281555] W [afr-common.c:1196:afr_detect_self_heal_by_iatt] 0-test-fs-cluster-1-replicate-0: /home/qarshared78/.drush/qarshared78.aliases.drushrc.php.lock: gfid different on subvolume
[2013-06-05 09:39:23.281555] I [afr-self-heal-common.c:1970:afr_sh_post_nb_entrylk_gfid_sh_cbk] 0-test-fs-cluster-1-replicate-0: Non blocking entrylks failed.
[2013-06-05 09:39:23.281555] W [inode.c:914:inode_lookup] (-->/usr/lib/glusterfs/3.3.2qa3/xlator/debug/io-stats.so(io_stats_lookup_cbk+0xff) [0x7fe9c8481d8f] (-->/usr/lib/glusterfs/3.3.2qa3/xlator/mount/fuse.so(+0xf248) [0x7fe9cbc06248] (-->/usr/lib/glusterfs/3.3.2qa3/xlator/mount/fuse.so(+0xf0b1) [0x7fe9cbc060b1]))) 0-fuse: inode not found
Unmounting and remounting fixes the problem, but until then the volume mount doesn't respond at all.
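For what it's worth, here is a minimal sketch of the watchdog we could use as a stopgap until the root cause is found. The path, timeout value, and the use of an external `stat` in a child process are my assumptions, not something from the logs above; a plain os.stat() would block forever on a hung FUSE mount, which is why the check shells out and kills the child on timeout:

```python
import subprocess

def mount_is_responsive(path, timeout=5):
    """Return True if stat(1) on `path` completes within `timeout` seconds.

    A hung FUSE mount blocks stat() indefinitely, so the check runs the
    stat in a child process that can be killed when the timeout expires.
    """
    try:
        subprocess.run(["stat", path], check=True,
                       capture_output=True, timeout=timeout)
        return True
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
        return False

def remount(path):
    """Lazy-unmount the hung share and remount it from fstab.

    "umount -l" detaches even a busy mount; the remount relies on an
    fstab entry for the path, as in the manual workaround above.
    """
    subprocess.run(["umount", "-l", path], check=False)
    subprocess.run(["mount", path], check=True)
```

Run from cron as e.g. `if not mount_is_responsive("/mnt/gfs"): remount("/mnt/gfs")` -- obviously a band-aid, not a fix.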

:-/

On Jun 3, 2013, at 11:07 AM, Marc Seeger <marc.seeger at acquia.com> wrote:

> Hey gluster-users,
> I just stumbled on a problem in our current test-setup of gluster 3.3.2.
> 
> This is a simple replicated setup with 2 bricks (on XFS) in 1 volume running on glusterfs version 3.3.2qa3 on ubuntu lucid.
> The client mounting this volume on /mnt/gfs sits on a mother machine and is using fuse (Version: 2.8.1-1.1ubuntu3.1).
> 
> On the gluster-fs fuse client mount log:
> [2013-06-02 21:23:26.677069] W [afr-common.c:1196:afr_detect_self_heal_by_iatt] 0-test-fs-cluster-1-replicate-0: /home/filesshared/README.txt.lock: gfid different on subvolume
> [2013-06-02 21:23:26.677069] I [afr-self-heal-common.c:1970:afr_sh_post_nb_entrylk_gfid_sh_cbk] 0-test-fs-cluster-1-replicate-0: Non blocking entrylks failed.
> [2013-06-02 21:23:26.697068] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-test-fs-cluster-1-client-0: remote operation failed: File exists. Path: /home/filesshared/README.txt.lock (00000000-0000-0000-0000-000000000000)
> [2013-06-02 21:23:26.697068] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-test-fs-cluster-1-client-1: remote operation failed: File exists. Path: /home/filesshared/README.txt.lock (00000000-0000-0000-0000-000000000000)
> [2013-06-02 21:23:26.697068] W [inode.c:914:inode_lookup] (-->/usr/lib/glusterfs/3.3.2qa3/xlator/debug/io-stats.so(io_stats_lookup_cbk+0xff) [0x7fb16c310d8f] (-->/usr/lib/glusterfs/3.3.2qa3/xlator/mount/fuse.so(+0xf248) [0x7fb16fa95248] (-->/usr/lib/glusterfs/3.3.2qa3/xlator/mount/fuse.so(+0xf0b1) [0x7fb16fa950b1]))) 0-fuse: inode not found
> 
> 
> What the application side was doing when this happened:
> 1. It created /home/filesshared
> 2. It created /mnt/gfs/home/filesshared
> 3. It deleted /home/filesshared and replaced it with a symlink from /home/filesshared to /mnt/gfs/home/filesshared
> 4. It tried to write some files
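In case it helps anyone reproduce this, the four steps boil down to the following sketch. A scratch directory stands in for the real /home and /mnt/gfs roots so the script runs anywhere; on the actual setup the second root would be the gluster FUSE mount, and the ownership/mode values from the deploy log are approximated:

```python
import os
import shutil
import tempfile

def deploy(home_root, gfs_root, site="filesshared"):
    """Replay the deployment sequence from the log below."""
    local = os.path.join(home_root, site)   # stands in for /home/filesshared
    shared = os.path.join(gfs_root, site)   # stands in for /mnt/gfs/home/filesshared

    os.makedirs(local, mode=0o550)          # 1. create the local directory
    os.makedirs(shared, mode=0o700)         # 2. create the directory on the share
    shutil.rmtree(local)                    # 3. replace the local directory ...
    os.symlink(shared, local)               #    ... with a symlink to the share
    lock = os.path.join(local, "README.txt.lock")
    with open(lock, "w") as f:              # 4. write a file through the symlink
        f.write("lock\n")
    return os.stat(lock)                    # this is the stat() that failed

scratch = tempfile.mkdtemp()
st = deploy(os.path.join(scratch, "home"), os.path.join(scratch, "gfs"))
```

On local disk this sequence obviously succeeds; the question is why the same lock file ends up with differing gfids on the two bricks when step 4 goes through the gluster mount.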
> 
> Here's the log for that:
> 2013-06-02T21:23:26+00:00 daemon.notice web-14 f-c-w[4842]: deploying filesshared.prod
> 2013-06-02T21:23:26+00:00 daemon.notice web-14 f-c-w[4842]: creating directory: dir=/home/filesshared, user=0, group=filesshared, mode=0550
> 2013-06-02T21:23:26+00:00 daemon.notice web-14 f-c-w[4842]: creating directory: dir=/mnt/gfs/home/filesshared, user=filesshared, group=filesshared, mode=0700
> 2013-06-02T21:23:26+00:00 daemon.notice web-14 f-c-w[4842]: created /home/filesshared -> /mnt/gfs/home/filesshared
> 2013-06-02T21:23:26+00:00 daemon.notice web-14 f-c-w[4842]: PHP Warning:  stat(): stat failed for /home/filesshared/README.txt.lock in /usr/ah/lib/ah-lib.php on line 701
> 2013-06-02T21:23:27+00:00 daemon.notice web-14 f-c-w[4842]: PHP Warning:  stat(): stat failed for /home/filesshared/README.txt.lock in /usr/ah/lib/ah-lib.php on line 701
> 2013-06-02T21:23:27+00:00 daemon.notice web-14 f-c-w[4842]: PHP Warning:  stat(): stat failed for /home/filesshared/README.txt.lock in /usr/ah/lib/ah-lib.php on line 701
> 2013-06-02T21:23:28+00:00 daemon.notice web-14 f-c-w[4842]: PHP Warning:  stat(): stat failed for /home/filesshared/README.txt.lock in /usr/ah/lib/ah-lib.php on line 701
> 
> What this resulted in:
> The mount point became completely unresponsive.
> In PHP, file_exists('/mnt/gfs') returns false and stat() calls fail; in Ruby, File.directory?('/mnt/gfs') returns false.
> This can be solved by running "umount /mnt/gfs" and then remounting the share from fstab ("mount /mnt/gfs").
> 
> I could not find any relevant log entries on the bricks themselves. I sadly also wasn't able to come up with a test case to reproduce it.
> 
> It seems somewhat similar to http://gluster.org/pipermail/gluster-users/2013-March/035662.html
> I initially thought that this could have been fixed in http://review.gluster.org/#/c/4689/ , but the qa branch we run has this fix backported.
> 
> Any idea what could cause this behaviour?
> 
> Cheers,
> Marc

