[Gluster-devel] spurios failures in tests/encryption/crypt.t

Pranith Kumar Karampuri pkarampu at redhat.com
Wed May 21 11:59:08 UTC 2014



----- Original Message -----
> From: "Edward Shishkin" <edward at redhat.com>
> To: "Anand Avati" <avati at gluster.org>
> Cc: "Pranith Kumar Karampuri" <pkarampu at redhat.com>, "Gluster Devel" <gluster-devel at gluster.org>
> Sent: Wednesday, May 21, 2014 5:13:16 PM
> Subject: Re: [Gluster-devel] spurios failures in tests/encryption/crypt.t
> 
> On Wed, 21 May 2014 00:06:22 -0700
> Anand Avati <avati at gluster.org> wrote:
> 
> > On Tue, May 20, 2014 at 10:54 PM, Pranith Kumar Karampuri <
> > pkarampu at redhat.com> wrote:
> > 
> > >
> > >
> > > ----- Original Message -----
> > > > From: "Anand Avati" <avati at gluster.org>
> > > > To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> > > > > Cc: "Edward Shishkin" <edward at redhat.com>, "Gluster Devel" <gluster-devel at gluster.org>
> > > > Sent: Wednesday, May 21, 2014 10:53:54 AM
> > > > Subject: Re: [Gluster-devel] spurios failures in
> > > > tests/encryption/crypt.t
> > > >
> > > > There are a few suspicious things going on here..
> > > >
> > > > On Tue, May 20, 2014 at 10:07 PM, Pranith Kumar Karampuri <
> > > > pkarampu at redhat.com> wrote:
> > > >
> > > > >
> > > > > > > hi,
> > > > > > > > >      crypt.t is failing regression builds once in a while,
> > > > > > > > > and most of the time it is because of failures just after
> > > > > > > > > the remount in the script.
> > > > > > >
> > > > > > > TEST rm -f $M0/testfile-symlink
> > > > > > > TEST rm -f $M0/testfile-link
> > > > > > >
> > > > > > > > > Both of these are failing with ENOTCONN. I got a chance to
> > > > > > > > > look at the logs. According to the brick logs, this is what
> > > > > > > > > I see:
> > > > > > > > >
> > > > > > > > > [2014-05-17 05:43:43.363979] E [posix.c:2272:posix_open]
> > > > > > > > > 0-patchy-posix: open on /d/backends/patchy1/testfile-symlink:
> > > > > > > > > Transport endpoint is not connected
> > > > >
> > > >
> > > > > posix_open() happening on a symlink? This should NEVER happen.
> > > > > glusterfs itself should NEVER EVER be triggering symlink
> > > > > resolution on the server. In this case, for whatever reason, an
> > > > > open() is attempted on a symlink, and it is getting followed back
> > > > > onto gluster's own mount point (the test case is creating an
> > > > > absolute link).
> > > >
> > > > So first find out: who is triggering fop->open() on a symlink.
> > > > Fix the caller.
> > > >
> > > > > Next: add a check in posix_open() to fail with ELOOP or EINVAL if
> > > > > the inode is a symlink.
> > >
> > > > I think I understood what you are saying. An open call for a symlink
> > > > on the fuse mount led to another open call for the target on the
> > > > same fuse mount.
> > 
> > 
> > It's not that simple. The client VFS is intelligent enough to resolve
> > symlinks and send open() only on non-symlinks. And the test case
> > script was doing an obvious unlink() (TEST rm -f <filename>), so it
> > was not initiated by an open() attempt in the first place. My guess
> > is that some xlator (probably crypt?) is doing an open() on an inode
> 
> 
> Ah, it is quite possible, that it is the crypt.. I'll take a look.
> Thanks for the hint, I stupidly increased the testcases without chances
> to reproduce the problem..

I just sent the patch at http://review.gluster.org/7824

Pranith

> 
> 
> > and that is going through unchecked in posix. It is a bug in both the
> > caller and posix, but the onus/responsibility is on posix to disallow
> > open() on anything but regular files (even open() on character or
> > block devices should not happen in posix).
> > 
> > 
> > 
> > > > Which led to a deadlock :). Is that why we disallow opens on
> > > > symlinks in gluster?
> > >
> > 
> > That's not just why open on symlink is disallowed in gluster, it is a
> > more generic problem of following symlinks in general inside gluster.
> > Symlink resolution must strictly happen only in the outermost VFS.
> > Following symlinks inside the filesystem is not only an invalid
> > operation, but can lead to all kinds of deadlocks, security holes
> > (what if you opened a symlink which points to /etc/passwd, should it
> > show the contents of the client machine's /etc/passwd or the server?
> > Now what if you wrote to the file through the symlink? etc. you get
> > > the idea..) and wrong/weird/dangerous behaviors. This is not just
> > > related to following symlinks, but also to open()ing special devices,
> > > e.g. if you create a char device file with the major/minor number of
> > > an audio device and write pcm data into it, should it play music on
> > > the client machine or on the server machine? The summary is:
> > > following symlinks and opening non-regular files are VFS/client
> > > operations and are invalid operations in a filesystem context.
> 
> 
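For illustration, the posix-side check Avati describes above (fail open() with ELOOP or EINVAL on anything that is not a regular file) could look roughly like the sketch below. This is plain POSIX C, not GlusterFS code and not the contents of the patch at review 7824. The helper name open_regular_only() is hypothetical, and the real posix translator works with its own path and inode handling, which is assumed away here.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): open `path` only if it names a
 * regular file. Returns a file descriptor on success, or a negative errno
 * value on failure.
 */
int open_regular_only(const char *path, int flags)
{
    struct stat st;
    int fd;

    assert(path != NULL);

    /* lstat() inspects the directory entry itself and never follows a link. */
    if (lstat(path, &st) != 0)
        return -errno;

    /* Symlinks must be resolved by the client VFS, never on the server. */
    if (S_ISLNK(st.st_mode))
        return -ELOOP;

    /* Directories, char/block devices, FIFOs and sockets are rejected too. */
    if (!S_ISREG(st.st_mode))
        return -EINVAL;

    /*
     * O_NOFOLLOW makes open() fail with ELOOP if the final path component
     * turns out to be a symlink, e.g. one swapped in after the lstat() call.
     */
    fd = open(path, flags | O_NOFOLLOW);
    if (fd < 0)
        return -errno;

    return fd;
}
```

With a check of this kind, an open() that reaches the brick for testfile-symlink would fail immediately with ELOOP instead of being followed back onto the mount point. The caller that issued the open() on a symlink still has to be fixed separately.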

