[Gluster-devel] spurios failures in tests/encryption/crypt.t
Anand Avati
avati at gluster.org
Wed May 21 07:06:22 UTC 2014
On Tue, May 20, 2014 at 10:54 PM, Pranith Kumar Karampuri
<pkarampu at redhat.com> wrote:
>
>
> ----- Original Message -----
> > From: "Anand Avati" <avati at gluster.org>
> > To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> > Cc: "Edward Shishkin" <edward at redhat.com>,
> >     "Gluster Devel" <gluster-devel at gluster.org>
> > Sent: Wednesday, May 21, 2014 10:53:54 AM
> > Subject: Re: [Gluster-devel] spurios failures in tests/encryption/crypt.t
> >
> > There are a few suspicious things going on here..
> >
> > On Tue, May 20, 2014 at 10:07 PM, Pranith Kumar Karampuri <
> > pkarampu at redhat.com> wrote:
> >
> > >
> > > > > hi,
> > > > > > crypt.t is failing regression builds once in a while, and most of
> > > > > > the time it is because of failures just after the remount in the
> > > > > > script.
> > > > >
> > > > > TEST rm -f $M0/testfile-symlink
> > > > > TEST rm -f $M0/testfile-link
> > > > >
> > > > > Both of these are failing with ENOTCONN. I got a chance to look at
> > > > > the logs. According to the brick logs, this is what I see:
> > > > > [2014-05-17 05:43:43.363979] E [posix.c:2272:posix_open]
> > > > > 0-patchy-posix: open on /d/backends/patchy1/testfile-symlink:
> > > > > Transport endpoint is not connected
> > >
> >
> > posix_open() happening on a symlink? This should NEVER happen. glusterfs
> > itself should NEVER EVER be triggering symlink resolution on the server.
> > In this case, for whatever reason, an open() is attempted on a symlink,
> > and it is getting followed back onto gluster's own mount point (the test
> > case is creating an absolute link).
> >
> > So first find out: who is triggering fop->open() on a symlink. Fix the
> > caller.
> >
> > Next: add a check in posix_open() to fail with ELOOP or EINVAL if the
> > inode is a symlink.
>
> I think I understood what you are saying. An open call for a symlink on the
> fuse mount led to another open call for the target on the same fuse mount.

It's not that simple. The client VFS is intelligent enough to resolve
symlinks and send open() only on non-symlinks. Also, the test case script
was doing a plain unlink() (TEST rm -f <filename>), so this was not
initiated by an open() attempt in the first place. My guess is that some
xlator (probably crypt?) is doing an open() on an inode and that is going
through unchecked in posix. It is a bug in both the caller and posix, but
the responsibility is on posix to disallow open() on anything but regular
files (even open() on character or block devices should not happen in
posix).
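
As a rough illustration of the check described above, here is a minimal
sketch in Python. It is not the posix xlator code (which is C), and
check_openable is a hypothetical name. It assumes the check can be made with
lstat(), which inspects the symlink itself instead of following it, and it
returns ELOOP for a symlink and EINVAL for any other non-regular file.

```python
import errno
import os
import stat


def check_openable(path):
    """Return 0 if path is a regular file, otherwise an errno value.

    Hypothetical helper for illustration only. lstat() is used so that a
    symlink is examined as a symlink and is never followed.
    """
    try:
        st = os.lstat(path)
    except OSError as exc:
        # Propagate the lookup failure (e.g. ENOENT) as the error code.
        return exc.errno
    if stat.S_ISLNK(st.st_mode):
        return errno.ELOOP
    if not stat.S_ISREG(st.st_mode):
        # Directories, char/block devices, FIFOs and sockets are refused too.
        return errno.EINVAL
    return 0
```

A check like this, made separately from the open() itself, leaves a window
in which the path can change between the two calls, so it is a sanity check
and not a complete defence on its own.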
> Which led to a deadlock :). That is why we disallow opens on symlinks in
> gluster?
>
That is not the only reason open on a symlink is disallowed in gluster; it
is part of a more general problem with following symlinks inside gluster.
Symlink resolution must happen strictly in the outermost VFS. Following
symlinks inside the filesystem is not only an invalid operation, but can
lead to all kinds of deadlocks, security holes (what if you opened a
symlink which points to /etc/passwd: should it show the contents of the
client machine's /etc/passwd or the server's? Now what if you wrote to the
file through the symlink? You get the idea.) and wrong/weird/dangerous
behaviors. This applies not just to following symlinks but also to
open()ing special devices. For example, if you create a char device file
with the major/minor number of an audio device and write PCM data into it,
should it play music on the client machine or on the server machine?
In summary, following symlinks and opening non-regular files are VFS/client
operations and are invalid in a filesystem context.
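
One generic way for server-side code to refuse to follow a symlink at open
time is the O_NOFOLLOW flag. The sketch below is in Python and is only an
illustration of that flag, not gluster code. open_nofollow is a hypothetical
name. On Linux, opening a symlink this way fails with ELOOP; some other
systems report a different errno.

```python
import errno
import os


def open_nofollow(path):
    """Open path read-only without following a symlink in the last component.

    Hypothetical helper for illustration only. Raises OSError if the final
    path component is a symlink.
    """
    return os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
```

Note that O_NOFOLLOW only covers the final path component; symlinks in
earlier components of the path are still resolved.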