[Gluster-devel] doubts in posix_handle_path and posix_handle_pump

Xavier Hernandez xhernandez at datalab.es
Thu Jun 5 09:26:31 UTC 2014


On Thursday 05 June 2014 01:32:07 Anand Avati wrote:
> On 6/3/14, 3:49 AM, Xavier Hernandez wrote:
> > On Tuesday 03 June 2014 15:42:19 Pranith Kumar Karampuri wrote:
> >> On 06/03/2014 02:42 PM, Xavier Hernandez wrote:
> >>> The possible problem I see is that in the comments it says that this
> >>> function returns a path to an IA_IFDIR (it will return IA_IFDIR on an
> >>> lstat), however if one of the symlinks is missing or anything else
> >>> fails,
> >>> it won't return an error *but* it will return a path to an existing
> >>> file.
> >>> An lstat on this path will return IA_IFLNK instead of IA_IFDIR. I don't
> >>> know if this can be a problem in some places.
> 
> I don't see how it could return a path which returns IA_IFLNK, no matter
> what. The very first path generated would be a path to a symlink. If it
> is missing, the very first pump iteration would fail, and we would be
> left with a path to a non-existent file. After that, at any poing we
> would either be left with a path which ELOOP's or points to a dir. If an
> intermediate symlink is missing, the entire path itself will give ENOENT
> on all operations. In any case we should never end up with a path
> pointing to IA_IFLNK.
> 
Right. I didn't understood the code as well as I thought. Now it seems clear. 
The only possibility to return a symlink path would be that the first 
posix_handle_pump() fails for some reason. If it fails because of readlink() 
failure, it should fail in the same way with any other system call. If it 
fails because of an invalid symlink structure, it would probably be due to an 
orphan symlink (it only exists inside the .glusterfs but it does not exist in 
the "real" filesystem). In any case I think it would be safe to return that 
path.


> >> This is exactly what I was referring to, I don't see an easy way to find
> >> out if there is any failure in the function. One needs to do extra lstat
> >> or a 'path' based syscall like getxattr etc on the returned path to
> >> check if it returned a good path. So do you think the best thing is to
> >> ignore the return value of the function call but depend on an lstat or a
> >> path based syscall of the path?
> 
> Directly calling getxattr() instead of lgetxattr() is a dangerous thing
> to do in posix. We should never, even accidentally, follow a user's
> symlink. Always use the lXXXX() variant syscall unless you really really
> know what you are doing.
> 
> > The only point to consider is for gfid's representing directories. Other
> > types of file do not have any problem (the returned path can be
> > considered valid even if lstat() fails). For directories there are 3
> > places where things can fail:
> > 
> > At line 360: I think this is not a problem. If lstat() fails (basically
> > because it does not exist), the returned path can be considered valid.
> > 
> > At line 367: If posix_handle_pump() fails, it could mean:
> > 
> > * The symlink is not a valid directory symlink:
> >    * it's a corrupted one: any operation on this file should be denied
> >    * it's a normal symlink that has lost one of the hard-links: though
> >    it's bad> 
> > to have damaged gfid's, the returned path can be considered valid.
> > 
> > * readlink() failed: this woulb be very weird. Access to the file should
> > be
> > denied.
> > 
> > At line 374: If lstat() fails, probably it means that the symlink of one
> > of
> > the parents of the directory is missing. The returned path won't fail on
> > lstat(), but it should because it will return symlink information instead
> > of directory information.
> > 
> > I think that it's very hard to determine if something went wrong only by
> > inspecting the returned path. I think the best approach would be that
> > posix_handle_path() return -1 if posix_handle_pump() or lstat() at line
> > 374
> > fail, and each caller decide what to do in case of failure.
> > 
> > However I don't know all the details of the posix xlator, so maybe I'm
> > wrong and this is not necessary. Let's see if there is someone else with
> > more experience on it to see what he thinks.
> 
> We could add the above check and return -1. But note that the point of
> the function is to create a path for a certain operation. But the
> operation itself will anyways fail with the desired errno if the path
> could not be "pumped up" (for whatever reason) or if an intermediate
> symlink is missing.
> 
Having understood the initial problem and having seen that it's not really a 
problem, I think it would be better to assume that this function never fails. 
It's easier to write code this way.

All this will work as long as the returned path is not used to create files or 
directories (this was one of the causes of bug 859581). If we want to use the 
returned path for creating things, we need to return an error condition if 
something fails. If we only want to do read or modify operations on the 
returned path, it should be safe.

As of now, I think it's ok as it is implemented.

Xavi


More information about the Gluster-devel mailing list