[Gluster-devel] bug with TLA 313?

Anand Avati avati at zresearch.com
Fri Jul 20 20:52:29 UTC 2007


Brent,
 there was a bug in setxattr: the length was being miscalculated (off by -1)
for binary (non-ASCII) xattr values. Can you please check whether your cp goes
through now? I'm very sorry that we are unable to test this ourselves, since
we don't have a system that uses POSIX ACLs, though xattrs now work fine on
binary data (before the fix they worked only for pure ASCII data).
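
To illustrate the general class of problem (this is a minimal Python sketch, not the GlusterFS code, and c_string_length is a made-up helper): any length derived from string conventions can be right for ASCII values and wrong for binary ones, because binary values may contain NUL bytes. The binary sample is the first 12 bytes of the value in the strace quoted below.

```python
def c_string_length(value: bytes) -> int:
    """Length a C string routine would report: bytes before the first NUL."""
    nul = value.find(b"\x00")
    return len(value) if nul == -1 else nul


ascii_value = b"user-comment"
# First 12 bytes of the system.posix_acl_access value from the strace.
binary_value = b"\x02\x00\x00\x00\x01\x00\x06\x00\xff\xff\xff\xff"

# ASCII data has no embedded NUL, so both lengths agree.
print(c_string_length(ascii_value), len(ascii_value))
# Binary data: the string-style length stops at the first NUL byte.
print(c_string_length(binary_value), len(binary_value))
```

The point is only that the size passed by the caller has to be carried through unchanged; it cannot be recomputed from the value itself.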

thanks,
avati

2007/7/20, Brent A Nelson <brent at phys.ufl.edu>:
>
> Nope, it's still there.  Example strace snippet:
>
> setxattr("/beast/glusterfs/beast", "system.posix_acl_access",
> "\x02\x00\x00\x00\x01\x00\x06\x00\xff\xff\xff\xff\x04\x00\x04\x00\xff\xff\xff\xff \x00\x04\x00\xff\xff\xff\xff",
> 28, 0) = -1 EINVAL (Invalid argument)
>
> It presumably should have returned EOPNOTSUPP (Operation not supported),
> instead.
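
For reference, the value in that setxattr call can be decoded with a short Python sketch. It assumes the Linux posix_acl xattr layout (a 32-bit version, then entries of 16-bit tag, 16-bit permissions, 32-bit id, all little-endian) and it assumes that the line break inside the quoted value stands for a single space character (byte 0x20), which is what makes the value exactly 28 bytes. decode_posix_acl is an illustrative name, not a library function.

```python
import struct

# Value from the strace. The " " (0x20) byte is an assumption: it sits where
# the mail wrapped the line, and it brings the total to the reported 28 bytes.
value = (b"\x02\x00\x00\x00"
         b"\x01\x00\x06\x00\xff\xff\xff\xff"
         b"\x04\x00\x04\x00\xff\xff\xff\xff"
         b" \x00\x04\x00\xff\xff\xff\xff")

# ACL entry tags used by the Linux posix_acl xattr format.
TAGS = {0x01: "user_obj", 0x02: "user", 0x04: "group_obj",
        0x08: "group", 0x10: "mask", 0x20: "other"}


def decode_posix_acl(raw: bytes):
    """Return (version, [(tag_name, 'rwx'-style permissions), ...])."""
    (version,) = struct.unpack_from("<I", raw, 0)
    entries = []
    for offset in range(4, len(raw), 8):
        tag, perm, _ident = struct.unpack_from("<HHI", raw, offset)
        rwx = "".join(ch if perm & bit else "-"
                      for ch, bit in (("r", 4), ("w", 2), ("x", 1)))
        entries.append((TAGS.get(tag, hex(tag)), rwx))
    return version, entries


print(len(value), decode_posix_acl(value))
```

Under those assumptions this is a minimal ACL (owner rw-, group r--, other r--) with no extended entries, i.e. the same information as mode 0644.
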
>
> Thanks,
>
> Brent
>
> On Fri, 20 Jul 2007, Anand Avati wrote:
>
> > Brent,
> > there was a fix in fuse_setxattr in patch-325. Please check if it fixes
> > your issue. AFR was only reporting the errnos passing through it.
> >
> > thanks,
> > avati
> >
> > 2007/7/20, Brent A Nelson <brent at phys.ufl.edu>:
> >>
> >> I should point out that this was with the full (AFR/unify) setup, not the
> >> stripped-down setup.  I also get a lot of messages such as the following
> >> in /var/log/glusterfs/glusterfs.log:
> >> 2007-07-19 15:19:28 E [afr.c:514:afr_setxattr_cbk] mirror4: (path=/usr0
> >> child=share4-0) op_ret=-1 op_errno=22
> >> 2007-07-19 15:19:28 E [afr.c:514:afr_setxattr_cbk] mirror0: (path=/usr0
> >> child=share0-0) op_ret=-1 op_errno=22
> >> 2007-07-19 15:57:17 E [afr.c:575:afr_getxattr_cbk] mirror7:
> >> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
> >> 2007-07-19 15:57:17 E [afr.c:575:afr_getxattr_cbk] mirror7:
> >> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
> >> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror7:
> >> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
> >> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror7:
> >> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
> >> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror6:
> >> (path=/nfs/share/locale/cs child=share6-0) op_ret=-1 op_errno=61
> >> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror6:
> >> (path=/nfs/share/locale/cs child=share6-0) op_ret=-1 op_errno=61
> >> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror7:
> >> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
> >> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror7:
> >> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
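
The op_errno numbers in these log lines are easier to read as names. Below is a small Python sketch that pulls the fields out of one log line and maps the errno; the regular expression is written against the lines quoted here only, the table holds the Linux values for the three codes relevant to this thread, and summarize is a hypothetical helper name.

```python
import re

# Linux errno numbers relevant to this thread.
LINUX_ERRNO = {22: "EINVAL", 61: "ENODATA", 95: "EOPNOTSUPP"}

LOG_RE = re.compile(
    r"\[(?P<src>[^:\]]+):(?P<line>\d+):(?P<func>\w+)\]\s+(?P<xlator>\S+):\s+"
    r"\(path=(?P<path>\S+)\s+child=(?P<child>[^)]+)\)\s+"
    r"op_ret=(?P<ret>-?\d+)\s+op_errno=(?P<errno>\d+)"
)


def summarize(log_line: str):
    """Return (callback, child, errno name), or None if the line does not match."""
    m = LOG_RE.search(log_line)
    if m is None:
        return None
    code = int(m["errno"])
    return m["func"], m["child"], LINUX_ERRNO.get(code, str(code))


sample = ("2007-07-19 15:19:28 E [afr.c:514:afr_setxattr_cbk] mirror4: "
          "(path=/usr0 child=share4-0) op_ret=-1 op_errno=22")
print(summarize(sample))
```

So the setxattr failures are EINVAL (22), matching the strace, while 61 on the getxattr lines is ENODATA, which is what getxattr returns on Linux when the named attribute is simply not set.
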
> >>
> >> Thanks,
> >>
> >> Brent
> >>
> >> On Thu, 19 Jul 2007, Brent A Nelson wrote:
> >>
> >> > Patch 322 seems to have fixed the stray ls errors, but not the cp -a
> >> > complaints.  A "cp -a" strace is attached.
> >> >
> >> > Thanks,
> >> >
> >> > Brent
> >> >
> >> > On Wed, 18 Jul 2007, Brent A Nelson wrote:
> >> >
> >> >> Aha, it looks like GlusterFS is giving odd/varying error responses to
> >> >> queries for ACL information (I assume it should be giving an "operation
> >> >> not supported" error).  This must be related to my previously reported
> >> >> problem copying from GlusterFS to GlusterFS where it was complaining about
> >> >> preserving ACLs for every file copied.
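
The reason the exact errno matters can be sketched as follows. This is not cp's actual code, only an illustration of the kind of tolerance a copying tool might apply: "not supported" is treated as a quiet skip, anything else is a real error. preserve_xattr and the two stand-in setters are hypothetical; on Linux the real setter would be something like os.setxattr.

```python
import errno

# Errors taken to mean "this filesystem cannot store the attribute at all".
# ENOTSUP and EOPNOTSUPP share one value on Linux; both are listed for clarity.
IGNORABLE = {errno.ENOTSUP, errno.EOPNOTSUPP}


def preserve_xattr(set_fn, path, name, value):
    """Try to set one xattr: True if set, False if unsupported, else re-raise."""
    try:
        set_fn(path, name, value)
        return True
    except OSError as exc:
        if exc.errno in IGNORABLE:
            return False
        raise


def unsupported(path, name, value):
    # Stand-in for a filesystem without ACL support.
    raise OSError(errno.EOPNOTSUPP, "Operation not supported")


def invalid(path, name, value):
    # Stand-in for the behaviour reported in this thread.
    raise OSError(errno.EINVAL, "Invalid argument")


print(preserve_xattr(unsupported, "/beast/f", "system.posix_acl_access", b""))
```

With the first setter the attribute is skipped silently; with the second the error propagates to the caller, which would fit a complaint being printed for every file copied.
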
> >> >>
> >> >> See attached strace.
> >> >>
> >> >> Thanks,
> >> >>
> >> >> Brent
> >> >>
> >> >> PS At least in this simple case where glusterfs is directly mounting a
> >> >> storage/posix, NFS reexport works fine. I haven't had a chance to test a
> >> >> full setup with recent GlusterFS tlas, but I will once the ACL glitch is
> >> >> squashed.
> >> >>
> >> >> On Wed, 18 Jul 2007, Anand Avati wrote:
> >> >>
> >> >>> Brent,
> >> >>> very interesting diagnosis! is it possible for you to re-create the
> >> >>> 'posix only' setup (no server/client) and again do 'strace ls -ial /beast'?
> >> >>> we are not able to reproduce this error at our setup.
> >> >>>
> >> >>> thanks
> >> >>> avati
> >> >>>
> >> >>> 2007/7/17, Brent A Nelson <brent at phys.ufl.edu>:
> >> >>>>
> >> >>>> Just a quick note that this doesn't seem to be any sort of corruption
> >> >>>> issue.  I completely emptied all my shares (even removing lost+found) and
> >> >>>> my namespace and rsynced the corresponding AFR shares and namespace.  The
> >> >>>> only thing different between the AFRs would be ctimes.
> >> >>>>
> >> >>>> I restarted everything, and did:
> >> >>>> ls -al /beast
> >> >>>> ls: /beast: File exists
> >> >>>> ls: /beast/.: File exists
> >> >>>> total 8
> >> >>>> drwxr-xr-x  2 root root 4096 2007-07-17 09:27 .
> >> >>>> drwxr-xr-x 27 root root 4096 2007-07-02 10:18 ..
> >> >>>>
> >> >>>> I also tried disabling readahead and writebehind (my only performance
> >> >>>> translators).  It didn't help.  Changing the unify from alu to rr also
> >> >>>> didn't help.
> >> >>>>
> >> >>>> I then tried "glusterfs -f /etc/glusterfs/beast -n mirror0 /beast" to
> >> >>>> mount a single AFR, no unify.  It STILL produces the same messages.
> >> >>>>
> >> >>>> I then tried "glusterfs -f /etc/glusterfs/beast -n share0-0 /beast" to
> >> >>>> mount a simple, single share used as half of an AFR.  Same issue.
> >> >>>>
> >> >>>> I then stripped down a server to serve out one single storage/posix
> >> >>>> share, with no posix locks (I wasn't using any other translators on the
> >> >>>> server side, apart from protocol/server, of course).  I mounted that share
> >> >>>> as in the previous attempt.  No difference!
> >> >>>>
> >> >>>> So, this issue occurs even with just protocol/client, protocol/server,
> >> >>>> and storage/posix in use.  As barebones as you can get.  Almost.
> >> >>>>
> >> >>>> One more try.  No glusterfsd, and glusterfs accesses a single
> >> >>>> storage/posix directly:
> >> >>>>
> >> >>>> ls -al /beast
> >> >>>> ls: /beast: File exists
> >> >>>> ls: /beast/.: File exists
> >> >>>> total 8
> >> >>>> drwxr-xr-x  2 root root 4096 2007-07-17 09:27 .
> >> >>>> drwxr-xr-x 27 root root 4096 2007-07-02 10:18 ..
> >> >>>>
> >> >>>> No difference, even with just glusterfs directly accessing a single,
> >> >>>> local storage/posix, with no other translators.  Spec is simply:
> >> >>>>
> >> >>>> volume share0
> >> >>>>    type storage/posix                   # POSIX FS translator
> >> >>>>    option directory /share0             # Export this directory
> >> >>>> end-volume
> >> >>>>
> >> >>>> Ubuntu Feisty, Fuse 2.6.3.
> >> >>>>
> >> >>>> Any ideas?
> >> >>>>
> >> >>>> Thanks,
> >> >>>>
> >> >>>> Brent
> >> >>>>
> >> >>>>
> >> >>>> On Sat, 14 Jul 2007, Brent A Nelson wrote:
> >> >>>>
> >> >>>> > It's the same spec I was using previously (AFRed namespace cache,
> >> >>>> > unified AFRs spread across four servers, posix-locks, readahead, and
> >> >>>> > writebehind).  It's not just the top-level directory; it's everywhere.
> >> >>>> >
> >> >>>> > Thanks,
> >> >>>> >
> >> >>>> > Brent
> >> >>>> >
> >> >>>> > On Sat, 14 Jul 2007, Anand Avati wrote:
> >> >>>> >
> >> >>>> >> Brent,
> >> >>>> >> this is strange, we are having patch-313 work pretty smooth so far.
> >> >>>> >> are there any changes in your spec? is this behaviour seen only in this
> >> >>>> >> particular directory or 'anywhere' in general? please attach your spec
> >> >>>> >> so that we can try to reproduce it in our labs.
> >> >>>> >>
> >> >>>> >> thanks,
> >> >>>> >> avati
> >> >>>> >>
> >> >>>> >> 2007/7/14, Brent A Nelson <brent at phys.ufl.edu>:
> >> >>>> >>>
> >> >>>> >>> Updating to the latest TLA patch, I got odd issues just with "ls":
> >> >>>> >>>
> >> >>>> >>> Example:
> >> >>>> >>>
> >> >>>> >>> ls -al /beast/
> >> >>>> >>> ls: /beast/: No such file or directory
> >> >>>> >>> ls: /beast/.: No such file or directory
> >> >>>> >>> ls: /beast/lost+found: No such file or directory
> >> >>>> >>> ls: /beast/usr0: No such file or directory
> >> >>>> >>> ls: /beast/usr: No such file or directory
> >> >>>> >>> total 32
> >> >>>> >>> drwxr-xr-x  5 root root  4096 2007-07-13 16:18 .
> >> >>>> >>> drwxr-xr-x 27 root root  4096 2007-06-25 18:34 ..
> >> >>>> >>> drwx------  2 root root 16384 2007-06-25 17:08 lost+found
> >> >>>> >>> drwxr-xr-x 10 root root  4096 2007-06-18 13:31 usr
> >> >>>> >>> drwxr-xr-x 10 root root  4096 2007-06-18 13:31 usr0
> >> >>>> >>>
> >> >>>> >>> I have one machine that is no longer returning from an "ls".  I get
> >> >>>> >>> other messages sometimes, not just "No such file or directory", but also
> >> >>>> >>> "Bad file descriptor" or even "File exists".  These extraneous messages
> >> >>>> >>> are also occurring when copying from the GlusterFS to the GlusterFS.  The
> >> >>>> >>> files and directories mentioned do, in fact, exist, no matter what the
> >> >>>> >>> extraneous error message says.
> >> >>>> >>>
> >> >>>> >>> Is there a known issue with the current patchset?
> >> >>>> >>>
> >> >>>> >>> Thanks,
> >> >>>> >>>
> >> >>>> >>> Brent
> >> >>>> >>>
> >> >>>> >>>
> >> >>>> >>>
> >> >>>> >>
> >> >>>> >>
> >> >>>> >>
> >> >>>> >> --
> >> >>>> >> Anand V. Avati
> >> >>>> >>
> >> >>>> >
> >> >>>>
> >> >>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> Anand V. Avati
> >> >
> >>
> >
> >
> >
> > --
> > Anand V. Avati
> >
>



-- 
Anand V. Avati


