[Gluster-devel] bug with TLA 313?

Anand Avati avati at zresearch.com
Fri Jul 20 22:51:27 UTC 2007


337 fixes getfacl.

thanks for your patience!
avati

2007/7/21, Brent A Nelson <brent at phys.ufl.edu>:
>
> FYI, patch 336 also still has the issue.
>
> Thanks,
>
> Brent
>
> On Fri, 20 Jul 2007, Brent A Nelson wrote:
>
> > I tried patch 335 just before I got your message, as the description
> > sounded promising.  You were probably referring to a patch to follow
> > that, but, just in case, I wanted to let you know that the issue is
> > still present in TLA 335.
> >
> > Thanks,
> >
> > Brent
> >
> > On Fri, 20 Jul 2007, Anand Babu Periasamy wrote:
> >
> >> Bug was fixed only for setfacl. Avati is fixing it for getfacl too.
> >> Next patch level will fix your issue.
> >> --
> >> Anand Babu Periasamy
> >>
> >> Brent A Nelson writes:
> >>
> >>> I also get the following in the glusterfs.log:
> >>> 2007-07-20 17:33:04 E [afr.c:574:afr_getxattr_cbk] mirror2:
> >>> (path=/glusterfs/glusterfs-server.vol child=share2-0) op_ret=43
> >>> op_errno=2
> >>> 2007-07-20 17:33:04 E [afr.c:574:afr_getxattr_cbk] mirror1:
> >>> (path=/glusterfs/beast child=share1-1) op_ret=43 op_errno=2
> >>> 2007-07-20 17:33:04 E [afr.c:574:afr_getxattr_cbk] mirror3:
> >>> (path=/glusterfs/glusterfs-client.vol.sample child=share3-1) op_ret=43
> >>> op_errno=2
> >>> 2007-07-20 17:33:04 E [afr.c:574:afr_getxattr_cbk] mirror7:
> >>> (path=/glusterfs/glusterfs-server.vol.sample child=share7-1) op_ret=43
> >>> op_errno=2
> >>> 2007-07-20 17:33:04 E [afr.c:574:afr_getxattr_cbk] mirror0:
> >>> (path=/glusterfs/share0 child=share0-0) op_ret=43 op_errno=2
> >>> 2007-07-20 17:33:04 E [afr.c:574:afr_getxattr_cbk] mirror1:
> >>> (path=/glusterfs child=share1-1) op_ret=-1 op_errno=61
> >>> 2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror7:
> >>> (path=/glusterfs2 child=share7-1) op_ret=-1 op_errno=95
> >>> 2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror5:
> >>> (path=/glusterfs2 child=share5-1) op_ret=-1 op_errno=95
> >>> 2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror6:
> >>> (path=/glusterfs2 child=share6-1) op_ret=-1 op_errno=95
> >>> 2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror4:
> >>> (path=/glusterfs2 child=share4-1) op_ret=-1 op_errno=95
> >>> 2007-07-20 17:33:04 E [afr.c:574:afr_getxattr_cbk] mirror1:
> >>> (path=/glusterfs child=share1-1) op_ret=-1 op_errno=61
> >>> 2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror7:
> >>> (path=/glusterfs2 child=share7-1) op_ret=-1 op_errno=95
> >>> 2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror5:
> >>> (path=/glusterfs2 child=share5-1) op_ret=-1 op_errno=95
> >>> 2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror6:
> >>> (path=/glusterfs2 child=share6-1) op_ret=-1 op_errno=95
> >>> 2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror4:
> >>> (path=/glusterfs2 child=share4-1) op_ret=-1 op_errno=95
> >>>
> >>> Thanks,
> >>>
> >>> Brent
> >>>
> >>> On Fri, 20 Jul 2007, Brent A Nelson wrote:
> >>>
> >>>> Copying from a local filesystem to the GlusterFS now works without
> >>>> issue, but copying from the GlusterFS to the GlusterFS still
> >>>> complains.  See attached strace.
> >>>>
> >>>> Note that my local filesystem is not mounted with the acl option, but
> >>>> the underlying mounts that make up my GlusterFS do have the acl mount
> >>>> option.
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Brent
> >>>>
> >>>> PS Are these fixes actually enabling support for ACLs? If they are,
> >>>> that's very cool and well ahead of the roadmap!
> >>>>
> >>>> On Sat, 21 Jul 2007, Anand Avati wrote:
> >>>>
> >>>>> Brent,
> >>>>> there was a bug in setxattr: the length was being miscalculated (by -1)
> >>>>> for binary (non-ASCII) values. Can you please check if your cp goes
> >>>>> through now? I'm very sorry that we are unable to test this ourselves,
> >>>>> since we don't have a system which uses POSIX ACLs, though xattrs are
> >>>>> now working fine on binary data (before the fix they worked only for
> >>>>> pure ASCII data).
> >>>>>
> >>>>> thanks,
> >>>>> avati
> >>>>>
> >>>>> 2007/7/20, Brent A Nelson <brent at phys.ufl.edu>:
> >>>>>>
> >>>>>> Nope, it's still there.  Example strace snippet:
> >>>>>>
> >>>>>> setxattr("/beast/glusterfs/beast", "system.posix_acl_access",
> >>>>>> "\x02\x00\x00\x00\x01\x00\x06\x00\xff\xff\xff\xff\x04\x00\x04\x00\xff\xff\xff\xff
> >>>>>> \x00\x04\x00\xff\xff\xff\xff", 28, 0) = -1 EINVAL (Invalid argument)
> >>>>>>
> >>>>>> It presumably should have returned EOPNOTSUPP (Operation not
> >>>>>> supported),
> >>>>>> instead.
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Brent
> >>>>>>
> >>>>>> On Fri, 20 Jul 2007, Anand Avati wrote:
> >>>>>>
> >>>>>> > Brent,
> >>>>>> > there was a fix in fuse_setxattr in patch-325. Please check if it
> >>>>>> > fixes your issue. AFR was only reporting the errnos passed through it.
> >>>>>> >
> >>>>>> > thanks,
> >>>>>> > avati
> >>>>>> >
> >>>>>> > 2007/7/20, Brent A Nelson <brent at phys.ufl.edu>:
> >>>>>> >>
> >>>>>> >> I should point out that this was with the full (AFR/unify) setup,
> >>>>>> >> not the stripped-down setup.  I also get a lot of messages such as
> >>>>>> >> the following in /var/log/glusterfs/glusterfs.log:
> >>>>>> >> 2007-07-19 15:19:28 E [afr.c:514:afr_setxattr_cbk] mirror4:
> >>>>>> >> (path=/usr0 child=share4-0) op_ret=-1 op_errno=22
> >>>>>> >> 2007-07-19 15:19:28 E [afr.c:514:afr_setxattr_cbk] mirror0:
> >>>>>> >> (path=/usr0 child=share0-0) op_ret=-1 op_errno=22
> >>>>>> >> 2007-07-19 15:57:17 E [afr.c:575:afr_getxattr_cbk] mirror7:
> >>>>>> >> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1
> >>>>>> >> op_errno=61
> >>>>>> >> 2007-07-19 15:57:17 E [afr.c:575:afr_getxattr_cbk] mirror7:
> >>>>>> >> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1
> >>>>>> >> op_errno=61
> >>>>>> >> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror7:
> >>>>>> >> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1
> >>>>>> >> op_errno=61
> >>>>>> >> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror7:
> >>>>>> >> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1
> >>>>>> >> op_errno=61
> >>>>>> >> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror6:
> >>>>>> >> (path=/nfs/share/locale/cs child=share6-0) op_ret=-1 op_errno=61
> >>>>>> >> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror6:
> >>>>>> >> (path=/nfs/share/locale/cs child=share6-0) op_ret=-1 op_errno=61
> >>>>>> >> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror7:
> >>>>>> >> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1
> >>>>>> >> op_errno=61
> >>>>>> >> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror7:
> >>>>>> >> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1
> >>>>>> >> op_errno=61
> >>>>>> >>
> >>>>>> >> Thanks,
> >>>>>> >>
> >>>>>> >> Brent
> >>>>>> >>
> >>>>>> >> On Thu, 19 Jul 2007, Brent A Nelson wrote:
> >>>>>> >>
> >>>>>> >> > Patch 322 seems to have fixed the stray ls errors, but not the
> >>>>>> >> > cp -a complaints.  A "cp -a" strace is attached.
> >>>>>> >> >
> >>>>>> >> > Thanks,
> >>>>>> >> >
> >>>>>> >> > Brent
> >>>>>> >> >
> >>>>>> >> > On Wed, 18 Jul 2007, Brent A Nelson wrote:
> >>>>>> >> >
> >>>>>> >> >> Aha, it looks like GlusterFS is giving odd/varying error
> >>>>>> >> >> responses to queries for ACL information (I assume it should be
> >>>>>> >> >> giving an "operation not supported" error).  This must be related
> >>>>>> >> >> to my previously reported problem copying from GlusterFS to
> >>>>>> >> >> GlusterFS where it was complaining about preserving ACLs for
> >>>>>> >> >> every file copied.
> >>>>>> >> >>
> >>>>>> >> >> See attached strace.
> >>>>>> >> >>
> >>>>>> >> >> Thanks,
> >>>>>> >> >>
> >>>>>> >> >> Brent
> >>>>>> >> >>
> >>>>>> >> >> PS At least in this simple case where glusterfs is directly
> >>>>>> >> >> mounting a storage/posix, NFS reexport works fine. I haven't had
> >>>>>> >> >> a chance to test a full setup with recent GlusterFS tlas, but I
> >>>>>> >> >> will once the ACL glitch is squashed.
> >>>>>> >> >>
> >>>>>> >> >> On Wed, 18 Jul 2007, Anand Avati wrote:
> >>>>>> >> >>
> >>>>>> >> >>> Brent,
> >>>>>> >> >>> very interesting diagnosis! Is it possible for you to re-create
> >>>>>> >> >>> the 'posix only' setup (no server/client) and again do
> >>>>>> >> >>> 'strace ls -ial /beast'? We are not able to reproduce this error
> >>>>>> >> >>> on our setup.
> >>>>>> >> >>>
> >>>>>> >> >>> thanks
> >>>>>> >> >>> avati
> >>>>>> >> >>>
> >>>>>> >> >>> 2007/7/17, Brent A Nelson <brent at phys.ufl.edu>:
> >>>>>> >> >>>>
> >>>>>> >> >>>> Just a quick note that this doesn't seem to be any sort of
> >>>>>> >> >>>> corruption issue.  I completely emptied all my shares (even
> >>>>>> >> >>>> removing lost+found) and my namespace and rsynced the
> >>>>>> >> >>>> corresponding AFR shares and namespace.  The only thing
> >>>>>> >> >>>> different between the AFRs would be ctimes.
> >>>>>> >> >>>>
> >>>>>> >> >>>> I restarted everything, and did:
> >>>>>> >> >>>> ls -al /beast
> >>>>>> >> >>>> ls: /beast: File exists
> >>>>>> >> >>>> ls: /beast/.: File exists
> >>>>>> >> >>>> total 8
> >>>>>> >> >>>> drwxr-xr-x  2 root root 4096 2007-07-17 09:27 .
> >>>>>> >> >>>> drwxr-xr-x 27 root root 4096 2007-07-02 10:18 ..
> >>>>>> >> >>>>
> >>>>>> >> >>>> I also tried disabling readahead and writebehind (my only
> >>>>>> >> >>>> performance translators).  It didn't help.  Changing the unify
> >>>>>> >> >>>> from alu to rr also didn't help.
> >>>>>> >> >>>>
> >>>>>> >> >>>> I then tried "glusterfs -f /etc/glusterfs/beast -n mirror0 /beast"
> >>>>>> >> >>>> to mount a single AFR, no unify.  It STILL produces the same
> >>>>>> >> >>>> messages.
> >>>>>> >> >>>>
> >>>>>> >> >>>> I then tried "glusterfs -f /etc/glusterfs/beast -n share0-0 /beast"
> >>>>>> >> >>>> to mount a simple, single share used as half of an AFR.  Same
> >>>>>> >> >>>> issue.
> >>>>>> >> >>>>
> >>>>>> >> >>>> I then stripped down a server to serve out one single
> >>>>>> >> >>>> storage/posix share, with no posix locks (I wasn't using any
> >>>>>> >> >>>> other translators on the server side, apart from
> >>>>>> >> >>>> protocol/server, of course).  I mounted that share as in the
> >>>>>> >> >>>> previous attempt.  No difference!
> >>>>>> >> >>>>
> >>>>>> >> >>>> So, this issue occurs even with just protocol/client,
> >>>>>> >> >>>> protocol/server, and storage/posix in use.  As barebones as you
> >>>>>> >> >>>> can get.  Almost.
> >>>>>> >> >>>>
> >>>>>> >> >>>> One more try.  No glusterfsd, and glusterfs accesses a single
> >>>>>> >> >>>> storage/posix directly:
> >>>>>> >> >>>>
> >>>>>> >> >>>> ls -al /beast
> >>>>>> >> >>>> ls: /beast: File exists
> >>>>>> >> >>>> ls: /beast/.: File exists
> >>>>>> >> >>>> total 8
> >>>>>> >> >>>> drwxr-xr-x  2 root root 4096 2007-07-17 09:27 .
> >>>>>> >> >>>> drwxr-xr-x 27 root root 4096 2007-07-02 10:18 ..
> >>>>>> >> >>>>
> >>>>>> >> >>>> No difference, even with just glusterfs directly accessing a
> >>>>>> >> >>>> single, local storage/posix, with no other translators.  Spec is
> >>>>>> >> >>>> simply:
> >>>>>> >> >>>>
> >>>>>> >> >>>> volume share0
> >>>>>> >> >>>>    type storage/posix                   # POSIX FS translator
> >>>>>> >> >>>>    option directory /share0             # Export this directory
> >>>>>> >> >>>> end-volume
> >>>>>> >> >>>>
> >>>>>> >> >>>> Ubuntu Feisty, Fuse 2.6.3.
> >>>>>> >> >>>>
> >>>>>> >> >>>> Any ideas?
> >>>>>> >> >>>>
> >>>>>> >> >>>> Thanks,
> >>>>>> >> >>>>
> >>>>>> >> >>>> Brent
> >>>>>> >> >>>>
> >>>>>> >> >>>>
> >>>>>> >> >>>> On Sat, 14 Jul 2007, Brent A Nelson wrote:
> >>>>>> >> >>>>
> >>>>>> >> >>>> > It's the same spec I was using previously (AFRed namespace
> >>>>>> >> >>>> > cache, unified AFRs spread across four servers, posix-locks,
> >>>>>> >> >>>> > readahead, and writebehind).  It's not just the top-level
> >>>>>> >> >>>> > directory; it's everywhere.
> >>>>>> >> >>>> >
> >>>>>> >> >>>> > Thanks,
> >>>>>> >> >>>> >
> >>>>>> >> >>>> > Brent
> >>>>>> >> >>>> >
> >>>>>> >> >>>> > On Sat, 14 Jul 2007, Anand Avati wrote:
> >>>>>> >> >>>> >
> >>>>>> >> >>>> >> Brent,
> >>>>>> >> >>>> >> this is strange; patch-313 has been working pretty smoothly for
> >>>>>> >> >>>> >> us so far. Are there any changes in your spec? Is this behaviour
> >>>>>> >> >>>> >> seen only in this particular directory or 'anywhere' in general?
> >>>>>> >> >>>> >> Please attach your spec so that we can try to reproduce it in
> >>>>>> >> >>>> >> our labs.
> >>>>>> >> >>>> >>
> >>>>>> >> >>>> >> thanks,
> >>>>>> >> >>>> >> avati
> >>>>>> >> >>>> >>
> >>>>>> >> >>>> >> 2007/7/14, Brent A Nelson <brent at phys.ufl.edu>:
> >>>>>> >> >>>> >>>
> >>>>>> >> >>>> >>> Updating to the latest TLA patch, I got odd issues just with
> >>>>>> >> >>>> >>> "ls":
> >>>>>> >> >>>> >>>
> >>>>>> >> >>>> >>> Example:
> >>>>>> >> >>>> >>>
> >>>>>> >> >>>> >>> ls -al /beast/
> >>>>>> >> >>>> >>> ls: /beast/: No such file or directory
> >>>>>> >> >>>> >>> ls: /beast/.: No such file or directory
> >>>>>> >> >>>> >>> ls: /beast/lost+found: No such file or directory
> >>>>>> >> >>>> >>> ls: /beast/usr0: No such file or directory
> >>>>>> >> >>>> >>> ls: /beast/usr: No such file or directory
> >>>>>> >> >>>> >>> total 32
> >>>>>> >> >>>> >>> drwxr-xr-x  5 root root  4096 2007-07-13 16:18 .
> >>>>>> >> >>>> >>> drwxr-xr-x 27 root root  4096 2007-06-25 18:34 ..
> >>>>>> >> >>>> >>> drwx------  2 root root 16384 2007-06-25 17:08 lost+found
> >>>>>> >> >>>> >>> drwxr-xr-x 10 root root  4096 2007-06-18 13:31 usr
> >>>>>> >> >>>> >>> drwxr-xr-x 10 root root  4096 2007-06-18 13:31 usr0
> >>>>>> >> >>>> >>>
> >>>>>> >> >>>> >>> I have one machine that is no longer returning from an "ls".
> >>>>>> >> >>>> >>> I get other messages sometimes, not just "No such file or
> >>>>>> >> >>>> >>> directory", but also "Bad file descriptor" or even "File
> >>>>>> >> >>>> >>> exists".  These extraneous messages are also occurring when
> >>>>>> >> >>>> >>> copying from the GlusterFS to the GlusterFS.  The files and
> >>>>>> >> >>>> >>> directories mentioned do, in fact, exist, no matter what the
> >>>>>> >> >>>> >>> extraneous error message says.
> >>>>>> >> >>>> >>>
> >>>>>> >> >>>> >>> Is there a known issue with the current patchset?
> >>>>>> >> >>>> >>>
> >>>>>> >> >>>> >>> Thanks,
> >>>>>> >> >>>> >>>
> >>>>>> >> >>>> >>> Brent
> >>>>>> >> >>>> >>>
> >>>>>> >> >>>> >>>
> >>>>>> >> >>>> >>> _______________________________________________
> >>>>>> >> >>>> >>> Gluster-devel mailing list
> >>>>>> >> >>>> >>> Gluster-devel at nongnu.org
> >>>>>> >> >>>> >>> http://lists.nongnu.org/mailman/listinfo/gluster-devel
> >>>>>> >> >>>> >>>
> >>>>>> >> >>>> >>
> >>>>>> >> >>>> >>
> >>>>>> >> >>>> >>
> >>>>>> >> >>>> >> --
> >>>>>> >> >>>> >> Anand V. Avati
> >>>>>> >> >>>> >>
> >>>>>> >> >>>> >
> >>>>>> >> >>>>
> >>>>>> >> >>>
> >>>>>> >> >>>
> >>>>>> >> >>>
> >>>>>> >> >>> --
> >>>>>> >> >>> Anand V. Avati
> >>>>>> >> >
> >>>>>> >>
> >>>>>> >
> >>>>>> >
> >>>>>> >
> >>>>>> > --
> >>>>>> > Anand V. Avati
> >>>>>> >
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Anand V. Avati
> >>>>
> >>>
> >>>
> >>
> >
>
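
The glusterfs.log excerpts quoted in this thread report bare op_errno numbers
(2, 22, 61, 95). As a reading aid, here is a minimal Python sketch that pulls
the fields out of one such entry and names the errno. It assumes Linux errno
numbering (the thread mentions Ubuntu Feisty); the table, the regular
expression and the decode() helper are illustrative and not part of
GlusterFS. It also assumes the usual convention that a non-negative op_ret
means the call succeeded, in which case op_errno may simply be stale.

```python
import re

# Linux errno numbers that appear in the log excerpts in this thread.
# Assumption: Linux numbering; the values differ on other platforms.
LINUX_ERRNO = {
    2: ("ENOENT", "No such file or directory"),
    22: ("EINVAL", "Invalid argument"),
    61: ("ENODATA", "No data available"),
    95: ("EOPNOTSUPP", "Operation not supported"),
}

# Matches an AFR callback entry as quoted in the thread. \s+ also spans
# the line break that the log puts in the middle of an entry.
LOG_RE = re.compile(
    r"\[afr\.c:\d+:(?P<func>\w+)\]\s+(?P<xlator>[^\s:]+):\s+"
    r"\(path=(?P<path>\S+)\s+child=(?P<child>[^\s)]+)\)\s+"
    r"op_ret=(?P<op_ret>-?\d+)\s+op_errno=(?P<op_errno>\d+)"
)


def decode(entry):
    """Return the fields of one log entry as a dict, or None if no match."""
    m = LOG_RE.search(entry)
    if m is None:
        return None
    op_ret = int(m.group("op_ret"))
    op_errno = int(m.group("op_errno"))
    name, text = LINUX_ERRNO.get(op_errno, ("UNKNOWN", "not in table"))
    return {
        "func": m.group("func"),
        "xlator": m.group("xlator"),
        "path": m.group("path"),
        "child": m.group("child"),
        "op_ret": op_ret,
        "failed": op_ret < 0,  # by convention only negative returns are errors
        "errno_name": name,
        "errno_text": text,
    }


samples = [
    "2007-07-20 17:33:04 E [afr.c:574:afr_getxattr_cbk] mirror1:\n"
    "(path=/glusterfs child=share1-1) op_ret=-1 op_errno=61",
    "2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror7:\n"
    "(path=/glusterfs2 child=share7-1) op_ret=-1 op_errno=95",
    "2007-07-20 17:33:04 E [afr.c:574:afr_getxattr_cbk] mirror0:\n"
    "(path=/glusterfs/share0 child=share0-0) op_ret=43 op_errno=2",
]

for sample in samples:
    d = decode(sample)
    print(d["func"], d["path"], d["op_ret"], d["failed"], d["errno_name"])
```

Read this way, the op_errno=95 entries are the "Operation not supported"
answer Brent expected, op_errno=22 is the EINVAL seen in his strace, and the
op_ret=43 entries look like successful calls that were still logged at error
level.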



-- 
Anand V. Avati
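
For reference, the value passed to setxattr in the strace quoted in this
thread can be decoded with a short Python sketch. It follows the Linux
on-disk layout for system.posix_acl_access (a little-endian 32-bit version,
then 8-byte entries of tag, permissions and id). One byte is an assumption:
the third entry in the mail starts at a line wrap, and the sketch takes the
missing tag byte to be a space (0x20, ACL_OTHER), which is what brings the
value to the 28 bytes strace reported. The sketch also shows why any length
taken from a NUL-terminated string would be wrong for such a value; it does
not claim that this is exactly what the setxattr fix changed.

```python
import struct

# Tag values from the Linux POSIX ACL xattr layout.
ACL_TAGS = {
    0x01: "ACL_USER_OBJ",
    0x02: "ACL_USER",
    0x04: "ACL_GROUP_OBJ",
    0x08: "ACL_GROUP",
    0x10: "ACL_MASK",
    0x20: "ACL_OTHER",
}


def parse_posix_acl(value):
    """Split a system.posix_acl_access value into (version, entries)."""
    if len(value) < 4 or (len(value) - 4) % 8 != 0:
        raise ValueError("unexpected ACL xattr length: %d" % len(value))
    (version,) = struct.unpack_from("<I", value, 0)
    entries = []
    for offset in range(4, len(value), 8):
        # Each entry: 16-bit tag, 16-bit permission bits, 32-bit id.
        tag, perm, ident = struct.unpack_from("<HHI", value, offset)
        entries.append((ACL_TAGS.get(tag, hex(tag)), perm, ident))
    return version, entries


# Value from the strace in this thread. ASSUMPTION: the first byte of the
# third entry is 0x20 (a space lost at the mail line wrap).
value = (
    b"\x02\x00\x00\x00"
    b"\x01\x00\x06\x00\xff\xff\xff\xff"
    b"\x04\x00\x04\x00\xff\xff\xff\xff"
    b"\x20\x00\x04\x00\xff\xff\xff\xff"
)

version, entries = parse_posix_acl(value)

# Length a NUL-terminated string routine would report: it stops at the
# first zero byte, long before the real end of the value.
c_string_length = value.index(b"\x00")

print("real length:", len(value))
print("NUL-terminated length:", c_string_length)
print("version:", version)
for tag, perm, ident in entries:
    print(tag, oct(perm), hex(ident))
```

Under that assumption the value is the minimal ACL for a mode 0644 file
(owner rw-, group r--, other r--), which is what "cp -a" would try to
preserve on a file with no extended ACL entries.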


