[Bugs] [Bug 1196033] New: directory ownership says root as owner ship when the directories are created in parallel on two different mounts

bugzilla at redhat.com bugzilla at redhat.com
Wed Feb 25 06:55:31 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1196033

            Bug ID: 1196033
           Summary: directory ownership says root as owner ship when the
                    directories are created in parallel on two different
                    mounts
           Product: Red Hat Storage
           Version: 3.0
         Component: gluster-dht
          Severity: high
          Assignee: rhs-bugs at redhat.com
          Reporter: racpatel at redhat.com
        QA Contact: storage-qa-internal at redhat.com
                CC: bugs at gluster.org, gluster-bugs at redhat.com,
                    pauyeung at shopzilla.com, pkarampu at redhat.com
        Depends On: 1138386



version:-
glusterfs-3.6.0.45-1


+++ This bug was initially created as a clone of Bug #1138386 +++

Description of problem:
Mail from Peter:
I have a replicated Gluster setup, 2 servers (fs-1 and fs-2) x 1 brick.  I have
two clients (also on fs-1 and fs-2) which mount the Gluster volume at /mnt/gfs
(/mnt/gfs type fuse.glusterfs
(rw,default_permissions,allow_other,max_read=131072)).  These clients have
scripts which perform various file operations.  One operation they perform
looks like this (note this is pseudocode, the actual script is PHP):

1. @mkdir('/mnt/gfs/somedir', 0550);
2. chown('/mnt/gfs/somedir', 1234);
3. chgrp('/mnt/gfs/somedir', 1234);

Note that line 1 may fail on either client because the directory may already
have been created by the other client.  These errors are suppressed/ignored.
When this operation is performed simultaneously on both clients, it usually
succeeds in creating a directory with the expected permissions and ownership.
Intermittently, however, we see that these directories are not owned by the
expected user and group.
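
The three steps above can be sketched in Python as follows. This is only an
illustration: the reporter's actual script is PHP, and the helper name
ensure_owned_dir and its parameters are hypothetical. As in the pseudocode,
the "already exists" error from mkdir is ignored and ownership is then set in
two separate calls.

```python
import os


def ensure_owned_dir(path, uid, gid, mode=0o550):
    """Create `path` if needed, then set its owner and group.

    Hypothetical helper mirroring the three pseudocode steps; it is not
    the reporter's script.
    """
    try:
        os.mkdir(path, mode)      # step 1: may lose the race to the other client
    except FileExistsError:
        pass                      # suppressed, like PHP's @mkdir
    os.chown(path, uid, -1)       # step 2: change the owner only
    os.chown(path, -1, gid)       # step 3: change the group only (chgrp)
    st = os.stat(path)
    return st.st_uid, st.st_gid   # ownership as seen right after the calls
```

On a filesystem that behaves as expected, the returned pair equals the
requested (uid, gid). Changing ownership to another user requires the
appropriate privileges.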

I've created a PHP script which can be run on two clients simultaneously to
reproduce the error: https://gist.github.com/pdrakeweb/ae046b4c70a42309be43

The only log entry I can find that appears to be related is from fs-1's
mnt-gfs.log file:

[2014-08-22 12:27:57.661778] I [dht-layout.c:640:dht_layout_normalize]
0-test-fs-cluster-1-dht: found anomalies in /test-target/test1408710477.7.
holes=1 overlaps=0

This occurs in both Gluster 3.4.1 and 3.5.2 (the only two versions I have
tested for this).  I am unable to reproduce the problem on a local
(non-gluster) filesystem.  I'd appreciate any insight people might have into
what is going on here and whether this is a bug in Gluster.


--- Additional comment from Pranith Kumar K on 2014-09-04 13:18:43 EDT ---

I am able to reproduce the bug consistently. Disabling stat-prefetch reduced
how often the errors occur, but it did not eliminate the issue.

The strace output was interesting. The failure always seems to be caused by a
uid mismatch:
stat("/mnt/fuse1/test-target/test1409848960.3", {st_dev=makedev(0, 41),
st_ino=12165775161408537538, st_mode=S_IFDIR|0550, st_nlink=2, *st_uid=0*,
st_gid=9999, st_blksize=131072, st_blocks=1, st_size=6,
st_atime=2014/09/04-22:12:40, st_mtime=2014/09/04-22:12:40,
st_ctime=2014/09/04-22:12:40}) = 0

The uid comes back as 0 while the gid is 9999. A stat done after the run is
over shows the correct ownership.
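
A check like the one visible in the strace output can be sketched in Python as
follows. The helper name ownership_matches is hypothetical and is not part of
the reproducer script. It compares st_uid and st_gid from a single stat call
against the expected values, so calling it once during the run and once after
the run would separate a transient mismatch from a persistent one.

```python
import os


def ownership_matches(path, expected_uid, expected_gid):
    """Return (ok, actual_uid, actual_gid) based on one stat() of `path`."""
    st = os.stat(path)
    ok = st.st_uid == expected_uid and st.st_gid == expected_gid
    return ok, st.st_uid, st.st_gid
```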

--- Additional comment from Pranith Kumar K on 2014-09-04 13:27:57 EDT ---

In my tests, the issue does not happen on plain distribute, or on replicate
with no distribute in the graph. I am not sure why it happens only with
dht+afr. I will update the bug once I find more.

--- Additional comment from Pranith Kumar K on 2014-09-05 06:17:44 EDT ---

RCA for the bug:
Mount-1: Creates a new directory; uid:gid is 0:0.
Mount-2: Tries to create a new directory; fails with EEXIST.
Mount-2: Does chown with uid 9999; uid:gid afterwards is 9999:0.
Mount-1: Needs to set the dht layout, so it triggers self-heal; as part of
that, it sets uid:gid back to 0:0.
Mount-2: Does chown with gid 9999; uid:gid afterwards is 0:9999.
Mount-2: Gets uid:gid and sees 0:9999 instead of 9999:9999.
Mount-1: Does chown with uid 9999; uid:gid afterwards is 9999:9999.
Mount-1: Does chown with gid 9999; uid:gid afterwards is 9999:9999.

I am not sure what exactly needs to be fixed in dht.
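
The interleaving in the RCA can be replayed with a small in-memory model. This
is a sketch in Python that does not touch Gluster at all: the step table is
transcribed from the RCA, and the names STEPS and replay are hypothetical. It
only shows how one self-heal that resets ownership between the two chown calls
from Mount-2 produces the 0:9999 result.

```python
ROOT = 0
TARGET = 9999

# (mount, action, new uid, new gid); None leaves that field untouched.
STEPS = [
    ("Mount-1", "mkdir as root", ROOT, ROOT),
    ("Mount-2", "mkdir fails with EEXIST", None, None),
    ("Mount-2", "chown uid", TARGET, None),
    ("Mount-1", "dht self-heal resets ownership", ROOT, ROOT),
    ("Mount-2", "chown gid", None, TARGET),
    ("Mount-2", "stat", None, None),
    ("Mount-1", "chown uid", TARGET, None),
    ("Mount-1", "chown gid", None, TARGET),
]


def replay(steps):
    """Apply each step to one shared (uid, gid) pair and record the result."""
    uid = gid = None
    history = []
    for mount, action, new_uid, new_gid in steps:
        if new_uid is not None:
            uid = new_uid
        if new_gid is not None:
            gid = new_gid
        history.append((mount, action, uid, gid))
    return history


if __name__ == "__main__":
    for mount, action, uid, gid in replay(STEPS):
        print(f"{mount}: {action:32s} -> {uid}:{gid}")
```

In this model the stat issued by Mount-2 sees 0:9999, while the final state is
9999:9999, which matches both the strace output and the correct result seen
after the run.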

--- Additional comment from Pranith Kumar K on 2014-09-05 06:18:59 EDT ---

(In reply to Pranith Kumar K from comment #3)
> RCA for the bug:
> Mount-1: Creates a new directory; uid:gid is 0:0.
Mount-2: Tries to create the same directory as above and fails with EEXIST.
All of the following operations happen on this same directory.
> Mount-2: Does chown with uid 9999; uid:gid afterwards is 9999:0.
> Mount-1: Needs to set the dht layout, so it triggers self-heal; as part of
> that, it sets uid:gid back to 0:0.
> Mount-2: Does chown with gid 9999; uid:gid afterwards is 0:9999.
> Mount-2: Gets uid:gid and sees 0:9999 instead of 9999:9999.
> Mount-1: Does chown with uid 9999; uid:gid afterwards is 9999:9999.
> Mount-1: Does chown with gid 9999; uid:gid afterwards is 9999:9999.
> 
> I am not sure what exactly needs to be fixed in dht.


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1138386
[Bug 1138386] directory ownership says root as owner ship when the
directories are created in parallel on two different mounts