[Bugs] [Bug 1398602] New: DHT hash ranges are not distributed across two subvols

bugzilla at redhat.com bugzilla at redhat.com
Fri Nov 25 11:56:59 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1398602

            Bug ID: 1398602
           Summary: DHT hash ranges are not distributed across two subvols
           Product: GlusterFS
           Version: 3.8
         Component: dht2
          Assignee: bugs at gluster.org
          Reporter: shberry at redhat.com
                CC: bugs at gluster.org



Description of problem:

50 files created on a 2 x 3 replica 3 volume with sharding enabled all hash to
just one sub-vol. No files are present on the bricks of the other sub-vol.
The shards, though, are present on both sub-vols under .shard.

Now if a directory is created on the same volume and files are created under
this directory, these files hash to both sub-vols.

Version-Release number of selected component (if applicable):

glusterfs-server-3.8.5-1.el7.x86_64
glusterfs-client-xlators-3.8.5-1.el7.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-libs-3.8.5-1.el7.x86_64
glusterfs-api-3.8.5-1.el7.x86_64
glusterfs-fuse-3.8.5-1.el7.x86_64
glusterfs-3.8.5-1.el7.x86_64
glusterfs-cli-3.8.5-1.el7.x86_64


How reproducible:

It happened twice on my setup.


Steps to Reproduce:

1) Create a 2 x 3 replica 3 volume.
2) Enable features.shard on it.
3) Mount the volume on a FUSE client.
4) Create files on it. The files go to just one sub-volume.
5) Create a directory on the volume and create files under it. These files are
distributed across both sub-vols.

Actual results:

gluster v info

Volume Name: rep3
Type: Distributed-Replicate
Volume ID: 5ec710ea-b3db-4a08-adec-7107b38af08b
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: 172.17.40.13:/bricks/b01/g
Brick2: 172.17.40.14:/bricks/b01/g
Brick3: 172.17.40.15:/bricks/b01/g
Brick4: 172.17.40.16:/bricks/b01/g
Brick5: 172.17.40.22:/bricks/b01/g
Brick6: 172.17.40.24:/bricks/b01/g
Options Reconfigured:
features.shard: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on

As the listing below shows, the files are present only on the first 3 servers
(1st sub-vol), and no files are present on the 2nd sub-vol:

par-for-all.sh svrs.list 'ls /bricks/b01/g/'
log directory is /tmp/par-for-all.6662


--- 172.17.40.13 ---
172.17.10.14.job1.0.0
172.17.10.14.job1.1.0
172.17.10.14.job1.2.0
172.17.10.14.job1.3.0
172.17.10.14.job1.4.0
172.17.10.14.job1.5.0
172.17.10.14.job1.6.0
172.17.10.14.job1.7.0
172.17.10.14.job1.8.0
172.17.10.14.job1.9.0
172.17.10.15.job1.0.0
172.17.10.15.job1.1.0
172.17.10.15.job1.2.0
172.17.10.15.job1.3.0
172.17.10.15.job1.4.0
172.17.10.15.job1.5.0
172.17.10.15.job1.6.0
172.17.10.15.job1.7.0
172.17.10.15.job1.8.0
172.17.10.15.job1.9.0
172.17.10.29.job1.0.0
172.17.10.29.job1.1.0
172.17.10.29.job1.2.0
172.17.10.29.job1.3.0
172.17.10.29.job1.4.0
172.17.10.29.job1.5.0
172.17.10.29.job1.6.0
172.17.10.29.job1.7.0
172.17.10.29.job1.8.0
172.17.10.29.job1.9.0
172.17.10.30.job1.0.0
172.17.10.30.job1.1.0
172.17.10.30.job1.2.0
172.17.10.30.job1.3.0
172.17.10.30.job1.4.0
172.17.10.30.job1.5.0
172.17.10.30.job1.6.0
172.17.10.30.job1.7.0
172.17.10.30.job1.8.0
172.17.10.30.job1.9.0
172.17.10.79.job1.0.0
172.17.10.79.job1.1.0
172.17.10.79.job1.2.0
172.17.10.79.job1.3.0
172.17.10.79.job1.4.0
172.17.10.79.job1.5.0
172.17.10.79.job1.6.0
172.17.10.79.job1.7.0
172.17.10.79.job1.8.0
172.17.10.79.job1.9.0


--- 172.17.40.14 ---
172.17.10.14.job1.0.0
172.17.10.14.job1.1.0
172.17.10.14.job1.2.0
172.17.10.14.job1.3.0
172.17.10.14.job1.4.0
172.17.10.14.job1.5.0
172.17.10.14.job1.6.0
172.17.10.14.job1.7.0
172.17.10.14.job1.8.0
172.17.10.14.job1.9.0
172.17.10.15.job1.0.0
172.17.10.15.job1.1.0
172.17.10.15.job1.2.0
172.17.10.15.job1.3.0
172.17.10.15.job1.4.0
172.17.10.15.job1.5.0
172.17.10.15.job1.6.0
172.17.10.15.job1.7.0
172.17.10.15.job1.8.0
172.17.10.15.job1.9.0
172.17.10.29.job1.0.0
172.17.10.29.job1.1.0
172.17.10.29.job1.2.0
172.17.10.29.job1.3.0
172.17.10.29.job1.4.0
172.17.10.29.job1.5.0
172.17.10.29.job1.6.0
172.17.10.29.job1.7.0
172.17.10.29.job1.8.0
172.17.10.29.job1.9.0
172.17.10.30.job1.0.0
172.17.10.30.job1.1.0
172.17.10.30.job1.2.0
172.17.10.30.job1.3.0
172.17.10.30.job1.4.0
172.17.10.30.job1.5.0
172.17.10.30.job1.6.0
172.17.10.30.job1.7.0
172.17.10.30.job1.8.0
172.17.10.30.job1.9.0
172.17.10.79.job1.0.0
172.17.10.79.job1.1.0
172.17.10.79.job1.2.0
172.17.10.79.job1.3.0
172.17.10.79.job1.4.0
172.17.10.79.job1.5.0
172.17.10.79.job1.6.0
172.17.10.79.job1.7.0
172.17.10.79.job1.8.0
172.17.10.79.job1.9.0

--- 172.17.40.15 ---
172.17.10.14.job1.0.0
172.17.10.14.job1.1.0
172.17.10.14.job1.2.0
172.17.10.14.job1.3.0
172.17.10.14.job1.4.0
172.17.10.14.job1.5.0
172.17.10.14.job1.6.0
172.17.10.14.job1.7.0
172.17.10.14.job1.8.0
172.17.10.14.job1.9.0
172.17.10.15.job1.0.0
172.17.10.15.job1.1.0
172.17.10.15.job1.2.0
172.17.10.15.job1.3.0
172.17.10.15.job1.4.0
172.17.10.15.job1.5.0
172.17.10.15.job1.6.0
172.17.10.15.job1.7.0
172.17.10.15.job1.8.0
172.17.10.15.job1.9.0
172.17.10.29.job1.0.0
172.17.10.29.job1.1.0
172.17.10.29.job1.2.0
172.17.10.29.job1.3.0
172.17.10.29.job1.4.0
172.17.10.29.job1.5.0
172.17.10.29.job1.6.0
172.17.10.29.job1.7.0
172.17.10.29.job1.8.0
172.17.10.29.job1.9.0
172.17.10.30.job1.0.0
172.17.10.30.job1.1.0
172.17.10.30.job1.2.0
172.17.10.30.job1.3.0
172.17.10.30.job1.4.0
172.17.10.30.job1.5.0
172.17.10.30.job1.6.0
172.17.10.30.job1.7.0
172.17.10.30.job1.8.0
172.17.10.30.job1.9.0
172.17.10.79.job1.0.0
172.17.10.79.job1.1.0
172.17.10.79.job1.2.0
172.17.10.79.job1.3.0
172.17.10.79.job1.4.0
172.17.10.79.job1.5.0
172.17.10.79.job1.6.0
172.17.10.79.job1.7.0
172.17.10.79.job1.8.0
172.17.10.79.job1.9.0


--- 172.17.40.16 ---


--- 172.17.40.22 ---


--- 172.17.40.24 ---

Here are the DHT layout xattrs from my testing for the above case. (Note: the
brick on server 14 was brought down as part of performance testing for
self-heal.)


root at ose3-master: ~ # par-for-all.sh svrs.list ' getfattr -e hex -m. -d /bricks/b01/g'
log directory is /tmp/par-for-all.6396


--- 172.17.40.13 ---
# file: bricks/b01/g
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.rep3-client-1=0x000000000000000000000001
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x5ec710eab3db4a08adec7107b38af08b

getfattr: Removing leading '/' from absolute path names

--- 172.17.40.14 ---
# file: bricks/b01/g
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x5ec710eab3db4a08adec7107b38af08b

getfattr: Removing leading '/' from absolute path names

--- 172.17.40.15 ---
# file: bricks/b01/g
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.rep3-client-1=0x000000000000000000000001
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x5ec710eab3db4a08adec7107b38af08b

getfattr: Removing leading '/' from absolute path names

--- 172.17.40.16 ---
# file: bricks/b01/g
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.volume-id=0x5ec710eab3db4a08adec7107b38af08b

getfattr: Removing leading '/' from absolute path names

--- 172.17.40.22 ---
# file: bricks/b01/g
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.volume-id=0x5ec710eab3db4a08adec7107b38af08b

getfattr: Removing leading '/' from absolute path names

--- 172.17.40.24 ---
# file: bricks/b01/g
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.volume-id=0x5ec710eab3db4a08adec7107b38af08b

getfattr: Removing leading '/' from absolute path names
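
To make the root layout values above easier to read, here is a minimal Python
sketch that splits a trusted.glusterfs.dht value into its fields. It assumes
the value is four big-endian 32-bit integers in the order count, type, start,
stop; that field order is an assumption of this sketch and is not stated in
the report. The function name decode_dht_layout is hypothetical.

```python
import struct


def decode_dht_layout(hex_value):
    """Split a trusted.glusterfs.dht value into four big-endian 32-bit fields.

    Assumed field order (not confirmed by the report): count, type, start, stop.
    """
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 16:
        raise ValueError("expected 16 bytes, got %d" % len(raw))
    count, layout_type, start, stop = struct.unpack(">4I", raw)
    return {"count": count, "type": layout_type, "start": start, "stop": stop}


# Root-directory values copied from the getfattr output above.
# None means the brick root carries no trusted.glusterfs.dht xattr.
root_xattrs = {
    "172.17.40.13": "0x000000010000000000000000ffffffff",
    "172.17.40.14": "0x000000010000000000000000ffffffff",
    "172.17.40.15": "0x000000010000000000000000ffffffff",
    "172.17.40.16": None,
    "172.17.40.22": None,
    "172.17.40.24": None,
}

root = decode_dht_layout(root_xattrs["172.17.40.13"])

for host, value in root_xattrs.items():
    if value is None:
        print("%s: no layout xattr" % host)
    else:
        layout = decode_dht_layout(value)
        print("%s: start=0x%08x stop=0x%08x" % (host, layout["start"], layout["stop"]))
```

Read this way, the bricks of the 1st sub-vol claim the whole range
0x00000000 to 0xffffffff for the volume root, while the bricks of the 2nd
sub-vol have no layout xattr at all. That would be consistent with every file
in the root landing on the 1st sub-vol, though it is only a reading of the
xattrs shown, not a confirmed diagnosis.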


Now, if a directory is created on the volume and files are added to it, the
bricks look like this:

par-for-all.sh svrs.list 'ls /bricks/b01/g/dir/'
log directory is /tmp/par-for-all.6737


--- 172.17.40.13 ---
172.17.10.29.job1.0.0
172.17.10.29.job1.1.0
172.17.10.79.job1.1.0

--- 172.17.40.14 ---
pid 6748 on host 172.17.40.14 returns 2
ls: cannot access /bricks/b01/g/dir/: No such file or directory

--- 172.17.40.15 ---
172.17.10.29.job1.0.0
172.17.10.29.job1.1.0
172.17.10.79.job1.1.0

--- 172.17.40.16 ---
172.17.10.14.job1.0.0
172.17.10.14.job1.1.0
172.17.10.15.job1.0.0
172.17.10.15.job1.1.0
172.17.10.30.job1.0.0
172.17.10.30.job1.1.0
172.17.10.79.job1.0.0

--- 172.17.40.22 ---
172.17.10.14.job1.0.0
172.17.10.14.job1.1.0
172.17.10.15.job1.0.0
172.17.10.15.job1.1.0
172.17.10.30.job1.0.0
172.17.10.30.job1.1.0
172.17.10.79.job1.0.0

--- 172.17.40.24 ---
172.17.10.14.job1.0.0
172.17.10.14.job1.1.0
172.17.10.15.job1.0.0
172.17.10.15.job1.1.0
172.17.10.30.job1.0.0
172.17.10.30.job1.1.0
172.17.10.79.job1.0.0

Note: in the listing above, the brick on server 14 was down as part of testing.

Here are the DHT layout xattrs of the directory for the above case. (Note: the
brick on server 14 was brought down as part of performance testing for
self-heal.)

--- 172.17.40.13 ---
# file: bricks/b01/g/dir
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.rep3-client-1=0x000000000000000200000004
trusted.gfid=0xf340c00e5ad44114a682ac4c5e0a7615
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe

getfattr: Removing leading '/' from absolute path names

--- 172.17.40.14 ---
pid 6502 on host 172.17.40.14 returns 1
getfattr: /bricks/b01/g/dir: No such file or directory

--- 172.17.40.15 ---
# file: bricks/b01/g/dir
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.rep3-client-1=0x000000000000000200000004
trusted.gfid=0xf340c00e5ad44114a682ac4c5e0a7615
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe

getfattr: Removing leading '/' from absolute path names

--- 172.17.40.16 ---
# file: bricks/b01/g/dir
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0xf340c00e5ad44114a682ac4c5e0a7615
trusted.glusterfs.dht=0x00000001000000007fffffffffffffff

getfattr: Removing leading '/' from absolute path names

--- 172.17.40.22 ---
# file: bricks/b01/g/dir
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0xf340c00e5ad44114a682ac4c5e0a7615
trusted.glusterfs.dht=0x00000001000000007fffffffffffffff

getfattr: Removing leading '/' from absolute path names

--- 172.17.40.24 ---
# file: bricks/b01/g/dir
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0xf340c00e5ad44114a682ac4c5e0a7615
trusted.glusterfs.dht=0x00000001000000007fffffffffffffff

getfattr: Removing leading '/' from absolute path names
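
As a cross-check of the directory layout values above, the following Python
sketch decodes the two distinct trusted.glusterfs.dht values and looks for
holes in the 32-bit hash space. It assumes each value is four big-endian
32-bit integers ordered count, type, start, stop (an assumption of this
sketch, not something the report states). The labels subvol-0 and subvol-1
and the helper names are hypothetical.

```python
import struct


def layout_range(hex_value):
    """Return (start, stop) from a trusted.glusterfs.dht hex string.

    Assumed field order (not confirmed by the report): count, type, start, stop.
    """
    raw = bytes.fromhex(hex_value[2:])
    _count, _type, start, stop = struct.unpack(">4I", raw)
    return start, stop


def find_gaps(ranges, lo=0x00000000, hi=0xFFFFFFFF):
    """Return the parts of [lo, hi] not covered by any (start, stop) pair."""
    gaps = []
    cursor = lo
    for start, stop in sorted(ranges):
        if start > cursor:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, stop + 1)
    if cursor <= hi:
        gaps.append((cursor, hi))
    return gaps


# Values copied from the getfattr output for /bricks/b01/g/dir above.
dir_layout = {
    "subvol-0": "0x0000000100000000000000007ffffffe",  # servers 13 and 15
    "subvol-1": "0x00000001000000007fffffffffffffff",  # servers 16, 22 and 24
}

ranges = [layout_range(value) for value in dir_layout.values()]
gaps = find_gaps(ranges)

for name, (start, stop) in zip(dir_layout, ranges):
    print("%s: 0x%08x - 0x%08x" % (name, start, stop))
print("gaps:", gaps)
```

Under that assumption the directory's hash space is split roughly in half
between the two sub-vols with no gaps, which matches the files under dir
appearing on both sets of bricks.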



Expected results:


Additional info:


