[Bugs] [Bug 1707866] New: Thousands of duplicate files in glusterfs mountpoint directory listing
bugzilla at redhat.com
Wed May 8 15:11:44 UTC 2019
https://bugzilla.redhat.com/show_bug.cgi?id=1707866
Bug ID: 1707866
Summary: Thousands of duplicate files in glusterfs mountpoint
directory listing
Product: GlusterFS
Version: 4.1
Hardware: x86_64
OS: Linux
Status: NEW
Component: core
Severity: high
Assignee: bugs at gluster.org
Reporter: sergemp at mail.ru
CC: bugs at gluster.org
Target Milestone: ---
Classification: Community
I am seeing something that should be impossible: the same filenames are
listed multiple times:
# ls -la /mnt/VOLNAME/
...
-rwxrwxr-x 1 root root 3486 Jan 28 2016 check_connections.pl
-rwxr-xr-x 1 root root 153 Dec 7 2014 sigtest.sh
-rwxr-xr-x 1 root root 153 Dec 7 2014 sigtest.sh
-rwxr-xr-x 1 root root 3466 Jan 5 2015 zabbix.pm
-rwxr-xr-x 1 root root 3466 Jan 5 2015 zabbix.pm
There are about 38981 duplicate files like that.
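For reference, a quick way to count them (a sketch; it assumes every
duplicate shows up as a separate directory entry in a plain listing):
# ls /mnt/VOLNAME/ | sort | uniq -d | wc -l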
The volume itself is a 3 x 2 distributed-replicate:
# gluster volume info VOLNAME
Volume Name: VOLNAME
Type: Distributed-Replicate
Volume ID: 41f9096f-0d5f-4ea9-b369-89294cf1be99
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: gfserver1:/srv/BRICK
Brick2: gfserver2:/srv/BRICK
Brick3: gfserver3:/srv/BRICK
Brick4: gfserver4:/srv/BRICK
Brick5: gfserver5:/srv/BRICK
Brick6: gfserver6:/srv/BRICK
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
cluster.self-heal-daemon: enable
config.transport: tcp
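For reference when reading the per-brick output below: in a 3 x 2
volume the bricks pair up into replica sets in the order listed (the
default), i.e.
replicate-0 = Brick1 + Brick2 (gfserver1/gfserver2)
replicate-1 = Brick3 + Brick4 (gfserver3/gfserver4)
replicate-2 = Brick5 + Brick6 (gfserver5/gfserver6)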
The "duplicated" file on individual bricks:
[gfserver1]# ls -la /srv/BRICK/zabbix.pm
---------T 2 root root 0 Apr 23 2018 /srv/BRICK/zabbix.pm
[gfserver2]# ls -la /srv/BRICK/zabbix.pm
---------T 2 root root 0 Apr 23 2018 /srv/BRICK/zabbix.pm
[gfserver3]# ls -la /srv/BRICK/zabbix.pm
-rwxr-xr-x 2 root root 3466 Jan 5 2015 /srv/BRICK/zabbix.pm
[gfserver4]# ls -la /srv/BRICK/zabbix.pm
-rwxr-xr-x 2 root root 3466 Jan 5 2015 /srv/BRICK/zabbix.pm
[gfserver5]# ls -la /srv/BRICK/zabbix.pm
-rwxr-xr-x 2 root root 3466 Jan 5 2015 /srv/BRICK/zabbix.pm
[gfserver6]# ls -la /srv/BRICK/zabbix.pm
-rwxr-xr-x. 2 root root 3466 Jan 5 2015 /srv/BRICK/zabbix.pm
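The zero-size mode ---------T entries on gfserver1/gfserver2 are DHT
linkto files, i.e. pointers saying the real data lives on another
subvolume. A sketch to list such entries on a brick (it assumes linkto
files are the only zero-size files there with exactly mode 1000):
[gfserverN]# find /srv/BRICK -path '*/.glusterfs' -prune -o \
             -type f -perm 1000 -size 0 -print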
Extended attributes:
[gfserver1]# getfattr -m . -d -e hex /srv/BRICK/zabbix.pm
# file: srv/BRICK/zabbix.pm
trusted.afr.VOLNAME-client-1=0x000000000000000000000000
trusted.afr.VOLNAME-client-4=0x000000000000000000000000
trusted.gfid=0x422a7ccf018242b58e162a65266326c3
trusted.glusterfs.dht.linkto=0x6678666565642d7265706c69636174652d3100
[gfserver2]# getfattr -m . -d -e hex /srv/BRICK/zabbix.pm
# file: srv/BRICK/zabbix.pm
trusted.gfid=0x422a7ccf018242b58e162a65266326c3
trusted.gfid2path.3b27d24cad4dceef=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f7a61626269782e706d
trusted.glusterfs.dht.linkto=0x6678666565642d7265706c69636174652d3100
[gfserver3]# getfattr -m . -d -e hex /srv/BRICK/zabbix.pm
# file: srv/BRICK/zabbix.pm
trusted.afr.VOLNAME-client-2=0x000000000000000000000000
trusted.afr.VOLNAME-client-3=0x000000000000000000000000
trusted.gfid=0x422a7ccf018242b58e162a65266326c3
[gfserver4]# getfattr -m . -d -e hex /srv/BRICK/zabbix.pm
# file: srv/BRICK/zabbix.pm
trusted.gfid=0x422a7ccf018242b58e162a65266326c3
trusted.gfid2path.3b27d24cad4dceef=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f7a61626269782e706d
[gfserver5]# getfattr -m . -d -e hex /srv/BRICK/zabbix.pm
# file: srv/BRICK/zabbix.pm
trusted.bit-rot.version=0x03000000000000005c4f813c000bc71b
trusted.gfid=0x422a7ccf018242b58e162a65266326c3
[gfserver6]# getfattr -m . -d -e hex /srv/BRICK/zabbix.pm
# file: srv/BRICK/zabbix.pm
security.selinux=0x73797374656d5f753a6f626a6563745f723a7661725f743a733000
trusted.bit-rot.version=0x02000000000000005add0ffc000eb66a
trusted.gfid=0x422a7ccf018242b58e162a65266326c3
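The hex values are plain ASCII. The trusted.glusterfs.dht.linkto on
gfserver1/gfserver2 ends in "-replicate-1", i.e. it points at the
Brick3/Brick4 pair, which does hold real copies; the duplicates in the
listing are consistent with real copies also existing on replicate-2,
since readdir merges entries from all subvolumes. To decode such a
value (a sketch using xxd):
# echo 6678666565642d7265706c69636174652d3100 | xxd -r -p
Similarly, trusted.gfid2path decodes to "<parent-gfid>/zabbix.pm".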
Not sure exactly why this happened... Maybe because some nodes were
abruptly upgraded from CentOS 6's gluster ~3.7 to CentOS 7's 4.1, and
some files ended up on nodes they are not supposed to be on.
Currently all the nodes are online:
# gluster pool list
UUID Hostname State
aac9e1a5-018f-4d27-9d77-804f0f1b2f13 gfserver5 Connected
98b22070-b579-4a91-86e3-482cfcc9c8cf gfserver3 Connected
7a9841a1-c63c-49f2-8d6d-a90ae2ff4e04 gfserver4 Connected
955f5551-8b42-476c-9eaa-feab35b71041 gfserver6 Connected
7343d655-3527-4bcf-9d13-55386ccb5f9c gfserver1 Connected
f9c79a56-830d-4056-b437-a669a1942626 gfserver2 Connected
45a72ab3-b91e-4076-9cf2-687669647217 localhost Connected
and have glusterfs-3.12.14-1.el6.x86_64 (CentOS 6) and
glusterfs-4.1.7-1.el7.x86_64 (CentOS 7) installed.
Expected result
---------------
This looks like a layout issue, so:
gluster volume rebalance VOLNAME fix-layout start
should fix it, right?
Actual result
-------------
I tried:
gluster volume rebalance VOLNAME fix-layout start
gluster volume rebalance VOLNAME start
gluster volume rebalance VOLNAME start force
gluster volume heal VOLNAME full
Those took 5 to 40 minutes to complete, but the duplicates are still there.
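A manual workaround that is sometimes suggested for stale linkto
entries (an assumption on my part, not verified on this volume: it
only applies if the ---------T entries really are stale, and since the
link count is 2 the .glusterfs hardlink has to be removed too):
[gfserver1]# rm /srv/BRICK/zabbix.pm
[gfserver1]# rm /srv/BRICK/.glusterfs/42/2a/422a7ccf-0182-42b5-8e16-2a65266326c3
(repeat on gfserver2), then stat the file through the mountpoint so
DHT can recreate a correct linkto:
# stat /mnt/VOLNAME/zabbix.pm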