[Bugs] [Bug 1707866] New: Thousands of duplicate files in glusterfs mountpoint directory listing

Wed May 8 15:11:44 UTC 2019

https://bugzilla.redhat.com/show_bug.cgi?id=1707866

            Bug ID: 1707866
           Summary: Thousands of duplicate files in glusterfs mountpoint
                    directory listing
           Product: GlusterFS
           Version: 4.1
          Hardware: x86_64
                OS: Linux
            Status: NEW
         Component: core
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: sergemp at mail.ru
                CC: bugs at gluster.org
  Target Milestone: ---
    Classification: Community

I am seeing something that should be impossible: the same filenames are listed
multiple times:

  # ls -la /mnt/VOLNAME/
  ...
  -rwxrwxr-x   1 root   root   3486 Jan 28  2016 check_connections.pl
  -rwxr-xr-x   1 root   root    153 Dec  7  2014 sigtest.sh
  -rwxr-xr-x   1 root   root    153 Dec  7  2014 sigtest.sh
  -rwxr-xr-x   1 root   root   3466 Jan  5  2015 zabbix.pm
  -rwxr-xr-x   1 root   root   3466 Jan  5  2015 zabbix.pm

There are about 38981 duplicate entries like that.
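
A count like the one above can be reproduced by tallying repeated names in the
directory listing. This is only a minimal sketch: the helper name
`duplicate_names` is made up for illustration, and the sample list is copied
from the `ls` output above rather than read from the mountpoint.

```python
from collections import Counter


def duplicate_names(names):
    """Return {name: count} for every name that appears more than once."""
    counts = Counter(names)
    return {name: n for name, n in counts.items() if n > 1}


# Sample taken from the listing above. On a live system the list could come
# from os.listdir("/mnt/VOLNAME") instead (assumption: the mount path).
listing = [
    "check_connections.pl",
    "sigtest.sh",
    "sigtest.sh",
    "zabbix.pm",
    "zabbix.pm",
]

print(duplicate_names(listing))
```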

The volume itself is a 3 x 2 distributed-replicate volume (three replica pairs):

  # gluster volume info VOLNAME
  Volume Name: VOLNAME
  Type: Distributed-Replicate
  Volume ID: 41f9096f-0d5f-4ea9-b369-89294cf1be99
  Status: Started
  Snapshot Count: 0
  Number of Bricks: 3 x 2 = 6
  Transport-type: tcp
  Bricks:
  Brick1: gfserver1:/srv/BRICK
  Brick2: gfserver2:/srv/BRICK
  Brick3: gfserver3:/srv/BRICK
  Brick4: gfserver4:/srv/BRICK
  Brick5: gfserver5:/srv/BRICK
  Brick6: gfserver6:/srv/BRICK
  Options Reconfigured:
  transport.address-family: inet
  nfs.disable: on
  cluster.self-heal-daemon: enable
  config.transport: tcp
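
To make the 3 x 2 layout explicit: in a distributed-replicate volume,
consecutive bricks in the brick list form one replica set, and the replicate
subvolumes are numbered from 0. The sketch below groups the bricks on that
assumption. The function name `replica_sets` is hypothetical, and the real
grouping should be confirmed against the volume's volfile.

```python
def replica_sets(volname, bricks, replica_count):
    """Group bricks, in listed order, into replica sets of replica_count."""
    if len(bricks) % replica_count != 0:
        raise ValueError("brick count is not a multiple of the replica count")
    return {
        f"{volname}-replicate-{i // replica_count}": bricks[i:i + replica_count]
        for i in range(0, len(bricks), replica_count)
    }


# Brick list as shown by "gluster volume info VOLNAME" above.
bricks = [f"gfserver{n}:/srv/BRICK" for n in range(1, 7)]

for subvol, members in replica_sets("VOLNAME", bricks, 2).items():
    print(subvol, members)
```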

The "duplicated" file on individual bricks:

  [gfserver1]# ls -la /srv/BRICK/zabbix.pm
  ---------T 2 root root 0 Apr 23  2018 /srv/BRICK/zabbix.pm

  [gfserver2]# ls -la /srv/BRICK/zabbix.pm
  ---------T 2 root root 0 Apr 23  2018 /srv/BRICK/zabbix.pm

  [gfserver3]# ls -la /srv/BRICK/zabbix.pm
  -rwxr-xr-x 2 root root 3466 Jan  5  2015 /srv/BRICK/zabbix.pm

  [gfserver4]# ls -la /srv/BRICK/zabbix.pm
  -rwxr-xr-x 2 root root 3466 Jan  5  2015 /srv/BRICK/zabbix.pm

  [gfserver5]# ls -la /srv/BRICK/zabbix.pm
  -rwxr-xr-x 2 root root 3466 Jan  5  2015 /srv/BRICK/zabbix.pm

  [gfserver6]# ls -la /srv/BRICK/zabbix.pm
  -rwxr-xr-x. 2 root root 3466 Jan  5  2015 /srv/BRICK/zabbix.pm
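
The zero-size `---------T` entries on gfserver1 and gfserver2 look like DHT
link files, while the other four bricks hold regular copies. A rough way to
separate the two from `ls` output is sketched below. The helper name
`looks_like_dht_linkfile` is made up for illustration, and the mode/size check
is only a heuristic: the presence of the `trusted.glusterfs.dht.linkto` xattr
is what actually identifies a link file.

```python
def looks_like_dht_linkfile(mode, size):
    """Heuristic: sticky-bit-only permissions and zero size."""
    # Drop a trailing "." (SELinux context) or "+" (ACL) marker from ls output.
    perms = mode.rstrip(".+")
    return perms == "---------T" and size == 0


# (mode, size) per brick, copied from the ls output above.
entries = {
    "gfserver1": ("---------T", 0),
    "gfserver2": ("---------T", 0),
    "gfserver3": ("-rwxr-xr-x", 3466),
    "gfserver4": ("-rwxr-xr-x", 3466),
    "gfserver5": ("-rwxr-xr-x", 3466),
    "gfserver6": ("-rwxr-xr-x.", 3466),
}

for host, (mode, size) in entries.items():
    kind = "link file?" if looks_like_dht_linkfile(mode, size) else "data file"
    print(host, kind)
```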

Attributes:

  [gfserver1]# getfattr -m . -d -e hex /srv/BRICK/zabbix.pm
  # file: srv/BRICK/zabbix.pm
  trusted.afr.VOLNAME-client-1=0x000000000000000000000000
  trusted.afr.VOLNAME-client-4=0x000000000000000000000000
  trusted.gfid=0x422a7ccf018242b58e162a65266326c3
  trusted.glusterfs.dht.linkto=0x6678666565642d7265706c69636174652d3100

  [gfserver2]# getfattr -m . -d -e hex /srv/BRICK/zabbix.pm
  # file: srv/BRICK/zabbix.pm
  trusted.gfid=0x422a7ccf018242b58e162a65266326c3
  trusted.gfid2path.3b27d24cad4dceef=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f7a61626269782e706d
  trusted.glusterfs.dht.linkto=0x6678666565642d7265706c69636174652d3100

  [gfserver3]# getfattr -m . -d -e hex /srv/BRICK/zabbix.pm
  # file: srv/BRICK/zabbix.pm
  trusted.afr.VOLNAME-client-2=0x000000000000000000000000
  trusted.afr.VOLNAME-client-3=0x000000000000000000000000
  trusted.gfid=0x422a7ccf018242b58e162a65266326c3

  [gfserver4]# getfattr -m . -d -e hex /srv/BRICK/zabbix.pm
  # file: srv/BRICK/zabbix.pm
  trusted.gfid=0x422a7ccf018242b58e162a65266326c3
  trusted.gfid2path.3b27d24cad4dceef=0x30303030303030302d303030302d303030302d303030302d3030303030303030303030312f7a61626269782e706d

  [gfserver5]# getfattr -m . -d -e hex /srv/BRICK/zabbix.pm
  # file: srv/BRICK/zabbix.pm
  trusted.bit-rot.version=0x03000000000000005c4f813c000bc71b
  trusted.gfid=0x422a7ccf018242b58e162a65266326c3

  [gfserver6]# getfattr -m . -d -e hex /srv/BRICK/zabbix.pm
  # file: srv/BRICK/zabbix.pm
  security.selinux=0x73797374656d5f753a6f626a6563745f723a7661725f743a733000
  trusted.bit-rot.version=0x02000000000000005add0ffc000eb66a
  trusted.gfid=0x422a7ccf018242b58e162a65266326c3
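
The `linkto` and `gfid2path` values above are hex-encoded ASCII strings, so
they can be decoded to see where the link files point. A minimal sketch,
assuming the values are plain ASCII with an optional trailing NUL (the helper
name `decode_xattr` is made up for illustration):

```python
def decode_xattr(hex_value):
    """Decode a getfattr "-e hex" value into text, dropping trailing NULs."""
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    return bytes.fromhex(digits).rstrip(b"\x00").decode("ascii")


# Values copied from the getfattr output above.
LINKTO = "0x6678666565642d7265706c69636174652d3100"
GFID2PATH = (
    "0x30303030303030302d303030302d303030302d303030302d"
    "3030303030303030303030312f7a61626269782e706d"
)

print(decode_xattr(LINKTO))     # fxfeed-replicate-1
print(decode_xattr(GFID2PATH))  # <parent gfid>/<file name>
```

The link files on gfserver1 and gfserver2 therefore point at the `replicate-1`
subvolume. If replica pairs follow the brick order, that would be the
gfserver3/gfserver4 pair, which would leave the copies on gfserver5 and
gfserver6 unaccounted for. This is an inference from the listing, not something
verified on the cluster.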

I am not sure exactly why it happened. Maybe it is because some nodes were
abruptly upgraded from CentOS 6 with gluster ~3.7 to CentOS 7 with 4.1, and some
files ended up on nodes they are not supposed to be on.

Currently all the nodes are online:

  # gluster pool list
  UUID                                  Hostname        State
  aac9e1a5-018f-4d27-9d77-804f0f1b2f13  gfserver5       Connected
  98b22070-b579-4a91-86e3-482cfcc9c8cf  gfserver3       Connected
  7a9841a1-c63c-49f2-8d6d-a90ae2ff4e04  gfserver4       Connected
  955f5551-8b42-476c-9eaa-feab35b71041  gfserver6       Connected
  7343d655-3527-4bcf-9d13-55386ccb5f9c  gfserver1       Connected
  f9c79a56-830d-4056-b437-a669a1942626  gfserver2       Connected
  45a72ab3-b91e-4076-9cf2-687669647217  localhost       Connected

The nodes have either glusterfs-3.12.14-1.el6.x86_64 (CentOS 6) or
glusterfs-4.1.7-1.el7.x86_64 (CentOS 7) installed.

Expected result
---------------

This looks like a layout issue, so:

  gluster volume rebalance VOLNAME fix-layout start

should fix it, right?

Actual result
-------------

I tried:

  gluster volume rebalance VOLNAME fix-layout start
  gluster volume rebalance VOLNAME start
  gluster volume rebalance VOLNAME start force
  gluster volume heal VOLNAME full

Those took 5 to 40 minutes to complete, but the duplicates are still there.
