[Bugs] [Bug 1652548] New: Error reading some files - Stale file handle - distribute 2 - replica 3 volume with sharding

bugzilla at redhat.com
Thu Nov 22 10:52:38 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1652548

            Bug ID: 1652548
           Summary: Error reading some files - Stale file handle -
                    distribute 2 - replica 3 volume with sharding
           Product: GlusterFS
           Version: 3.12
         Component: sharding
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: marcoc at prismatelecomtesting.com
        QA Contact: bugs at gluster.org
                CC: bugs at gluster.org



Description of problem:
Error reading some files.
I'm trying to export a VM from the Gluster volume because oVirt keeps pausing the
VM with a storage error, but the export is not possible due to "Stale file handle"
errors.

I mounted the volume on another server:
s23gfs.ovirt:VOL_VMDATA on /mnt/VOL_VMDATA type fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,allow_other,max_read=131072)

Trying to read the file with cp, rsync, or qemu-img convert gives the same
result:

# qemu-img convert -p -t none -T none -f qcow2
/mnt/VOL_VMDATA/d4f82517-5ce0-4705-a89f-5d3c81adf764/images/dbb038ee-2794-40e8-877a-a4806c47f11f/f81e0be9-db3e-48ac-876f-57b6f7cb3fe8
-O raw PLONE_active-raw.img
qemu-img: error while reading sector 2448441344: Stale file handle
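
If it helps with triage, here is a rough back-of-envelope mapping of that sector
to a shard block, assuming the reported sector is a 512-byte offset into the
source file (which may not hold exactly for qcow2) and using the volume's
features.shard-block-size of 512MB:

# echo $(( 2448441344 * 512 / (512 * 1024 * 1024) ))
2335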


Version-Release number of selected component (if applicable):
Gluster 3.12.15-1.el7

In the mount log file I see many errors like:
[2018-11-20 03:20:24.471344] E [MSGID: 133010]
[shard.c:1724:shard_common_lookup_shards_cbk] 0-VOL_VMDATA-shard: Lookup on
shard 3558 failed. Base file gfid = 4feb4a7e-e1a3-4fa3-8d38-3b929bf52d14 [Stale
file handle]
[2018-11-20 08:56:21.110258] E [MSGID: 133010]
[shard.c:1724:shard_common_lookup_shards_cbk] 0-VOL_VMDATA-shard: Lookup on
shard 541 failed. Base file gfid = 2c1b6402-87b0-45cd-bd81-2cd3f38dd530 [Stale
file handle]
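
As far as I understand, each shard is stored as a separate file named
<base file gfid>.<shard number> under the hidden .shard directory on each brick,
so the shards from the errors above should be findable there, e.g. (brick path
from my setup):

# find /gluster/VOL_VMDATA/brick/.shard -name '4feb4a7e-e1a3-4fa3-8d38-3b929bf52d14.3558'
# find /gluster/VOL_VMDATA/brick/.shard -name '2c1b6402-87b0-45cd-bd81-2cd3f38dd530.541'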


Is there a way to fix this? It's a distribute 2 - replica 3 volume with
sharding.
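
If useful, I can also compare the trusted.gfid xattr of the same shard file
across the three bricks of its replica set, in case a gfid mismatch between
replicas is what produces the stale handle, e.g.:

# getfattr -d -m . -e hex /gluster/VOL_VMDATA/brick/.shard/4feb4a7e-e1a3-4fa3-8d38-3b929bf52d14.3558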

Thanks,
Marco


Additional info:
# gluster volume info VOL_VMDATA

Volume Name: VOL_VMDATA
Type: Distributed-Replicate
Volume ID: 7bd4e050-47dd-481e-8862-cd6b76badddc
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: s20gfs.ovirt.prisma:/gluster/VOL_VMDATA/brick
Brick2: s21gfs.ovirt.prisma:/gluster/VOL_VMDATA/brick
Brick3: s22gfs.ovirt.prisma:/gluster/VOL_VMDATA/brick
Brick4: s23gfs.ovirt.prisma:/gluster/VOL_VMDATA/brick
Brick5: s24gfs.ovirt.prisma:/gluster/VOL_VMDATA/brick
Brick6: s25gfs.ovirt.prisma:/gluster/VOL_VMDATA/brick
Options Reconfigured:
auth.allow: 192.168.50.*,172.16.4.*,192.168.56.203
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: enable
features.shard-block-size: 512MB
cluster.data-self-heal-algorithm: full
nfs.disable: on
transport.address-family: inet


# gluster volume heal VOL_VMDATA info
Brick s20gfs.ovirt.prisma:/gluster/VOL_VMDATA/brick
Status: Connected
Number of entries: 0

Brick s21gfs.ovirt.prisma:/gluster/VOL_VMDATA/brick
Status: Connected
Number of entries: 0

Brick s22gfs.ovirt.prisma:/gluster/VOL_VMDATA/brick
Status: Connected
Number of entries: 0

Brick s23gfs.ovirt.prisma:/gluster/VOL_VMDATA/brick
Status: Connected
Number of entries: 0

Brick s24gfs.ovirt.prisma:/gluster/VOL_VMDATA/brick
Status: Connected
Number of entries: 0

Brick s25gfs.ovirt.prisma:/gluster/VOL_VMDATA/brick
Status: Connected
Number of entries: 0


# gluster volume status VOL_VMDATA
Status of volume: VOL_VMDATA
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick s20gfs.ovirt.prisma:/gluster/VOL_VMDA
TA/brick                                    49153     0          Y       3186 
Brick s21gfs.ovirt.prisma:/gluster/VOL_VMDA
TA/brick                                    49153     0          Y       5148 
Brick s22gfs.ovirt.prisma:/gluster/VOL_VMDA
TA/brick                                    49153     0          Y       3792 
Brick s23gfs.ovirt.prisma:/gluster/VOL_VMDA
TA/brick                                    49153     0          Y       3257 
Brick s24gfs.ovirt.prisma:/gluster/VOL_VMDA
TA/brick                                    49153     0          Y       4402 
Brick s25gfs.ovirt.prisma:/gluster/VOL_VMDA
TA/brick                                    49153     0          Y       3231 
Self-heal Daemon on localhost               N/A       N/A        Y       4192 
Self-heal Daemon on s25gfs.ovirt.prisma     N/A       N/A        Y       63185
Self-heal Daemon on s24gfs.ovirt.prisma     N/A       N/A        Y       39535
Self-heal Daemon on s20gfs.ovirt.prisma     N/A       N/A        Y       2785 
Self-heal Daemon on s23gfs.ovirt.prisma     N/A       N/A        Y       765  
Self-heal Daemon on s22.ovirt.prisma        N/A       N/A        Y       5828 

Task Status of Volume VOL_VMDATA
------------------------------------------------------------------------------
There are no active volume tasks
