[Bugs] [Bug 1775512] New: Get error "Could not read qcow2 header" when reading a qcow2 file with glusterfs

bugzilla at redhat.com bugzilla at redhat.com
Fri Nov 22 06:58:12 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1775512

            Bug ID: 1775512
            Summary: Get error "Could not read qcow2 header" when reading a
                     qcow2 file with glusterfs
           Product: GlusterFS
           Version: mainline
          Hardware: x86_64
                OS: Linux
            Status: NEW
         Component: libgfapi
          Keywords: ZStream
          Severity: high
          Priority: high
          Assignee: bugs at gluster.org
          Reporter: skoduri at redhat.com
        QA Contact: bugs at gluster.org
                CC: bugs at gluster.org, dyuan at redhat.com, hhan at redhat.com,
                    h.moeller at hakimo.net, jgao at redhat.com,
                    jthottan at redhat.com, kdhananj at redhat.com,
                    lmen at redhat.com, madam at redhat.com,
                    moagrawa at redhat.com, pasik at iki.fi,
                    pkarampu at redhat.com, rgowdapp at redhat.com,
                    rhs-bugs at redhat.com, skoduri at redhat.com,
                    storage-qa-internal at redhat.com, vbellur at redhat.com,
                    vdas at redhat.com, xuzhang at redhat.com, yafu at redhat.com,
                    yalzhang at redhat.com, yisun at redhat.com
        Depends On: 1663431
  Target Milestone: ---
    Classification: Community



+++ This bug was initially created as a clone of Bug #1663431 +++

Description of problem:
Get errors"Could not read qcow2 header" when  read  qcow2 file in glusterfs

Version-Release number of selected component (if applicable):
gluster server: glusterfs-3.12.2-19.el7rhgs.x86_64
client: glusterfs-3.12.2-15.4.el8.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Mount the gluster volume on local /mnt
# mount.glusterfs 10.66.4.119:/gv0  /mnt/

2. Create a new qcow2 file
# qemu-img create -f qcow2 /mnt/qcow2mnt.img 10M

3. Check it with qemu-img over gluster://
[root@localhost ~]# qemu-img info gluster://10.66.4.119/gv0/qcow2mnt.img
qemu-img: Could not open 'gluster://10.66.4.119/gv0/qcow2mnt.img': Could not
read L1 table: Input/output error
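
For context, the gluster:// URL makes qemu-img go through libgfapi rather than the FUSE mount from step 1. Below is a minimal, hypothetical libgfapi sketch (host, volume, and image path are placeholders taken from the steps above) that roughly mirrors that access path: open the image, read the qcow2 header area, then issue the SEEK_DATA lseek that the traces later in this bug show failing. It is only a debugging aid, not the code qemu actually runs.

/* gfapi-repro.c - rough sketch only; assumes glusterfs-api-devel is installed.
 * Build: gcc gfapi-repro.c -o gfapi-repro $(pkg-config --cflags --libs glusterfs-api)
 */
#define _GNU_SOURCE            /* for SEEK_DATA */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <glusterfs/api/glfs.h>

int main(void)
{
    glfs_t *fs = glfs_new("gv0");                      /* volume from step 1 */
    if (!fs)
        return 1;
    glfs_set_volfile_server(fs, "tcp", "10.66.4.119", 24007);
    glfs_set_logging(fs, "/tmp/gfapi-repro.log", 9);   /* verbose client log */
    if (glfs_init(fs) != 0) {
        fprintf(stderr, "glfs_init: %s\n", strerror(errno));
        return 1;
    }

    glfs_fd_t *fd = glfs_open(fs, "/qcow2mnt.img", O_RDONLY);
    if (!fd) {
        fprintf(stderr, "glfs_open: %s\n", strerror(errno));
        glfs_fini(fs);
        return 1;
    }

    char hdr[512];
    ssize_t n = glfs_read(fd, hdr, sizeof(hdr), 0);    /* qcow2 header lives at offset 0 */
    printf("read %zd header bytes, qcow2 magic ok: %d\n", n,
           n >= 4 && memcmp(hdr, "QFI\xfb", 4) == 0);

    /* This maps to the SEEK fop that is seen failing later in the client logs. */
    off_t off = glfs_lseek(fd, 0, SEEK_DATA);
    if (off == (off_t)-1)
        fprintf(stderr, "glfs_lseek(SEEK_DATA): %s\n", strerror(errno));
    else
        printf("SEEK_DATA -> %lld\n", (long long)off);

    glfs_close(fd);
    glfs_fini(fs);
    return 0;
}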

Actual results:
As above

Expected results:
qemu-img reports the correct info for the qcow2 file.

Additional info:
1."raw" image is ok in this scenario.
2.qemu-img info /mnt/qcow2mnt.img works well

--- Additional comment from Red Hat Bugzilla Rules Engine on 2019-01-04
10:08:17 UTC ---

This bug is automatically being proposed for a Z-stream release of Red Hat
Gluster Storage 3 under active development and open for bug fixes, by setting
the release flag 'rhgs-3.4.z' to '?'.

If this bug should be proposed for a different release, please manually change
the proposed release flag.

--- Additional comment from  on 2019-02-01 10:09:43 UTC ---

hit the same issue:
(.libvirt-ci-venv-ci-runtest-sBLGCJ) [root@hp-dl380g9-02 virtual_disks]# rpm
-qa | grep gluster
libvirt-daemon-driver-storage-gluster-4.5.0-19.module+el8+2712+4c318da1.x86_64
glusterfs-client-xlators-3.12.2-32.1.el8.x86_64
qemu-kvm-block-gluster-2.12.0-59.module+el8+2714+6d9351dd.x86_64
glusterfs-cli-3.12.2-32.1.el8.x86_64
glusterfs-libs-3.12.2-32.1.el8.x86_64
glusterfs-api-3.12.2-32.1.el8.x86_64
glusterfs-3.12.2-32.1.el8.x86_64
glusterfs-fuse-3.12.2-32.1.el8.x86_64

(.libvirt-ci-venv-ci-runtest-sBLGCJ) [root@hp-dl380g9-02 virtual_disks]#
qemu-img info gluster://10.66.7.98/gluster-vol1/aaa.qcow2
qemu-img: Could not open 'gluster://10.66.7.98/gluster-vol1/aaa.qcow2': Could
not read L1 table: Input/output error

This also blocks a VM from using the gluster disk, so I'm escalating the priority.
(.libvirt-ci-venv-ci-runtest-sBLGCJ) [root@hp-dl380g9-02 virtual_disks]# cat gdisk
<disk device="disk" type="network">
  <driver cache="none" name="qemu" type="qcow2"/>
  <target bus="virtio" dev="vdb"/>
  <source name="gluster-vol1/aaa.qcow2" protocol="gluster">
    <host name="10.66.7.98" port="24007"/>
  </source>
</disk>

(.libvirt-ci-venv-ci-runtest-sBLGCJ) [root@hp-dl380g9-02 virtual_disks]# virsh
attach-device avocado-vt-vm1 gdisk
error: Failed to attach device from gdisk
error: internal error: unable to execute QEMU command 'device_add': Property
'virtio-blk-device.drive' can't find value 'drive-virtio-disk1'

--- Additional comment from Amar Tumballi on 2019-03-13 13:38:36 UTC ---

Moving the bug to Krutika as she is more experienced with Virt workloads.

Meanwhile, looking at the glusterfs version, these are RHGS 3.4 builds.

--- Additional comment from Krutika Dhananjay on 2019-03-14 06:49:47 UTC ---

Could you share the following two pieces of information -

1. output of `gluster volume info $VOLNAME`
2. Are the glusterfs client and server running the same version of
gluster/RHGS?

-Krutika

--- Additional comment from Krutika Dhananjay on 2019-03-14 06:53:36 UTC ---

(In reply to Krutika Dhananjay from comment #4)
> Could you share the following two pieces of information -
> 
> 1. output of `gluster volume info $VOLNAME`
> 2. Are the glusterfs client and server running the same version of
> gluster/RHGS?

Let me clarify why I'm asking about the versions - the bug's "Description"
section says this -
gluster server: glusterfs-3.12.2-19.el7rhgs.x86_64
client: glusterfs-3.12.2-15.4.el8.x86_64

but comment 2 lists the client package as
glusterfs-client-xlators-3.12.2-32.1.el8.x86_64

I want to be sure about the exact versions being used so I can recreate it.
(I looked at the logs; not much of a clue there.)

-Krutika


--- Additional comment from gaojianan on 2019-03-15 01:51:43 UTC ---

(In reply to Krutika Dhananjay from comment #4)
> Could you share the following two pieces of information -
> 
> 1. output of `gluster volume info $VOLNAME`
> 2. Are the glusterfs client and server running the same version of
> gluster/RHGS?
> 
> -Krutika

1.`gluster volume info $VOLNAME`
[root@node1 ~]# gluster volume info gv1

Volume Name: gv1
Type: Distribute
Volume ID: de5d9272-e237-4a4e-8a30-a7c737f393db
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.66.4.119:/br2
Options Reconfigured:
nfs.disable: on
transport.address-family: inet


2.Server version:
[root@node1 ~]# rpm -qa |grep gluster
libvirt-daemon-driver-storage-gluster-4.5.0-10.el7_6.3.x86_64
glusterfs-api-devel-3.12.2-19.el7rhgs.x86_64
pcp-pmda-gluster-4.1.0-4.el7.x86_64
glusterfs-3.12.2-19.el7rhgs.x86_64
python2-gluster-3.12.2-19.el7rhgs.x86_64
glusterfs-server-3.12.2-19.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-19.el7rhgs.x86_64
glusterfs-api-3.12.2-19.el7rhgs.x86_64
glusterfs-devel-3.12.2-19.el7rhgs.x86_64
glusterfs-debuginfo-3.12.2-18.el7.x86_64
glusterfs-libs-3.12.2-19.el7rhgs.x86_64
glusterfs-cli-3.12.2-19.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-19.el7rhgs.x86_64
glusterfs-fuse-3.12.2-19.el7rhgs.x86_64
glusterfs-rdma-3.12.2-19.el7rhgs.x86_64
glusterfs-events-3.12.2-19.el7rhgs.x86_64
samba-vfs-glusterfs-4.8.3-4.el7.x86_64

Client version:
[root@nssguest ~]# rpm -qa |grep gluster
qemu-kvm-block-gluster-3.1.0-18.module+el8+2834+fa8bb6e2.x86_64
glusterfs-3.12.2-32.1.el8.x86_64
glusterfs-client-xlators-3.12.2-32.1.el8.x86_64
libvirt-daemon-driver-storage-gluster-5.0.0-6.virtcov.el8.x86_64
glusterfs-libs-3.12.2-32.1.el8.x86_64
glusterfs-cli-3.12.2-32.1.el8.x86_64
glusterfs-api-3.12.2-32.1.el8.x86_64

--- Additional comment from Krutika Dhananjay on 2019-03-18 05:59:49 UTC ---

(In reply to gaojianan from comment #6)
> (In reply to Krutika Dhananjay from comment #4)
> > Could you share the following two pieces of information -
> > 
> > 1. output of `gluster volume info $VOLNAME`
> > 2. Are the glusterfs client and server running the same version of
> > gluster/RHGS?
> > 
> > -Krutika
> 
> 1.`gluster volume info $VOLNAME`
> [root at node1 ~]# gluster volume info gv1
>  
> Volume Name: gv1
> Type: Distribute
> Volume ID: de5d9272-e237-4a4e-8a30-a7c737f393db
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1
> Transport-type: tcp
> Bricks:
> Brick1: 10.66.4.119:/br2
> Options Reconfigured:
> nfs.disable: on
> transport.address-family: inet
> 
> 
> 2.Server version:
> [root at node1 ~]# rpm -qa |grep gluster
> libvirt-daemon-driver-storage-gluster-4.5.0-10.el7_6.3.x86_64
> glusterfs-api-devel-3.12.2-19.el7rhgs.x86_64
> pcp-pmda-gluster-4.1.0-4.el7.x86_64
> glusterfs-3.12.2-19.el7rhgs.x86_64
> python2-gluster-3.12.2-19.el7rhgs.x86_64
> glusterfs-server-3.12.2-19.el7rhgs.x86_64
> glusterfs-geo-replication-3.12.2-19.el7rhgs.x86_64
> glusterfs-api-3.12.2-19.el7rhgs.x86_64
> glusterfs-devel-3.12.2-19.el7rhgs.x86_64
> glusterfs-debuginfo-3.12.2-18.el7.x86_64
> glusterfs-libs-3.12.2-19.el7rhgs.x86_64
> glusterfs-cli-3.12.2-19.el7rhgs.x86_64
> glusterfs-client-xlators-3.12.2-19.el7rhgs.x86_64
> glusterfs-fuse-3.12.2-19.el7rhgs.x86_64
> glusterfs-rdma-3.12.2-19.el7rhgs.x86_64
> glusterfs-events-3.12.2-19.el7rhgs.x86_64
> samba-vfs-glusterfs-4.8.3-4.el7.x86_64
> 
> Client version:
> [root at nssguest ~]# rpm -qa |grep gluster
> qemu-kvm-block-gluster-3.1.0-18.module+el8+2834+fa8bb6e2.x86_64
> glusterfs-3.12.2-32.1.el8.x86_64
> glusterfs-client-xlators-3.12.2-32.1.el8.x86_64
> libvirt-daemon-driver-storage-gluster-5.0.0-6.virtcov.el8.x86_64
> glusterfs-libs-3.12.2-32.1.el8.x86_64
> glusterfs-cli-3.12.2-32.1.el8.x86_64
> glusterfs-api-3.12.2-32.1.el8.x86_64

Thanks.

I tried the same set of steps with the same versions of gluster client and
server, and the test works for me every time.
Perhaps the ONLY difference between your configuration and mine is that my
gluster client is also on RHEL 7, unlike yours where you're running RHEL 8 on
the client machine. The qemu-img versions could also be different.

Are you hitting this issue even with fuse mount, i.e., when you run `qemu-img
info` this way - `qemu-img info $FUSE_MOUNT_PATH/aaa.qcow2`?

If yes, could you run both `qemu-img create` and `qemu-img info` commands with
strace for a fresh file:

# strace -ff -T -v -o /tmp/qemu-img-create.out qemu-img create -f qcow2 $IMAGE_PATH 10M
# strace -ff -T -v -o /tmp/qemu-img-info.out qemu-img info $IMAGE_PATH_OVER_FUSE_MOUNT


and share all of the resulting output files matching qemu-img-create.out*
and qemu-img-info.out*?

-Krutika

--- Additional comment from gaojianan on 2019-03-18 07:07:51 UTC ---



--- Additional comment from gaojianan on 2019-03-18 07:10:05 UTC ---

(In reply to Krutika Dhananjay from comment #7)
> (In reply to gaojianan from comment #6)
> > (In reply to Krutika Dhananjay from comment #4)
> > > Could you share the following two pieces of information -
> > > 
> > > 1. output of `gluster volume info $VOLNAME`
> > > 2. Are the glusterfs client and server running the same version of
> > > gluster/RHGS?
> > > 
> > > -Krutika
> > 
> > 1.`gluster volume info $VOLNAME`
> > [root at node1 ~]# gluster volume info gv1
> >  
> > Volume Name: gv1
> > Type: Distribute
> > Volume ID: de5d9272-e237-4a4e-8a30-a7c737f393db
> > Status: Started
> > Snapshot Count: 0
> > Number of Bricks: 1
> > Transport-type: tcp
> > Bricks:
> > Brick1: 10.66.4.119:/br2
> > Options Reconfigured:
> > nfs.disable: on
> > transport.address-family: inet
> > 
> > 
> > 2.Server version:
> > [root at node1 ~]# rpm -qa |grep gluster
> > libvirt-daemon-driver-storage-gluster-4.5.0-10.el7_6.3.x86_64
> > glusterfs-api-devel-3.12.2-19.el7rhgs.x86_64
> > pcp-pmda-gluster-4.1.0-4.el7.x86_64
> > glusterfs-3.12.2-19.el7rhgs.x86_64
> > python2-gluster-3.12.2-19.el7rhgs.x86_64
> > glusterfs-server-3.12.2-19.el7rhgs.x86_64
> > glusterfs-geo-replication-3.12.2-19.el7rhgs.x86_64
> > glusterfs-api-3.12.2-19.el7rhgs.x86_64
> > glusterfs-devel-3.12.2-19.el7rhgs.x86_64
> > glusterfs-debuginfo-3.12.2-18.el7.x86_64
> > glusterfs-libs-3.12.2-19.el7rhgs.x86_64
> > glusterfs-cli-3.12.2-19.el7rhgs.x86_64
> > glusterfs-client-xlators-3.12.2-19.el7rhgs.x86_64
> > glusterfs-fuse-3.12.2-19.el7rhgs.x86_64
> > glusterfs-rdma-3.12.2-19.el7rhgs.x86_64
> > glusterfs-events-3.12.2-19.el7rhgs.x86_64
> > samba-vfs-glusterfs-4.8.3-4.el7.x86_64
> > 
> > Client version:
> > [root at nssguest ~]# rpm -qa |grep gluster
> > qemu-kvm-block-gluster-3.1.0-18.module+el8+2834+fa8bb6e2.x86_64
> > glusterfs-3.12.2-32.1.el8.x86_64
> > glusterfs-client-xlators-3.12.2-32.1.el8.x86_64
> > libvirt-daemon-driver-storage-gluster-5.0.0-6.virtcov.el8.x86_64
> > glusterfs-libs-3.12.2-32.1.el8.x86_64
> > glusterfs-cli-3.12.2-32.1.el8.x86_64
> > glusterfs-api-3.12.2-32.1.el8.x86_64
> 
> Thanks.
> 
> I tried the same set of steps with the same versions of gluster client and
> server and the test works for me everytime.
> Perhaps the ONLY difference between your configuration and mine is that my
> gluster-client is also on rhel7 unlike yours where you're running rhel8 on
> the client machine. Also the qemu-img versions could be different.
> 
> Are you hitting this issue even with fuse mount, i.e., when you run
> `qemu-img info` this way - `qemu-img info $FUSE_MOUNT_PATH/aaa.qcow2`?
> 
> If yes, could you run both `qemu-img create` and `qemu-img info` commands
> with strace for a fresh file:
> 
> # strace -ff -T -v -o /tmp/qemu-img-create.out qemu-img create -f qcow2
> $IMAGE_PATH 10M
> # strace -ff -T -v -o /tmp/qemu-img-info.out info $IMAGE_PATH_OVER_FUSE_MOUNT
> 
> 
> and share all of the resultant output files having format
> qemu-img-create.out* and qemu-img-info.out*?
> 
> -Krutika

I think this bug only happens when we create a file on the FUSE-mounted path
and check it with `qemu-img info gluster://$ip/filename`; `qemu-img info
$FUSE_MOUNT_PATH/filename` works well.

--- Additional comment from Krutika Dhananjay on 2019-03-20 05:20:47 UTC ---

OK, I took a look at the traces. Unfortunately, in the libgfapi-access case we
need ltrace output instead of strace, since all the calls are made in userspace.
I did test the ltrace command before sharing it with you, just to be sure it
works, but I see that the arguments to the library calls are not printed as symbols.

Since you're seeing this issue only with gfapi, I'm passing this issue over to
gfapi experts for a faster resolution.

Poornima/Soumya/Jiffin,

Could one of you help?

-Krutika

--- Additional comment from Soumya Koduri on 2019-03-20 17:38:10 UTC ---

To start with, it would be helpful to get the logs exclusive to gfapi access,
plus a tcpdump captured while the command below is run -

qemu-img info gluster://$ip/filename

--- Additional comment from Krutika Dhananjay on 2019-03-21 05:45:15 UTC ---

Setting needinfo on the reporter to get the info requested in comment 11.

--- Additional comment from gaojianan on 2019-03-22 06:55:49 UTC ---



--- Additional comment from Yaniv Kaul on 2019-04-22 07:19:24 UTC ---

Status?

--- Additional comment from PnT Account Manager on 2019-11-04 22:30:24 UTC ---

Employee 'pgurusid at redhat.com' has left the company.

--- Additional comment from Mohit Agrawal on 2019-11-19 13:52:39 UTC ---

@Soumya

Did you get a chance to analyze the logs and tcpdump?

Thanks,
Mohit Agrawal

--- Additional comment from Soumya Koduri on 2019-11-20 18:01:32 UTC ---

(In reply to Mohit Agrawal from comment #16)
> @Soumya
> 
> Did you get a chance to analyze the logs and tcpdump?
> 
> Thanks,
> Mohit Agrawal

Hi,

I just looked at the files uploaded. The tcpdump doesn't have gluster traffic
captured. Please ensure the command was issued on the right machine (where
qemu-img is being executed) and verify the filters (right interface, IP, etc.).

From the logs, I see there is a failure for the SEEK() fop -



[2019-03-22 06:47:34.557047] T [MSGID: 0] [dht-hashfn.c:94:dht_hash_compute]
0-gv1-dht: trying regex for test.img
[2019-03-22 06:47:34.557059] D [MSGID: 0] [dht-common.c:3675:dht_lookup]
0-gv1-dht: Calling fresh lookup for /test.img on gv1-client-0
[2019-03-22 06:47:34.557067] T [MSGID: 0] [dht-common.c:3679:dht_lookup]
0-stack-trace: stack-address: 0x55ce03dd1720, winding from gv1-dht to
gv1-client-0
[2019-03-22 06:47:34.557079] T [rpc-clnt.c:1496:rpc_clnt_record]
0-gv1-client-0: Auth Info: pid: 10233, uid: 0, gid: 0, owner: 
[2019-03-22 06:47:34.557086] T [rpc-clnt.c:1353:rpc_clnt_record_build_header]
0-rpc-clnt: Request fraglen 420, payload: 348, rpc hdr: 72
[2019-03-22 06:47:34.557110] T [rpc-clnt.c:1699:rpc_clnt_submit] 0-rpc-clnt:
submitted request (XID: 0xb Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to
rpc-transport (gv1-client-0)
[2019-03-22 06:47:34.557513] T [rpc-clnt.c:675:rpc_clnt_reply_init]
0-gv1-client-0: received rpc message (RPC XID: 0xb Program: GlusterFS 3.3,
ProgVers: 330, Proc: 27) from rpc-transport (gv1-client-0)
[2019-03-22 06:47:34.557536] T [MSGID: 0]
[client-rpc-fops.c:2873:client3_3_lookup_cbk] 0-stack-trace: stack-address:
0x55ce03dd1720, gv1-client-0 returned 0
[2019-03-22 06:47:34.557549] D [MSGID: 0] [dht-common.c:3228:dht_lookup_cbk]
0-gv1-dht: fresh_lookup returned for /test.img with op_ret 0


>> LOOKUP on  /test.img was successful



[2019-03-22 06:47:34.563416] T [MSGID: 0] [defaults.c:2927:default_seek]
0-stack-trace: stack-address: 0x55ce03dd1720, winding from gv1-read-ahead to
gv1-write-behind
[2019-03-22 06:47:34.563424] T [MSGID: 0] [defaults.c:2927:default_seek]
0-stack-trace: stack-address: 0x55ce03dd1720, winding from gv1-write-behind to
gv1-dht
[2019-03-22 06:47:34.563432] T [MSGID: 0] [defaults.c:2927:default_seek]
0-stack-trace: stack-address: 0x55ce03dd1720, winding from gv1-dht to
gv1-client-0
[2019-03-22 06:47:34.563443] T [rpc-clnt.c:1496:rpc_clnt_record]
0-gv1-client-0: Auth Info: pid: 10233, uid: 0, gid: 0, owner: 
[2019-03-22 06:47:34.563451] T [rpc-clnt.c:1353:rpc_clnt_record_build_header]
0-rpc-clnt: Request fraglen 112, payload: 40, rpc hdr: 72
[2019-03-22 06:47:34.563478] T [rpc-clnt.c:1699:rpc_clnt_submit] 0-rpc-clnt:
submitted request (XID: 0xc Program: GlusterFS 3.3, ProgVers: 330, Proc: 48) to
rpc-transport (gv1-client-0)
[2019-03-22 06:47:34.563990] T [rpc-clnt.c:675:rpc_clnt_reply_init]
0-gv1-client-0: received rpc message (RPC XID: 0xc Program: GlusterFS 3.3,
ProgVers: 330, Proc: 48) from rpc-transport (gv1-client-0)
[2019-03-22 06:47:34.564008] W [MSGID: 114031]
[client-rpc-fops.c:2156:client3_3_seek_cbk] 0-gv1-client-0: remote operation
failed [No such device or address]
[2019-03-22 06:47:34.564028] D [MSGID: 0]
[client-rpc-fops.c:2160:client3_3_seek_cbk] 0-stack-trace: stack-address:
0x55ce03dd1720, gv1-client-0 returned -1 error: No such device or address [No
such device or address]
[2019-03-22 06:47:34.564041] D [MSGID: 0] [defaults.c:1531:default_seek_cbk]
0-stack-trace: stack-address: 0x55ce03dd1720, gv1-io-threads returned -1 error:
No such device or address [No such device or address]
[2019-03-22 06:47:34.564051] D [MSGID: 0] [io-stats.c:2548:io_stats_seek_cbk]
0-stack-trace: stack-address: 0x55ce03dd1720, gv1 returned -1 error: No such
device or address [No such device or address]

client3_seek_cbk() received '-1'. We may first need to check why the server
failed the fop. If it's reproducible, it should be fairly easy to check.
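
For reference, "No such device or address" is ENXIO. For lseek(2) with SEEK_DATA/SEEK_HOLE, Linux returns ENXIO when the requested offset is at or past the end of the file, or (for SEEK_DATA) falls inside a trailing hole. The small, self-contained sketch below (plain POSIX, no gluster code involved, hypothetical temp path) demonstrates that behavior; whether it is the reason the brick failed this particular fop depends on what offset the SEEK request carried and what the brick's filesystem did with it.

/* seek-enxio-demo.c - local illustration of ENXIO from SEEK_DATA/SEEK_HOLE. */
#define _GNU_SOURCE            /* SEEK_DATA / SEEK_HOLE */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/tmp/seek-enxio-demo", O_CREAT | O_TRUNC | O_RDWR, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, 4096) < 0) { perror("ftruncate"); return 1; }   /* 4 KiB hole, no data */

    /* On filesystems that report holes (XFS, ext4, ...) there is no data at or
     * after offset 0 in this file, so SEEK_DATA fails with ENXIO.  Filesystems
     * without hole support treat the whole file as data and return 0 instead. */
    errno = 0;
    off_t off = lseek(fd, 0, SEEK_DATA);
    if (off == (off_t)-1)
        printf("lseek(0, SEEK_DATA): %s\n", strerror(errno));
    else
        printf("lseek(0, SEEK_DATA) -> %lld\n", (long long)off);

    /* Probing past EOF returns ENXIO for both SEEK_DATA and SEEK_HOLE. */
    errno = 0;
    off = lseek(fd, 8192, SEEK_HOLE);
    if (off == (off_t)-1)
        printf("lseek(8192, SEEK_HOLE): %s\n", strerror(errno));

    close(fd);
    unlink("/tmp/seek-enxio-demo");
    return 0;
}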

--- Additional comment from Mohit Agrawal on 2019-11-21 02:53:42 UTC ---

@gaojianan
Can you share the data Soumya asked for, along with the brick logs, client
logs, and tcpdump?

--- Additional comment from gaojianan on 2019-11-21 06:55:31 UTC ---

(In reply to Mohit Agrawal from comment #18)
> @gaojianan
> Can you share the data asked by Soumya and share the brick logs along with
> data(client-logs and tcpdump)?
client version:
glusterfs-client-xlators-6.0-20.el8.x86_64
glusterfs-libs-6.0-20.el8.x86_64
qemu-kvm-block-gluster-4.1.0-13.module+el8.1.0+4313+ef76ec61.x86_64
glusterfs-fuse-6.0-20.el8.x86_64
libvirt-daemon-driver-storage-gluster-5.6.0-7.module+el8.1.1+4483+2f45aaa2.x86_64
glusterfs-api-6.0-20.el8.x86_64
glusterfs-cli-6.0-20.el8.x86_64
glusterfs-6.0-20.el8.x86_64



Tried again with the same steps as in comment 1.
Steps to Reproduce:
1. Mount the gluster volume on local /tmp/gluster
# mount.glusterfs 10.66.85.243:/jgao-vol1 /tmp/gluster

2. Create a new qcow2 file
# qemu-img create -f qcow2 /tmp/gluster/test.img 100M

3. Check it with qemu-img over gluster://
[root@localhost ~]# qemu-img info gluster://10.66.85.243/jgao-vol1/test.img
qemu-img: Could not open 'gluster://10.66.85.243/jgao-vol1/test.img': Could not
read L1 table: Input/output error

More detailed info is in the attachment.
If there are any other questions, you can needinfo me again.

--- Additional comment from Han Han on 2019-11-21 07:11:05 UTC ---

(In reply to gaojianan from comment #19)
> Created attachment 1638327 [details]
> tcpdump log and gfapi log of the client
The tcpdump file contains too much other protocol data. It is better to use a
filter to capture only glusterfs-related network traffic.

BTW, I have a question: what ports does glusterfs use by default for
gluster-server-6.0.x? 24007-24009? 49152?

> 
> (In reply to Mohit Agrawal from comment #18)
> > @gaojianan
> > Can you share the data asked by Soumya and share the brick logs along with
> > data(client-logs and tcpdump)?
> client version:
> glusterfs-client-xlators-6.0-20.el8.x86_64
> glusterfs-libs-6.0-20.el8.x86_64
> qemu-kvm-block-gluster-4.1.0-13.module+el8.1.0+4313+ef76ec61.x86_64
> glusterfs-fuse-6.0-20.el8.x86_64
> libvirt-daemon-driver-storage-gluster-5.6.0-7.module+el8.1.1+4483+2f45aaa2.
> x86_64
> glusterfs-api-6.0-20.el8.x86_64
> glusterfs-cli-6.0-20.el8.x86_64
> glusterfs-6.0-20.el8.x86_64
> 
> 
> 
> Try again with the step as comment1.
> Steps to Reproduce:
> 1.Mount the gluster directory to local /tmp/gluster
> # mount.glusterfs 10.66.85.243:/jgao-vol1 /tmp/gluster
> 
> 2.Create a new qcow2 file 
> # qemu-img create -f qcow2 /tmp/gluster/test.img 100M
> 
> 3.check it with qemu-img with gluster
> [root at localhost ~]# qemu-img info gluster://10.66.85.243/jgao-vol1/test.img
> qemu-img: Could not open 'gluster://10.66.85.243/jgao-vol1/test.img': Could
> not read L1 table: Input/output error
> 
> More detail info in the attachment
> If any other question,you can needinfo me again.

--- Additional comment from Han Han on 2019-11-21 07:14:37 UTC ---

Also, please upload the brick logs as comment 18 asked. Those logs are located
in /var/log/glusterfs/bricks/ on the glusterfs server.

--- Additional comment from gaojianan on 2019-11-21 07:57:53 UTC ---

In the brick logs, "gluster-vol1" is the same volume as "jgao-vol1" in the
other two files, because I destroyed my env and set it up again.

--- Additional comment from Mohit Agrawal on 2019-11-22 04:45:08 UTC ---

@Soumya

Could you please check the latest logs and tcpdump?

Thanks,
Mohit Agrawal

--- Additional comment from Soumya Koduri on 2019-11-22 06:57:08 UTC ---

From the latest debug.log provided, I see this error -

[2019-11-21 06:34:15.127610] D [MSGID: 0]
[client-helpers.c:427:client_get_remote_fd] 0-jgao-vol1-client-0: not a valid
fd for gfid: 59ca8bf2-f75a-427f-857e-98843a85dbac [Bad file descriptor]
[2019-11-21 06:34:15.127620] W [MSGID: 114061]
[client-common.c:1288:client_pre_seek] 0-jgao-vol1-client-0: 
(59ca8bf2-f75a-427f-857e-98843a85dbac) remote_fd is -1. EBADFD [File descriptor
in bad state]
[2019-11-21 06:34:15.127628] D [MSGID: 0]
[client-rpc-fops.c:5949:client3_3_seek] 0-stack-trace: stack-address:
0x5625eed41b08, jgao-vol1-client-0 returned -1 error: File descriptor in bad
state [File descriptor in bad state]
[2019-11-21 06:34:15.127636] D [MSGID: 0] [defaults.c:1617:default_seek_cbk]
0-stack-trace: stack-address: 0x5625eed41b08, jgao-vol1-io-threads returned -1
error: File descriptor in bad state [File descriptor in bad state]

The client3_seek fop got an EBADFD error. The fd used in the fop may have been
flushed and is no longer valid. On further code reading, I found that there is
a bug in the glfs_seek() fop: there is a missing ref on the glfd, which may
have led to this issue. I will send a patch to fix that.
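
For illustration only (this is not the submitted patch): the other fd-based gfapi entry points in glfs-fops.c pin the glfd for the duration of the call so that a concurrent close/flush cannot release it underneath them; a guard of roughly the shape sketched below being absent on the seek path would fit the remote_fd == -1 / EBADFD symptom in the log above. GF_REF_GET/GF_REF_PUT are the refcount helpers from the gluster tree; the body is simplified pseudocode, not real gluster source.

/* Simplified pseudocode sketch, NOT the actual fix. */
off_t
glfs_seek_sketch (struct glfs_fd *glfd, off_t offset, int whence)
{
        off_t ret = -1;

        GF_REF_GET (glfd);     /* take a ref so the glfd stays valid for the fop */

        /* ... resolve the backend fd, wind the SEEK fop, wait for the reply ... */

        GF_REF_PUT (glfd);     /* drop the ref once the fop has completed */
        return ret;
}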

However, I am unable to reproduce this issue to test it. On my system the
test always passes -


[root@dhcp35-198 ~]# qemu-img create -f qcow2 /fuse-mnt/test.img 100M
Formatting '/fuse-mnt/test.img', fmt=qcow2 size=104857600 encryption=off
cluster_size=65536 lazy_refcounts=off refcount_bits=16
[root@dhcp35-198 ~]# 
[root@dhcp35-198 ~]# 
[root@dhcp35-198 ~]# qemu-img info gluster://localhost/rep_vol/test.img
[2019-11-22 06:36:43.703941] E [MSGID: 108006]
[afr-common.c:5322:__afr_handle_child_down_event] 0-rep_vol-replicate-0: All
subvolumes are down. Going offline until at least one of them comes back up.
[2019-11-22 06:36:43.705035] I [io-stats.c:4027:fini] 0-rep_vol: io-stats
translator unloaded
image: gluster://localhost/rep_vol/test.img
file format: qcow2
virtual size: 100M (104857600 bytes)
disk size: 193K
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
[root@dhcp35-198 ~]# 

I am using the latest master branch of gluster. I shall post the fix for the
bug in glfs_seek mentioned above. But if someone could test it, that would be helpful.


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1663431
[Bug 1663431] Get error "Could not read qcow2 header" when reading a qcow2 file
with glusterfs
-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are on the CC list for the bug.
You are the assignee for the bug.


More information about the Bugs mailing list