[Gluster-users] [Possibile SPAM] Re: Problem with Gluster 3.12.4, VM and sharding
    Krutika Dhananjay 
    kdhananj at redhat.com
       
    Thu Jan 18 12:30:06 UTC 2018
    
    
  
Thanks for that input. Adding Niels since the issue is reproducible only
with libgfapi.
-Krutika
On Thu, Jan 18, 2018 at 1:39 PM, Ing. Luca Lazzeroni - Trend Servizi Srl <
luca at gvnet.it> wrote:
> Another update.
>
> I've set up a replica 3 volume without sharding and tried to install a VM
> on a qcow2 disk image stored on it; however, the result is the same and the
> VM image was corrupted at exactly the same point.
>
> Here's the volume info of the created volume:
>
> Volume Name: gvtest
> Type: Replicate
> Volume ID: e2ddf694-ba46-4bc7-bc9c-e30803374e9d
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: gluster1:/bricks/brick1/gvtest
> Brick2: gluster2:/bricks/brick1/gvtest
> Brick3: gluster3:/bricks/brick1/gvtest
> Options Reconfigured:
> user.cifs: off
> features.shard: off
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 8
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> cluster.eager-lock: enable
> network.remote-dio: enable
> performance.low-prio-threads: 32
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
>
>
> On 17/01/2018 14:51, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:
>
> Hi,
>
> After our IRC chat I rebuilt a virtual machine with a FUSE-based virtual
> disk. Everything worked flawlessly.
>
> Now I'm sending you the output of the requested getfattr command on the
> disk image:
>
> # file: TestFUSE-vda.qcow2
> trusted.afr.dirty=0x000000000000000000000000
> trusted.gfid=0x40ffafbbe987445692bb31295fa40105
> trusted.gfid2path.dc9dde61f0b77eab=0x31326533323631662d373839332d346262302d383738632d3966623765306232336263652f54657374465553452d7664612e71636f7732
> trusted.glusterfs.shard.block-size=0x0000000004000000
> trusted.glusterfs.shard.file-size=0x00000000c15300000000000000000000000000000060be900000000000000000
>
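The hex values in the getfattr output above can be decoded by hand. The following is a minimal Python sketch, not an official Gluster tool. It relies on two assumptions that the thread itself does not confirm: that the gfid2path value is plain ASCII, and that the shard file-size value consists of four big-endian 64-bit words, with the first holding the file size in bytes and the third holding the count of 512-byte blocks. The function names are made up for illustration.

```python
import struct
from typing import Tuple

# Values copied from the getfattr output above, with the mail line wraps removed.
GFID2PATH = ("0x31326533323631662d373839332d"
             "346262302d383738632d3966623765306232336263652f54657374465553"
             "452d7664612e71636f7732")
BLOCK_SIZE = "0x0000000004000000"
FILE_SIZE = ("0x00000000c1530000000000000000"
             "0000000000000060be900000000000000000")


def hex_to_bytes(value: str) -> bytes:
    """Turn a getfattr hex string (with or without the 0x prefix) into bytes."""
    return bytes.fromhex(value[2:] if value.startswith("0x") else value)


def decode_gfid2path(value: str) -> str:
    """ASSUMPTION: the value is ASCII text."""
    return hex_to_bytes(value).decode("ascii")


def decode_block_size(value: str) -> int:
    """Interpret the shard block size as one big-endian integer (bytes)."""
    return int.from_bytes(hex_to_bytes(value), "big")


def decode_file_size(value: str) -> Tuple[int, int]:
    """ASSUMPTION: four big-endian 64-bit words; word 0 is the size in
    bytes and word 2 is the number of 512-byte blocks."""
    words = struct.unpack(">4Q", hex_to_bytes(value))
    return words[0], words[2]


if __name__ == "__main__":
    print("gfid2path :", decode_gfid2path(GFID2PATH))
    print("block size:", decode_block_size(BLOCK_SIZE), "bytes")
    size, blocks = decode_file_size(FILE_SIZE)
    print("file size :", size, "bytes in", blocks, "512-byte blocks")
```

Under those assumptions the shard block size comes out as 64 MiB, and the gfid2path value decodes to a UUID-like string followed by the image file name.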
> Hope this helps.
>
>
>
> On 17/01/2018 11:37, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:
>
> I actually use FUSE and it works. If I try to use the direct "libgfapi"
> interface to Gluster in qemu-kvm, the problem appears.
>
>
>
> On 17/01/2018 11:35, Krutika Dhananjay wrote:
>
> Really? Then which protocol exactly do you see this issue with? libgfapi?
> NFS?
>
> -Krutika
>
> On Wed, Jan 17, 2018 at 3:59 PM, Ing. Luca Lazzeroni - Trend Servizi Srl <
> luca at gvnet.it> wrote:
>
>> Of course. Here's the full log. Please note that in FUSE mode everything
>> apparently works without problems. I've installed 4 VMs and updated them
>> without problems.
>>
>>
>>
>> On 17/01/2018 11:00, Krutika Dhananjay wrote:
>>
>>
>>
>> On Tue, Jan 16, 2018 at 10:47 PM, Ing. Luca Lazzeroni - Trend Servizi Srl
>> <luca at gvnet.it> wrote:
>>
>>> I've run the test with the raw image format (preallocated too) and the
>>> corruption problem is still there (but without errors in the bricks' log files).
>>>
>>> What does the "link" error in the bricks' log files mean?
>>>
>>> I've looked through the source code for the lines where it occurs, and it
>>> seems to be a warning (it doesn't imply a failure).
>>>
>>
>> Indeed, it only represents a transient state when the shards are created
>> for the first time and does not indicate a failure.
>> Could you also get the logs of the gluster fuse mount process? It should
>> be under /var/log/glusterfs of your client machine with the filename as a
>> hyphenated mount point path.
>>
>> For example, if your volume was mounted at /mnt/glusterfs, then your log
>> file would be named mnt-glusterfs.log.
>>
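The naming rule described above (mount point path with the slashes turned into hyphens, plus ".log") can be sketched in a few lines of Python. This is only an illustration of the simple case given in the example; the helper names are hypothetical, and how Gluster treats unusual mount paths is not covered here.

```python
from pathlib import PurePosixPath

LOG_DIR = "/var/log/glusterfs"  # log directory named in the message above


def fuse_log_name(mount_point: str) -> str:
    """Join the mount point's path components with hyphens and add '.log'."""
    parts = [p for p in PurePosixPath(mount_point).parts if p.strip("/")]
    return "-".join(parts) + ".log"


def fuse_log_path(mount_point: str) -> str:
    """Full path where the FUSE client log would be expected."""
    return LOG_DIR + "/" + fuse_log_name(mount_point)


if __name__ == "__main__":
    print(fuse_log_path("/mnt/glusterfs"))
```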
>> -Krutika
>>
>>
>>>
>>> On 16/01/2018 17:39, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:
>>>
>>> An update:
>>>
>>> For my tests, I tried creating the VM disk image with:
>>>
>>> qemu-img create -f qcow2 -o preallocation=full
>>> gluster://gluster1/Test/Test-vda.img 20G
>>>
>>> et voilà!
>>>
>>> No errors at all, neither in the bricks' log file (the "link failed" message
>>> disappeared) nor in the VM (no corruption, and it installed successfully).
>>>
>>> I'll do another test with a fully preallocated raw image.
>>>
>>>
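The two qemu-img invocations in this thread differ in how the image is reached: a gluster:// URI is handled by QEMU's Gluster block driver (libgfapi), while a plain file name on a mounted volume goes through the FUSE client. As a rough aid for telling them apart, here is a small Python sketch that splits such a target into server, volume and image path. The function name and the returned dictionary keys are invented for this example, and the server/volume/image layout of the URI is taken from the command shown above.

```python
from urllib.parse import urlsplit


def describe_image_target(target: str) -> dict:
    """Classify a qemu-img target as a gluster:// URI or a local path."""
    parts = urlsplit(target)
    if not parts.scheme.startswith("gluster"):
        # No gluster scheme: treated as a path on a local or FUSE mount.
        return {"access": "local-path", "path": target}
    # First path component is the volume, the rest is the image inside it.
    volume, _, image = parts.path.lstrip("/").partition("/")
    return {
        "access": "libgfapi",
        "server": parts.hostname,
        "volume": volume,
        "image": image,
    }


if __name__ == "__main__":
    print(describe_image_target("gluster://gluster1/Test/Test-vda.img"))
    print(describe_image_target("Test-vda2.qcow2"))
```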
>>>
>>> On 16/01/2018 16:31, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:
>>>
>>> I've just done all the steps to reproduce the problem.
>>>
>>> The VM disk image was created via "qemu-img create -f qcow2
>>> Test-vda2.qcow2 20G" on the gluster volume mounted via FUSE. I also tried
>>> creating the image with preallocated metadata, which only delays the
>>> problem. The volume is a replica 3 arbiter 1
>>> volume hosted on XFS bricks.
>>>
>>> Here is the information:
>>>
>>> [root@ovh-ov1 bricks]# gluster volume info gv2a2
>>>
>>> Volume Name: gv2a2
>>> Type: Replicate
>>> Volume ID: 83c84774-2068-4bfc-b0b9-3e6b93705b9f
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x (2 + 1) = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: gluster1:/bricks/brick2/gv2a2
>>> Brick2: gluster3:/bricks/brick3/gv2a2
>>> Brick3: gluster2:/bricks/arbiter_brick_gv2a2/gv2a2 (arbiter)
>>> Options Reconfigured:
>>> storage.owner-gid: 107
>>> storage.owner-uid: 107
>>> user.cifs: off
>>> features.shard: on
>>> cluster.shd-wait-qlength: 10000
>>> cluster.shd-max-threads: 8
>>> cluster.locking-scheme: granular
>>> cluster.data-self-heal-algorithm: full
>>> cluster.server-quorum-type: server
>>> cluster.quorum-type: auto
>>> cluster.eager-lock: enable
>>> network.remote-dio: enable
>>> performance.low-prio-threads: 32
>>> performance.io-cache: off
>>> performance.read-ahead: off
>>> performance.quick-read: off
>>> transport.address-family: inet
>>> nfs.disable: off
>>> performance.client-io-threads: off
>>>
>>> /var/log/glusterfs/glusterd.log:
>>>
>>> [2018-01-15 14:17:50.196228] I [MSGID: 106488]
>>> [glusterd-handler.c:1548:__glusterd_handle_cli_get_volume]
>>> 0-management: Received get vol req
>>> [2018-01-15 14:25:09.555214] I [MSGID: 106488]
>>> [glusterd-handler.c:1548:__glusterd_handle_cli_get_volume]
>>> 0-management: Received get vol req
>>>
>>> (no entries for today, 2018-01-16)
>>>
>>> /var/log/glusterfs/glustershd.log:
>>>
>>> [2018-01-14 02:23:02.731245] I [glusterfsd-mgmt.c:1821:mgmt_getspec_cbk]
>>> 0-glusterfs: No change in volfile,continuing
>>>
>>> (no entries for today either)
>>>
>>> /var/log/glusterfs/bricks/brick-brick2-gv2a2.log (the affected
>>> volume):
>>>
>>> [2018-01-16 15:14:37.809965] I [MSGID: 115029]
>>> [server-handshake.c:793:server_setvolume] 0-gv2a2-server: accepted
>>> client from ovh-ov1-10302-2018/01/16-15:14:37:790306-gv2a2-client-0-0-0
>>> (version: 3.12.4)
>>> [2018-01-16 15:16:41.471751] E [MSGID: 113020]
>>> [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on
>>> /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.4
>>> failed
>>> [2018-01-16 15:16:41.471745] W [MSGID: 113096]
>>> [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link
>>> /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.4 ->
>>> /bricks/brick2/gv2a2/.glusterfs/a0/14/a0144df3-8d89-4aed-872e-5fef141e9e1efailed
>>> [File exists]
>>> [2018-01-16 15:16:42.593392] W [MSGID: 113096]
>>> [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link
>>> /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.5 ->
>>> /bricks/brick2/gv2a2/.glusterfs/eb/04/eb044e6e-3a23-40a4-9ce1-f13af148eb67failed
>>> [File exists]
>>> [2018-01-16 15:16:42.593426] E [MSGID: 113020]
>>> [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on
>>> /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.5
>>> failed
>>> [2018-01-16 15:17:04.129593] W [MSGID: 113096]
>>> [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link
>>> /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.8 ->
>>> /bricks/brick2/gv2a2/.glusterfs/dc/92/dc92bd0a-0d46-4826-a4c9-d073a924dd8dfailed
>>> [File exists]
>>> The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard]
>>> 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62
>>> 335cb9-c7b5-4735-a879-59cff93fe622.8 -> /bricks/brick2/gv2a2/.glusterf
>>> s/dc/92/dc92bd0a-0d46-4826-a4c9-d073a924dd8dfailed  [File exists]"
>>> repeated 5 times between [2018-01-16 15:17:04.129593] and [2018-01-16
>>> 15:17:04.129593]
>>> [2018-01-16 15:17:04.129661] E [MSGID: 113020]
>>> [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on
>>> /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.8
>>> failed
>>> [2018-01-16 15:17:08.279162] W [MSGID: 113096]
>>> [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link
>>> /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.9 ->
>>> /bricks/brick2/gv2a2/.glusterfs/c9/b7/c9b71b00-a09f-4df1-b874-041820ca8241failed
>>> [File exists]
>>> [2018-01-16 15:17:08.279162] W [MSGID: 113096]
>>> [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link
>>> /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.9 ->
>>> /bricks/brick2/gv2a2/.glusterfs/c9/b7/c9b71b00-a09f-4df1-b874-041820ca8241failed
>>> [File exists]
>>> The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard]
>>> 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62
>>> 335cb9-c7b5-4735-a879-59cff93fe622.9 -> /bricks/brick2/gv2a2/.glusterf
>>> s/c9/b7/c9b71b00-a09f-4df1-b874-041820ca8241failed  [File exists]"
>>> repeated 2 times between [2018-01-16 15:17:08.279162] and [2018-01-16
>>> 15:17:08.279162]
>>>
>>> [2018-01-16 15:17:08.279177] E [MSGID: 113020]
>>> [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on
>>> /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.9
>>> failed
>>> The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard]
>>> 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62
>>> 335cb9-c7b5-4735-a879-59cff93fe622.4 -> /bricks/brick2/gv2a2/.glusterf
>>> s/a0/14/a0144df3-8d89-4aed-872e-5fef141e9e1efailed  [File exists]"
>>> repeated 6 times between [2018-01-16 15:16:41.471745] and [2018-01-16
>>> 15:16:41.471807]
>>> The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard]
>>> 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62
>>> 335cb9-c7b5-4735-a879-59cff93fe622.5 -> /bricks/brick2/gv2a2/.glusterf
>>> s/eb/04/eb044e6e-3a23-40a4-9ce1-f13af148eb67failed  [File exists]"
>>> repeated 2 times between [2018-01-16 15:16:42.593392] and [2018-01-16
>>> 15:16:42.593430]
>>> [2018-01-16 15:17:32.229689] W [MSGID: 113096]
>>> [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link
>>> /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.14 ->
>>> /bricks/brick2/gv2a2/.glusterfs/53/04/530449fa-d698-4928-a262-9a0234232323failed
>>> [File exists]
>>> [2018-01-16 15:17:32.229720] E [MSGID: 113020]
>>> [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on
>>> /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.14
>>> failed
>>> [2018-01-16 15:18:07.154330] W [MSGID: 113096]
>>> [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link
>>> /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.17 ->
>>> /bricks/brick2/gv2a2/.glusterfs/81/96/8196dd19-84bc-4c3d-909f-8792e9b4929dfailed
>>> [File exists]
>>> [2018-01-16 15:18:07.154375] E [MSGID: 113020]
>>> [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on
>>> /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.17
>>> failed
>>> The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard]
>>> 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62
>>> 335cb9-c7b5-4735-a879-59cff93fe622.14 -> /bricks/brick2/gv2a2/.glusterf
>>> s/53/04/530449fa-d698-4928-a262-9a0234232323failed  [File exists]"
>>> repeated 7 times between [2018-01-16 15:17:32.229689] and [2018-01-16
>>> 15:17:32.229806]
>>> The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard]
>>> 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62
>>> 335cb9-c7b5-4735-a879-59cff93fe622.17 -> /bricks/brick2/gv2a2/.glusterf
>>> s/81/96/8196dd19-84bc-4c3d-909f-8792e9b4929dfailed  [File exists]"
>>> repeated 3 times between [2018-01-16 15:18:07.154330] and [2018-01-16
>>> 15:18:07.154357]
>>> [2018-01-16 15:19:23.618794] W [MSGID: 113096]
>>> [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link
>>> /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.21 ->
>>> /bricks/brick2/gv2a2/.glusterfs/6d/02/6d02bd98-83de-43e8-a7af-b1d5f5160403failed
>>> [File exists]
>>> [2018-01-16 15:19:23.618827] E [MSGID: 113020]
>>> [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on
>>> /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.21
>>> failed
>>> The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard]
>>> 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62
>>> 335cb9-c7b5-4735-a879-59cff93fe622.21 -> /bricks/brick2/gv2a2/.glusterf
>>> s/6d/02/6d02bd98-83de-43e8-a7af-b1d5f5160403failed  [File exists]"
>>> repeated 3 times between [2018-01-16 15:19:23.618794] and [2018-01-16
>>> 15:19:23.618794]
>>>
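In every "link ... failed" warning above, the link target under .glusterfs follows the same pattern: the first two hex digits of the GFID, then the next two, then the full GFID. A short Python sketch of that mapping is given below, assuming the pattern seen in these log lines holds in general; the function names are hypothetical.

```python
import posixpath
import uuid


def normalize_gfid(gfid: str) -> str:
    """Accept 'a0144df3-...' or a getfattr-style '0x...' value and return
    the canonical hyphenated lowercase form."""
    if gfid.lower().startswith("0x"):
        gfid = gfid[2:]
    return str(uuid.UUID(gfid))  # raises ValueError on malformed input


def gfid_handle_path(brick_root: str, gfid: str) -> str:
    """Build <brick>/.glusterfs/<aa>/<bb>/<gfid>, the layout seen in the log."""
    g = normalize_gfid(gfid)
    return posixpath.join(brick_root, ".glusterfs", g[0:2], g[2:4], g)


if __name__ == "__main__":
    # GFID taken from the first warning in the brick log above.
    print(gfid_handle_path("/bricks/brick2/gv2a2",
                           "a0144df3-8d89-4aed-872e-5fef141e9e1e"))
```

A "File exists" result for that link means the handle path was already present on the brick when the link was attempted.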
>>> Thank you,
>>>
>>> On 16/01/2018 11:40, Krutika Dhananjay wrote:
>>>
>>> Also to help isolate the component, could you answer these:
>>>
>>> 1. on a different volume with shard not enabled, do you see this issue?
>>> 2. on a plain 3-way replicated volume (no arbiter), do you see this
>>> issue?
>>>
>>>
>>>
>>> On Tue, Jan 16, 2018 at 4:03 PM, Krutika Dhananjay <kdhananj at redhat.com>
>>> wrote:
>>>
>>>> Please share the volume-info output and the logs under
>>>> /var/log/glusterfs/ from all your nodes so we can investigate the issue.
>>>>
>>>> -Krutika
>>>>
>>>> On Tue, Jan 16, 2018 at 1:30 PM, Ing. Luca Lazzeroni - Trend Servizi
>>>> Srl <luca at gvnet.it> wrote:
>>>>
>>>>> Hi everyone.
>>>>>
>>>>> I've got a strange problem with a gluster setup: 3 nodes with CentOS
>>>>> 7.4, Gluster 3.12.4 from the CentOS/Gluster repositories, QEMU-KVM version
>>>>> 2.9.0 (compiled from RHEL sources).
>>>>>
>>>>> I'm running volumes in replica 3 arbiter 1 mode (but I've got a volume
>>>>> in "pure" replica 3 mode too). I've applied the "virt" group settings to my
>>>>> volumes since they host VM images.
>>>>>
>>>>> If I try to install something (e.g. Ubuntu Server 16.04.3) on a VM (and
>>>>> so generate a bit of I/O inside it) and configure KVM to access the gluster
>>>>> volume directly (via libvirt), the install fails after a while because the disk
>>>>> content is corrupted. If I inspect the blocks inside the disk (by accessing
>>>>> the image directly from outside) I find many files filled with "^@" (NUL bytes).
>>>>>
>>>>
>>> Also, what exactly do you mean by accessing the image directly from
>>> outside? Was it from the brick directories directly? Was it from the mount
>>> point of the volume? Could you elaborate? Which files exactly did you check?
>>>
>>> -Krutika
>>>
>>>
>>>>> If, instead, I configure KVM to access VM images via a FUSE mount,
>>>>> everything seems to work correctly.
>>>>>
>>>>> Note that the install problem occurs 100% of the time with QCOW2
>>>>> images, while with RAW disk images it appears only later.
>>>>>
>>>>> Has anyone experienced the same problem?
>>>>>
>>>>> Thank you,
>>>>>
>>>>>
>>>>> --
>>>>> Ing. Luca Lazzeroni
>>>>> Responsabile Ricerca e Sviluppo
>>>>> Trend Servizi Srl
>>>>> Tel: 0376/631761
>>>>> Web: https://www.trendservizi.it
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>
>>>>
>>>>
>>>