[Gluster-users] Migrating a VM makes its gluster storage inaccessible

Paul Robert Marino prmarino1 at gmail.com
Wed Jan 22 16:20:17 UTC 2014


Are you doing anything like NATing the VMs on the physical host, or
do you have any iptables forward rules on the physical host?
If so, you may have a connection tracking issue.
If that's the case, there are a couple of ways you can fix it. The
easiest is to install conntrackd on the physical hosts and
configure it to sync directly into the live connection tracking table;
however, that does limit your scaling capabilities for your failover
zones.
The second is simply not to do that any more.



On Wed, Jan 22, 2014 at 9:38 AM, Paul Boven <boven at jive.nl> wrote:
> Hi Josh, everyone,
>
> I've just tried the server.allow-insecure option, and it makes no
> difference.
>
> You can find a summary and the logfiles at this URL:
>
> http://epboven.home.xs4all.nl/gluster-migrate.html
>
> The migration itself happens at 14:00:00, with the first write access
> attempt by the migrated guest at 14:00:25 which results in the 'permission
> denied' errors in the gluster.log. Some highlights from gluster.log:
>
> [2014-01-22 14:00:00.779741] D
> [afr-common.c:131:afr_lookup_xattr_req_prepare] 0-gv0-replicate-0:
> /kvmtest.raw: failed to get the gfid from dict
>
> [2014-01-22 14:00:00.780458] D
> [afr-common.c:1380:afr_lookup_select_read_child] 0-gv0-replicate-0: Source
> selected as 1 for /kvmtest.raw
>
> [2014-01-22 14:00:25.176181] W [client-rpc-fops.c:471:client3_3_open_cbk]
> 0-gv0-client-1: remote operation failed: Permission denied. Path:
> /kvmtest.raw (f7ed9edd-c6bd-4e86-b448-1d98bb38314b)
>
> [2014-01-22 14:00:25.176322] W [fuse-bridge.c:2167:fuse_writev_cbk]
> 0-glusterfs-fuse: 2494829: WRITE => -1 (Permission denied)
>
> Regards, Paul Boven.
>
>
> On 01/21/2014 05:35 PM, Josh Boon wrote:
>>
>> Hey Paul,
>>
>>
>> Have you tried server.allow-insecure: on as a volume option? If that
>> doesn't work we'll need the logs for both bricks.
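For reference, setting that option would look like this (volume name taken from later in this thread; note that on some 3.4.x builds the glusterd side needs a matching option as well):

```shell
# Allow clients connecting from unprivileged (>1024) source ports
gluster volume set gv0 server.allow-insecure on

# On some 3.4.x setups glusterd itself must also accept insecure
# ports; add this line to /etc/glusterfs/glusterd.vol and restart
# glusterd on every node:
#   option rpc-auth-allow-insecure on

# Verify the option shows up under "Options Reconfigured"
gluster volume info gv0
```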
>>
>> Best,
>> Josh
>>
>> ----- Original Message -----
>> From: "Paul Boven" <boven at jive.nl>
>> To: gluster-users at gluster.org
>> Sent: Tuesday, January 21, 2014 11:12:03 AM
>> Subject: Re: [Gluster-users] Migrating a VM makes its gluster storage
>> inaccessible
>>
>> Hi Josh, everyone,
>>
>> Glad you're trying to help, so no need to apologize at all.
>>
>> mount output:
>> /dev/sdb1 on /export/brick0 type xfs (rw)
>>
>> localhost:/gv0 on /gluster type fuse.glusterfs
>> (rw,default_permissions,allow_other,max_read=131072)
>>
>> gluster volume info all:
>> Volume Name: gv0
>> Type: Replicate
>> Volume ID: ee77a036-50c7-4a41-b10d-cc0703769df9
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: 10.88.4.0:/export/brick0/sdb1
>> Brick2: 10.88.4.1:/export/brick0/sdb1
>> Options Reconfigured:
>> diagnostics.client-log-level: INFO
>> diagnostics.brick-log-level: INFO
>>
>> Regards, Paul Boven.
>>
>>
>>
>>
>> On 01/21/2014 05:02 PM, Josh Boon wrote:
>>>
>>> Hey Paul,
>>>
>>> Definitely looks to be gluster. Sorry about the wrong guess on UID/GID.
>>> What's the output of "mount" and "gluster volume info all"?
>>>
>>> Best,
>>> Josh
>>>
>>>
>>> ----- Original Message -----
>>> From: "Paul Boven" <boven at jive.nl>
>>> To: gluster-users at gluster.org
>>> Sent: Tuesday, January 21, 2014 10:56:34 AM
>>> Subject: Re: [Gluster-users] Migrating a VM makes its gluster storage
>>> inaccessible
>>>
>>> Hi Josh,
>>>
>>> I've taken great care that /etc/passwd and /etc/group are the same on
>>> both machines. When the problem occurs, even root gets 'permission
>>> denied' when trying to read /gluster/guest.raw. So my first reaction was
>>> that it cannot be a uid problem.
>>>
>>> In the normal situation, the storage for a running guest is owned by
>>> libvirt-qemu:kvm. When I shut a guest down (virsh destroy), the
>>> ownership changes to root:root on both cluster servers.
>>>
>>> During a migration (that fails), the ownership also ends up as root:root
>>> on both, which I hadn't noticed before. Filemode is 0644.
>>>
>>> On the originating server, root can still read /gluster/guest.raw,
>>> whereas on the destination, this gives me 'permission denied'.
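A sketch of how to compare what each host sees in that state (paths and the libvirt-qemu user are taken from this thread; adjust for your setup):

```shell
# Run on both cl0 and cl1 while the guest is in the failed state.

# Ownership/mode as seen through the FUSE mount vs. on the brick itself
stat /gluster/kvmtest.raw
stat /export/brick0/sdb1/kvmtest.raw

# Try a read as root and as the qemu user; on the destination the
# FUSE path returns 'Permission denied' even for root, while the
# brick path stays readable
dd if=/gluster/kvmtest.raw of=/dev/null bs=1M count=1
sudo -u libvirt-qemu dd if=/gluster/kvmtest.raw of=/dev/null bs=1M count=1
```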
>>>
>>> The qemu logfile for the guest doesn't show much interesting
>>> information, merely 'shutting down' on the originating server, and the
>>> startup on the destination server. Libvirt/qemu does not seem to be aware
>>> of the situation that the guest ends up in. I'll post the gluster logs
>>> somewhere, too.
>>>
>>>    From the destination server:
>>>
>>> LC_ALL=C
>>> PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin
>>> /usr/bin/kvm -name kvmtest -S -M pc-i440fx-1.4 -m 1024 -smp
>>> 1,sockets=1,cores=1,threads=1 -uuid 97db2d3f-c8e4-31de-9f89-848356b20da5
>>> -nographic -no-user-config -nodefaults -chardev
>>>
>>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/kvmtest.monitor,server,nowait
>>> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
>>> -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive
>>>
>>> file=/gluster/kvmtest.raw,if=none,id=drive-virtio-disk0,format=raw,cache=none
>>> -device
>>>
>>> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>>> -netdev tap,fd=28,id=hostnet0,vhost=on,vhostfd=29 -device
>>>
>>> virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:01:01:11,bus=pci.0,addr=0x3
>>> -chardev pty,id=charserial0 -device
>>> isa-serial,chardev=charserial0,id=serial0 -incoming tcp:0.0.0.0:49166
>>> -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
>>> W: kvm binary is deprecated, please use qemu-system-x86_64 instead
>>> char device redirected to /dev/pts/4 (label charserial0)
>>>
>>> Regards, Paul Boven.
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 01/21/2014 04:22 PM, Josh Boon wrote:
>>>>
>>>>
>>>> Paul,
>>>>
>>>> Sounds like a potential uid/gid problem.  Would you be able to update
>>>> with the logs from cd /var/log/libvirt/qemu/ for the guest from both source
>>>> and destination? Also the gluster logs for the volume would be awesome.
>>>>
>>>>
>>>> Best,
>>>> Josh
>>>>
>>>> ----- Original Message -----
>>>> From: "Paul Boven" <boven at jive.nl>
>>>> To: gluster-users at gluster.org
>>>> Sent: Tuesday, January 21, 2014 9:36:06 AM
>>>> Subject: Re: [Gluster-users] Migrating a VM makes its gluster storage
>>>> inaccessible
>>>>
>>>> Hi James,
>>>>
>>>> Thanks for the quick reply.
>>>>
>>>> We are only using the fuse mounted paths at the moment. So libvirt/qemu
>>>> simply know of these files as /gluster/guest.raw, and the guests are not
>>>> aware of libgluster.
>>>>
>>>> Some version numbers:
>>>>
>>>> Kernel: Ubuntu 3.8.0-35-generic (13.04, Raring)
>>>> Glusterfs: 3.4.1-ubuntu1~raring1
>>>> qemu: 1.4.0+dfsg-1expubuntu4
>>>> libvirt0: 1.0.2-0ubuntu11.13.04.4
>>>> The gluster bricks are on xfs.
>>>>
>>>> Regards, Paul Boven.
>>>>
>>>>
>>>> On 01/21/2014 03:25 PM, James wrote:
>>>>>
>>>>> Are you using the qemu gluster:// storage or are you using a fuse
>>>>> mounted file path?
>>>>>
>>>>> I would actually expect it to work with either, however I haven't had
>>>>> a chance to test this yet.
>>>>>
>>>>> It's probably also useful if you post your qemu versions...
>>>>>
>>>>> James
>>>>>
>>>>> On Tue, Jan 21, 2014 at 9:15 AM, Paul Boven <boven at jive.nl> wrote:
>>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> We've been running glusterfs-3.4.0 on Ubuntu 13.04, using semiosis'
>>>>>> packages. We're using kvm (libvirt) to host guest installs, and thanks
>>>>>> to
>>>>>> gluster and libvirt, we can live-migrate guests between the two hosts.
>>>>>>
>>>>>> Recently I ran an apt-get update/upgrade to stay up-to-date with
>>>>>> security
>>>>>> patches, and this also upgraded our glusterfs to the 3.4.1 version of
>>>>>> the
>>>>>> packages.
>>>>>>
>>>>>> Since this upgrade (which updated the gluster packages, but also the
>>>>>> Ubuntu
>>>>>> kernel package), kvm live migration fails in a most unusual manner.
>>>>>> The live
>>>>>> migration itself succeeds, but on the receiving machine, the
>>>>>> vm-storage for
>>>>>> that machine becomes inaccessible. Which in turn causes the guest OS
>>>>>> to no
>>>>>> longer be able to read or write its filesystem, with of course fairly
>>>>>> disastrous consequences for such a guest.
>>>>>>
>>>>>> So before a migration, everything is running smoothly. The two cluster
>>>>>> nodes
>>>>>> are 'cl0' and 'cl1', and we do the migration like this:
>>>>>>
>>>>>> virsh migrate --live --persistent --undefinesource <guest>
>>>>>> qemu+tls://cl1/system
>>>>>>
>>>>>> The migration itself works, but as soon as you do the migration, the
>>>>>> /gluster/guest.raw file (which holds the filesystem for the guest)
>>>>>> becomes
>>>>>> completely inaccessible: trying to read it (e.g. with dd or md5sum)
>>>>>> results
>>>>>> in a 'permission denied' on the destination cluster node, whereas the
>>>>>> file
>>>>>> is still perfectly fine on the machine that the migration originated
>>>>>> from.
>>>>>>
>>>>>> As soon as I stop the guest (virsh destroy), the /gluster/guest.raw
>>>>>> file
>>>>>> becomes readable again and I can start up the guest on either server
>>>>>> without
>>>>>> further issues. It does not affect any of the other files in
>>>>>> /gluster/.
>>>>>>
>>>>>> The problem seems to be in the gluster or fuse part, because once this
>>>>>> error
>>>>>> condition is triggered, the /gluster/guest.raw cannot be read by any
>>>>>> application on the destination server. This situation is 100%
>>>>>> reproducible,
>>>>>> every attempted live migration fails in this way.
>>>>>>
>>>>>> Has anyone else experienced this? Is this a known or new bug?
>>>>>>
>>>>>> We've done some troubleshooting already in the irc channel (thanks to
>>>>>> everyone for their help) but haven't found the smoking gun yet. I
>>>>>> would
>>>>>> appreciate any help in debugging and resolving this.
>>>>>>
>>>>>> Regards, Paul Boven.
>>>>>> --
>>>>>> Paul Boven <boven at jive.nl> +31 (0)521-596547
>>>>>> Unix/Linux/Networking specialist
>>>>>> Joint Institute for VLBI in Europe - www.jive.nl
>>>>>> VLBI - It's a fringe science
>>>>>> _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users at gluster.org
>>>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>


