[Gluster-users] Migrating a VM makes its gluster storage inaccessible
Paul Boven
boven at jive.nl
Tue Jan 21 16:12:03 UTC 2014
Hi Josh, everyone,
Glad you're trying to help, so no need to apologize at all.
mount output:
/dev/sdb1 on /export/brick0 type xfs (rw)
localhost:/gv0 on /gluster type fuse.glusterfs
(rw,default_permissions,allow_other,max_read=131072)
gluster volume info all:
Volume Name: gv0
Type: Replicate
Volume ID: ee77a036-50c7-4a41-b10d-cc0703769df9
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.88.4.0:/export/brick0/sdb1
Brick2: 10.88.4.1:/export/brick0/sdb1
Options Reconfigured:
diagnostics.client-log-level: INFO
diagnostics.brick-log-level: INFO
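If more detail would help, I can bump both log levels to DEBUG before the
next migration attempt and drop them back to INFO afterwards, e.g.
(assuming the usual volume-set syntax applies here):
gluster volume set gv0 diagnostics.client-log-level DEBUG
gluster volume set gv0 diagnostics.brick-log-level DEBUG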
Regards, Paul Boven.
On 01/21/2014 05:02 PM, Josh Boon wrote:
> Hey Paul,
>
> Definitely looks to be gluster. Sorry about the wrong guess on UID/GID. What's the output of "mount" and "gluster volume info all"?
>
> Best,
> Josh
>
>
> ----- Original Message -----
> From: "Paul Boven" <boven at jive.nl>
> To: gluster-users at gluster.org
> Sent: Tuesday, January 21, 2014 10:56:34 AM
> Subject: Re: [Gluster-users] Migrating a VM makes its gluster storage inaccessible
>
> Hi Josh,
>
> I've taken great care that /etc/passwd and /etc/group are the same on
> both machines. When the problem occurs, even root gets 'permission
> denied' when trying to read /gluster/guest.raw. So my first reaction was
> that it cannot be a uid problem.
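> (A quick way to double-check that, assuming the peer is reachable over
> ssh as cl1, would be something like:
> md5sum /etc/passwd /etc/group
> ssh cl1 md5sum /etc/passwd /etc/group
> and in our case the checksums do match on both nodes.)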
>
> In the normal situation, the storage for a running guest is owned by
> libvirt-qemu:kvm. When I shut a guest down (virsh destroy), the
> ownership changes to root:root on both cluster servers.
>
> During a migration (that fails), the ownership also ends up as root:root
> on both, which I hadn't noticed before. Filemode is 0644.
>
> On the originating server, root can still read /gluster/guest.raw,
> whereas on the destination, the same read gives me 'permission denied'.
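> (If it helps with debugging, I can also compare what each side reports
> for the file, both through the fuse mount and directly on the brick,
> e.g. on both nodes:
> stat /gluster/guest.raw
> getfattr -m . -d -e hex /export/brick0/sdb1/guest.raw
> the getfattr path being the brick copy rather than the fuse path.)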
>
> The qemu logfile for the guest doesn't show much interesting
> information, merely 'shutting down' on the originating server, and the
> startup on the destination server. Libvirt/qemu does not seem to be aware
> of the situation that the guest ends up in. I'll post the gluster logs
> somewhere, too.
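> I assume the useful ones are the fuse client log under
> /var/log/glusterfs/ (named after the mount point, so presumably
> gluster.log here) and the brick logs under /var/log/glusterfs/bricks/;
> something like
> grep -iE 'permission denied|EACCES' /var/log/glusterfs/gluster.log
> on the destination, around the time of the migration, should show which
> layer is refusing the read.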
>
> From the destination server:
>
> LC_ALL=C
> PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin
> /usr/bin/kvm -name kvmtest -S -M pc-i440fx-1.4 -m 1024 -smp
> 1,sockets=1,cores=1,threads=1 -uuid 97db2d3f-c8e4-31de-9f89-848356b20da5
> -nographic -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/kvmtest.monitor,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
> -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive
> file=/gluster/kvmtest.raw,if=none,id=drive-virtio-disk0,format=raw,cache=none
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> -netdev tap,fd=28,id=hostnet0,vhost=on,vhostfd=29 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:01:01:11,bus=pci.0,addr=0x3
> -chardev pty,id=charserial0 -device
> isa-serial,chardev=charserial0,id=serial0 -incoming tcp:0.0.0.0:49166
> -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
> W: kvm binary is deprecated, please use qemu-system-x86_64 instead
> char device redirected to /dev/pts/4 (label charserial0)
>
> Regards, Paul Boven.
>
>
>
>
>
>
> On 01/21/2014 04:22 PM, Josh Boon wrote:
>>
>> Paul,
>>
>> Sounds like a potential uid/gid problem. Would you be able to update with the logs from /var/log/libvirt/qemu/ for the guest, from both source and destination? Also the gluster logs for the volume would be awesome.
>>
>>
>> Best,
>> Josh
>>
>> ----- Original Message -----
>> From: "Paul Boven" <boven at jive.nl>
>> To: gluster-users at gluster.org
>> Sent: Tuesday, January 21, 2014 9:36:06 AM
>> Subject: Re: [Gluster-users] Migrating a VM makes its gluster storage inaccessible
>>
>> Hi James,
>>
>> Thanks for the quick reply.
>>
>> We are only using the fuse-mounted paths at the moment. So libvirt/qemu
>> simply knows of these files as /gluster/guest.raw, and qemu itself is not
>> using the native gluster client (libgfapi).
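>> (If I understand the alternative correctly, the gluster:// route you
>> mention would mean pointing qemu at something like
>> file=gluster://localhost/gv0/guest.raw instead of the fuse path, but we
>> have not tried that here.)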
>>
>> Some version numbers:
>>
>> Kernel: Ubuntu 3.8.0-35-generic (13.04, Raring)
>> Glusterfs: 3.4.1-ubuntu1~raring1
>> qemu: 1.4.0+dfsg-1expubuntu4
>> libvirt0: 1.0.2-0ubuntu11.13.04.4
>> The gluster bricks are on xfs.
>>
>> Regards, Paul Boven.
>>
>>
>> On 01/21/2014 03:25 PM, James wrote:
>>> Are you using the qemu gluster:// storage or are you using a fuse
>>> mounted file path?
>>>
>>> I would actually expect it to work with either, however I haven't had
>>> a chance to test this yet.
>>>
>>> It's probably also useful if you post your qemu versions...
>>>
>>> James
>>>
>>> On Tue, Jan 21, 2014 at 9:15 AM, Paul Boven <boven at jive.nl> wrote:
>>>> Hi everyone
>>>>
>>>> We've been running glusterfs-3.4.0 on Ubuntu 13.04, using semiosis'
>>>> packages. We're using kvm (libvirt) to host guest installs, and thanks to
>>>> gluster and libvirt, we can live-migrate guests between the two hosts.
>>>>
>>>> Recently I ran an apt-get update/upgrade to stay up-to-date with security
>>>> patches, and this also upgraded our glusterfs to the 3.4.1 version of the
>>>> packages.
>>>>
>>>> Since this upgrade (which updated the gluster packages, but also the Ubuntu
>>>> kernel package), kvm live migration fails in a most unusual manner. The live
>>>> migration itself succeeds, but on the receiving machine, the vm-storage for
>>>> that machine becomes inaccessible. Which in turn causes the guest OS to no
>>>> longer be able to read or write its filesystem, with of course fairly
>>>> disastrous consequences for such a guest.
>>>>
>>>> So before a migration, everything is running smoothly. The two cluster nodes
>>>> are 'cl0' and 'cl1', and we do the migration like this:
>>>>
>>>> virsh migrate --live --persistent --undefinesource <guest>
>>>> qemu+tls://cl1/system
>>>>
>>>> The migration itself works, but as soon as you do the migration, the
>>>> /gluster/guest.raw file (which holds the filesystem for the guest) becomes
>>>> completely inaccessible: trying to read it (e.g. with dd or md5sum) results
>>>> in a 'permission denied' on the destination cluster node, whereas the file
>>>> is still perfectly fine on the machine that the migration originated from.
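>>>> For example, on the destination node, as root:
>>>> md5sum /gluster/guest.raw
>>>> dd if=/gluster/guest.raw of=/dev/null bs=1M count=1
>>>> both fail with 'Permission denied', while the same commands on the
>>>> originating node succeed.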
>>>>
>>>> As soon as I stop the guest (virsh destroy), the /gluster/guest.raw file
>>>> becomes readable again and I can start up the guest on either server without
>>>> further issues. It does not affect any of the other files in /gluster/.
>>>>
>>>> The problem seems to be in the gluster or fuse part, because once this error
>>>> condition is triggered, the /gluster/guest.raw cannot be read by any
>>>> application on the destination server. This situation is 100% reproducible,
>>>> every attempted live migration fails in this way.
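>>>> (One way to narrow that down, I suppose, would be to read the affected
>>>> file straight from the brick on the destination node, bypassing the
>>>> fuse mount, e.g.:
>>>> md5sum /export/brick0/sdb1/guest.raw
>>>> If that succeeds while the fuse path still gives 'permission denied',
>>>> the problem sits in the client/fuse layer rather than on the brick.)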
>>>>
>>>> Has anyone else experienced this? Is this a known or new bug?
>>>>
>>>> We've done some troubleshooting already in the irc channel (thanks to
>>>> everyone for their help) but haven't found the smoking gun yet. I would
>>>> appreciate any help in debugging and resolving this.
>>>>
>>>> Regards, Paul Boven.
>>>> --
>>>> Paul Boven <boven at jive.nl> +31 (0)521-596547
>>>> Unix/Linux/Networking specialist
>>>> Joint Institute for VLBI in Europe - www.jive.nl
>>>> VLBI - It's a fringe science
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>
>>
>
>
--
Paul Boven <boven at jive.nl> +31 (0)521-596547
Unix/Linux/Networking specialist
Joint Institute for VLBI in Europe - www.jive.nl
VLBI - It's a fringe science