[Gluster-devel] problems running glusterfs 2.5 patch 800 and xen

Jordi Moles Blanco jordi at cdmon.com
Thu Dec 4 18:38:25 UTC 2008


Hi everyone,

I'm having trouble running Xen virtual machines on glusterfs 2.5, 
patch 800.

I've got two Xen servers, version 3.2, that store their machines on 
gluster. They run Debian lenny. I also have 3 nodes which 
provide the storage with glusterfs, also running lenny.

The thing is that when I ran "./configure --enable-kernel-module" for 
fuse 2.7.3glfs10 on the server side, I got this:

***********
warning: fuse module is already present on kernel, it won't compile
***********

So I ran it without that option:

*********
./configure
make
make install
**********

When compiling glusterfs--patch-800 I didn't get any error or warning 
message at all.

On the Xen side, I ran the proposed configure with "enable-fuse-client" 
and so on, and had no problems there either.

Anyway...

The nodes have these specs:

***************

volume esp
	type storage/posix
	option directory /glu0/data
end-volume

volume espai
	type performance/io-threads
	option thread-count 15
	option cache-size 512MB
	subvolumes esp
end-volume

volume nm
	type storage/posix
	option directory /glu0/ns
end-volume

volume ultim
    type protocol/server
    subvolumes espai nm
    option transport-type tcp/server
    option auth.ip.espai.allow *
    option auth.ip.nm.allow *
end-volume


***************

and the Xen servers have these specs:

***********

volume espai1
        type protocol/client
        option transport-type tcp/client
        option remote-host 10.0.0.3
        option remote-subvolume espai
end-volume

volume espai2
        type protocol/client
        option transport-type tcp/client
        option remote-host 10.0.0.4
        option remote-subvolume espai
end-volume

volume espai3
        type protocol/client
        option transport-type tcp/client
        option remote-host 10.0.0.5
        option remote-subvolume espai
end-volume

volume namespace1
        type protocol/client
        option transport-type tcp/client
        option remote-host 10.0.0.4
        option remote-subvolume nm
end-volume

volume namespace2
        type protocol/client
        option transport-type tcp/client
        option remote-host 10.0.0.5
        option remote-subvolume nm
end-volume

volume grup1
        type cluster/afr
        subvolumes espai1 espai3
end-volume

volume grup2
        type cluster/afr
        subvolumes espai2
end-volume

volume nm
        type cluster/afr
        subvolumes namespace1 namespace2
end-volume

volume g01
        type cluster/unify
        subvolumes grup1 grup2
        option scheduler rr
        option namespace nm
end-volume

volume io-cache 
        type performance/io-cache 
        option cache-size 512MB 
        option page-size 1MB
        option force-revalidate-timeout 2 
        subvolumes g01
end-volume  


***********
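
For reviewing a client spec like the one above, a small script can list each volume's subvolumes and flag cluster/afr volumes that have fewer than two of them, as grup2 does here. This is only a sketch: `parse_volfile` and `single_child_afr` are hypothetical helper names, not part of glusterfs, and the parser only understands the `volume`, `type`, `subvolumes` and `end-volume` keywords used in this post. Whether a single-subvolume afr has anything to do with the errors reported in this message is not known.

```python
def parse_volfile(text):
    """Parse a glusterfs spec file into {name: {"type": ..., "subvolumes": [...]}}.

    Only the keywords that appear in this post are handled; every other
    line (for example "option ...") is ignored.
    """
    volumes = {}
    current = None
    for raw in text.splitlines():
        words = raw.split()
        if not words or words[0].startswith("#"):
            continue
        if words[0] == "volume" and len(words) == 2:
            current = {"type": None, "subvolumes": []}
            volumes[words[1]] = current
        elif words[0] == "end-volume":
            current = None
        elif current is not None and words[0] == "type" and len(words) == 2:
            current["type"] = words[1]
        elif current is not None and words[0] == "subvolumes":
            current["subvolumes"] = words[1:]
    return volumes


def single_child_afr(volumes):
    """Return the names of cluster/afr volumes with fewer than two subvolumes."""
    return sorted(
        name
        for name, vol in volumes.items()
        if vol["type"] == "cluster/afr" and len(vol["subvolumes"]) < 2
    )


# Two volumes copied from the client spec in this post.
SAMPLE_SPEC = """
volume grup1
        type cluster/afr
        subvolumes espai1 espai3
end-volume

volume grup2
        type cluster/afr
        subvolumes espai2
end-volume
"""

if __name__ == "__main__":
    print(single_child_afr(parse_volfile(SAMPLE_SPEC)))
```

The same functions could be pointed at the full client file, for example by reading /etc/glusterfs/glusterfs-client.vol and passing its contents to `parse_volfile`.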

So... everything seems to work fine at first: the Xen servers are able to 
mount the glusterfs unit, but after a few seconds I keep getting this on 
the Xen side:


*********
2008-12-04 18:48:56 E [client-protocol.c:4579:client_checksum] espai2: 
/domains: returning EINVAL
2008-12-04 18:48:56 E [client-protocol.c:4579:client_checksum] espai2: 
/domains/xen-gluton02: returning EINVAL
*********

There's nothing more in the log about the problem, just that.

On the node side:

************
2008-12-04 19:48:50 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:48:51 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:48:53 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:48:56 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:49:01 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:49:09 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:49:22 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:49:43 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:50:17 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:51:00 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:51:01 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:51:12 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:51:12 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:51:15 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:52:41 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:55:05 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:55:59 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:56:00 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:56:10 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:56:14 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:58:58 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:00:59 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:00:59 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:01:03 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:01:06 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:05:15 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:05:59 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:06:00 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:06:09 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:06:13 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:10:59 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:11:00 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:11:08 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:11:12 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:15:25 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:15:59 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:16:00 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:16:05 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:16:12 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:20:59 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:21:00 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:21:07 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:21:12 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:26:00 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:26:00 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:26:07 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 20:26:13 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null

***********
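
A long run of identical errors like the one above is easier to read as a count per message. The following is a minimal sketch, assuming the log format shown in this post (timestamp, level, `[file:line:function]`, translator name, message) and assuming that each entry may be wrapped over two lines as it is here. `join_wrapped` and `summarize` are hypothetical helper names, not glusterfs tools, and separator lines such as the rows of asterisks should be stripped before feeding the log in.

```python
import re
from collections import Counter

# A new entry starts with a "YYYY-MM-DD HH:MM:SS " timestamp.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ")
# timestamp, level, [source location], translator name, message
ENTRY = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w) \[([^\]]+)\] (\S+): (.*)$"
)


def join_wrapped(lines):
    """Glue continuation lines back onto the entry they belong to."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def summarize(lines):
    """Count entries by (level, source location, translator, message)."""
    counts = Counter()
    for entry in join_wrapped(lines):
        match = ENTRY.match(entry)
        if match:
            _, level, source, xlator, message = match.groups()
            counts[(level, source, xlator, message)] += 1
    return counts


# Two entries copied from the node log in this post, wrapped as in the mail.
SAMPLE_LOG = """\
2008-12-04 19:48:50 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
2008-12-04 19:48:51 E [server-protocol.c:6050:server_protocol_interpret] 
ultim: bound_xl is null
"""

if __name__ == "__main__":
    for key, count in summarize(SAMPLE_LOG.splitlines()).items():
        print(count, *key)
```

Grouping by source location and translator name keeps the "bound_xl is null" entries separate from other errors that might share the same text.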

At first I didn't pay attention to that, because we can operate on the 
storage unit (we can, for example, move some GB into it). But when I try 
to run a virtual machine, it freezes after a few seconds, and the errors 
reported above appear more often than before. However, I don't have to 
run a machine to make them appear; they show up from the beginning.

Finally, this is how I mount gluster from Xen:

**********
glusterfs -l /var/log/glusterfs/glusterfs.log -L WARNING -d disable -f 
/etc/glusterfs/glusterfs-client.vol /mnt/glusterfs
**********

That is, with the "-d disable" option, which seems to be appropriate for 
my setup.

And this is the point where my virtual machine freezes:

***********
[    1.104884] blkfront: sda2: barriers enabled
[    1.189612] XENBUS: Device with no driver: device/console/0
[    1.189620] drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
[    1.189630] Freeing unused kernel memory: 216k freed
[    1.461468] thermal: Unknown symbol acpi_processor_set_thermal_limit
[    2.047300] md: raid1 personality registered for level 1
[    2.089518] md: md0 stopped.
[    2.091932] md: md1 stopped.
[    2.096341] md: md2 stopped.
[    2.244344] EXT3-fs: INFO: recovery required on readonly filesystem.
[    2.244358] EXT3-fs: write access will be enabled during recovery.
[    2.286384] kjournald starting.  Commit interval 5 seconds
[    2.286398] EXT3-fs: recovery complete.
[    2.287274] EXT3-fs: mounted filesystem with ordered data mode.
[    3.128883] Adding 524280k swap on /dev/sda1.  Priority:-1 extents:1 
across:524280k
[    3.208470] EXT3 FS on sda2, internal journal
[    3.641153] device-mapper: uevent: version 1.0.3
[    3.641208] device-mapper: ioctl: 4.13.0-ioctl (2007-10-18) 
initialised: dm-devel at redhat.com
[    4.756035] NET: Registered protocol family 10
[    4.756035] lo: Disabled Privacy Extensions
***********

So...

Any idea how to fix this?

I've read that other users have hit the "bound_xl is null" error, but 
there has been no fix so far.

Also, looking at the spec files: are they appropriate for Xen? 
Should I use any other option/translator to improve performance?

Thanks.






