[Gluster-devel] problems running glusterfs 2.5 patch 800 and xen
Jordi Moles Blanco
jordi at cdmon.com
Thu Dec 4 18:38:25 UTC 2008
Hi everyone,
I'm having trouble running Xen virtual machines on glusterfs 2.5,
patch 800.
I've got two Xen servers, version 3.2, that store their machines on
gluster. They run Debian lenny. I also have 3 nodes that provide the
storage with glusterfs, also running lenny.
The thing is that when I ran "./configure --enable-kernel-module" for
fuse 2.7.3glfs10 on the server side, I got this:
***********
warning: fuse module is already present on kernel, it won't compile
***********
So I ran it without that option:
*********
./configure
make
make install
**********
When compiling glusterfs--patch-800 I didn't get any errors or warnings
at all.
On the Xen side, I ran the proposed configure with "enable-fuse-client"
and so on, and I got no problems either.
Anyway...
The nodes have these specs:
***************
volume esp
type storage/posix
option directory /glu0/data
end-volume
volume espai
type performance/io-threads
option thread-count 15
option cache-size 512MB
subvolumes esp
end-volume
volume nm
type storage/posix
option directory /glu0/ns
end-volume
volume ultim
type protocol/server
subvolumes espai nm
option transport-type tcp/server
option auth.ip.espai.allow *
option auth.ip.nm.allow *
end-volume
***************
and the Xen servers have these specs:
***********
volume espai1
type protocol/client
option transport-type tcp/client
option remote-host 10.0.0.3
option remote-subvolume espai
end-volume
volume espai2
type protocol/client
option transport-type tcp/client
option remote-host 10.0.0.4
option remote-subvolume espai
end-volume
volume espai3
type protocol/client
option transport-type tcp/client
option remote-host 10.0.0.5
option remote-subvolume espai
end-volume
volume namespace1
type protocol/client
option transport-type tcp/client
option remote-host 10.0.0.4
option remote-subvolume nm
end-volume
volume namespace2
type protocol/client
option transport-type tcp/client
option remote-host 10.0.0.5
option remote-subvolume nm
end-volume
volume grup1
type cluster/afr
subvolumes espai1 espai3
end-volume
volume grup2
type cluster/afr
subvolumes espai2
end-volume
volume nm
type cluster/afr
subvolumes namespace1 namespace2
end-volume
volume g01
type cluster/unify
subvolumes grup1 grup2
option scheduler rr
option namespace nm
end-volume
volume io-cache
type performance/io-cache
option cache-size 512MB
option page-size 1MB
option force-revalidate-timeout 2
subvolumes g01
end-volume
***********
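As a quick sanity check on spec files like the two above, here is a minimal Python sketch that parses the "volume ... end-volume" blocks and lists things worth a second look. It is not part of glusterfs: the function names parse_volfile and check_volumes are made up for illustration, and it assumes the simple line-based layout used in this post (one keyword per line, "#" for comments).

```python
def parse_volfile(text):
    """Parse 'volume NAME ... end-volume' blocks into a dict keyed by name."""
    volumes = {}
    current = None
    for raw in text.splitlines():
        words = raw.split()
        if not words or words[0].startswith("#"):
            continue  # skip blank lines and comments (assumed '#' syntax)
        if words[0] == "volume" and len(words) == 2:
            current = {"type": None, "options": {}, "subvolumes": []}
            volumes[words[1]] = current
        elif words[0] == "end-volume":
            current = None
        elif current is None:
            continue  # ignore anything outside a volume block
        elif words[0] == "type" and len(words) == 2:
            current["type"] = words[1]
        elif words[0] == "option" and len(words) >= 3:
            current["options"][words[1]] = " ".join(words[2:])
        elif words[0] == "subvolumes":
            current["subvolumes"] = words[1:]
    return volumes


def check_volumes(volumes):
    """Return human-readable notes about entries worth a second look."""
    notes = []
    for name, vol in volumes.items():
        # Every subvolume should be defined somewhere in the same file.
        for sub in vol["subvolumes"]:
            if sub not in volumes:
                notes.append(f"{name}: subvolume '{sub}' is not defined in this file")
        # A replicate volume with fewer than two subvolumes has nothing to mirror to.
        if vol["type"] == "cluster/afr" and len(vol["subvolumes"]) < 2:
            notes.append(f"{name}: cluster/afr with {len(vol['subvolumes'])} subvolume(s)")
    return notes


# Excerpt of the client spec from this post, used as sample input.
SAMPLE = """\
volume espai2
type protocol/client
option transport-type tcp/client
option remote-host 10.0.0.4
option remote-subvolume espai
end-volume
volume grup2
type cluster/afr
subvolumes espai2
end-volume
"""

if __name__ == "__main__":
    for note in check_volumes(parse_volfile(SAMPLE)):
        print(note)
```

Run against the full client spec, this would point out that grup2 is a cluster/afr volume with a single subvolume (espai2). Whether that is intentional cannot be told from the post alone, and it may be unrelated to the errors reported here.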
So... everything seems to work fine at first. The Xen servers are able
to mount the glusterfs unit, but after a few seconds I keep getting this
on the Xen side:
*********
2008-12-04 18:48:56 E [client-protocol.c:4579:client_checksum] espai2:
/domains: returning EINVAL
2008-12-04 18:48:56 E [client-protocol.c:4579:client_checksum] espai2:
/domains/xen-gluton02: returning EINVAL
*********
There's nothing else in the log about the problem, just that.
On the node side:
************
2008-12-04 19:48:50 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:48:51 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:48:53 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:48:56 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:49:01 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:49:09 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:49:22 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:49:43 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:50:17 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:51:00 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:51:01 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:51:12 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:51:12 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:51:15 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:52:41 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:55:05 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:55:59 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:56:00 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:56:10 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:56:14 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:58:58 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:00:59 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:00:59 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:01:03 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:01:06 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:05:15 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:05:59 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:06:00 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:06:09 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:06:13 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:10:59 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:11:00 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:11:08 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:11:12 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:15:25 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:15:59 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:16:00 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:16:05 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:16:12 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:20:59 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:21:00 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:21:07 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:21:12 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:26:00 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:26:00 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:26:07 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 20:26:13 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
***********
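To see how often the error fires rather than scrolling through the raw log, a small Python sketch like the following could count entries per minute. The name errors_per_minute and the regular expression are assumptions based only on the log format shown above (timestamp, level letter, then [file:line:function]); the wrapped second line of each entry is simply ignored.

```python
import re
from collections import Counter

# Matches the first line of a log entry as it appears above, e.g.
# "2008-12-04 19:48:50 E [server-protocol.c:6050:server_protocol_interpret]"
ENTRY_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2} ([A-Z]) "
    r"\[([\w.-]+):(\d+):(\w+)\]"
)


def errors_per_minute(lines, level="E"):
    """Count entries of the given level per (minute, function) pair."""
    counts = Counter()
    for line in lines:
        match = ENTRY_RE.match(line.strip())
        if match and match.group(2) == level:
            minute, function = match.group(1), match.group(5)
            counts[(minute, function)] += 1
    return counts


# A few entries copied from the node log in this post.
SAMPLE = """\
2008-12-04 19:51:00 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:51:01 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
2008-12-04 19:52:41 E [server-protocol.c:6050:server_protocol_interpret]
ultim: bound_xl is null
"""

if __name__ == "__main__":
    counts = errors_per_minute(SAMPLE.splitlines())
    for (minute, function), n in sorted(counts.items()):
        print(minute, function, n)
```

Feeding it the whole log file (for example via open(path).read().splitlines()) gives a per-minute tally that makes it easier to compare the error rate before and after a virtual machine is started.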
At first I didn't pay attention to that because we can operate on the
storage unit; we can, for example, move some GB into it. But when I try
to run a virtual machine, it freezes after a few seconds and the error
I'm reporting appears more often than before.
However, I don't have to run a machine to make it appear; it appears
from the beginning.
Finally, when I mount gluster from Xen, I do it this way:
**********
glusterfs -l /var/log/glusterfs/glusterfs.log -L WARNING -d disable \
  -f /etc/glusterfs/glusterfs-client.vol /mnt/glusterfs
**********
That is, with the "-d disable" option, which seems to be appropriate for
my setup.
And this is the point where my virtual machine freezes:
***********
[ 1.104884] blkfront: sda2: barriers enabled
[ 1.189612] XENBUS: Device with no driver: device/console/0
[ 1.189620] drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
[ 1.189630] Freeing unused kernel memory: 216k freed
[ 1.461468] thermal: Unknown symbol acpi_processor_set_thermal_limit
[ 2.047300] md: raid1 personality registered for level 1
[ 2.089518] md: md0 stopped.
[ 2.091932] md: md1 stopped.
[ 2.096341] md: md2 stopped.
[ 2.244344] EXT3-fs: INFO: recovery required on readonly filesystem.
[ 2.244358] EXT3-fs: write access will be enabled during recovery.
[ 2.286384] kjournald starting. Commit interval 5 seconds
[ 2.286398] EXT3-fs: recovery complete.
[ 2.287274] EXT3-fs: mounted filesystem with ordered data mode.
[ 3.128883] Adding 524280k swap on /dev/sda1. Priority:-1 extents:1
across:524280k
[ 3.208470] EXT3 FS on sda2, internal journal
[ 3.641153] device-mapper: uevent: version 1.0.3
[ 3.641208] device-mapper: ioctl: 4.13.0-ioctl (2007-10-18)
initialised: dm-devel at redhat.com
[ 4.756035] NET: Registered protocol family 10
[ 4.756035] lo: Disabled Privacy Extensions
***********
So...
any idea how to fix this?
I've read that other users have hit the "bound_xl is null" error, but
there has been no fix so far.
Also, looking at the spec files: are they appropriate for Xen?
Should I use any other options or translators to improve performance?
Thanks.