[Gluster-users] GlusterFS with OpenSolaris and ZFS
Phillip Steinbachs
pds at cgb.indiana.edu
Tue May 12 20:19:16 UTC 2009
Greetings,
We are beginning to experiment with GlusterFS and are having some problems
using it with OpenSolaris and ZFS. Our testing so far is limited to one
OpenSolaris system running glusterfsd 2.0.0 as a brick with a raidz2 ZFS
pool, and one 64-bit Ubuntu client using glusterfsclient 2.0.0 and the
FUSE patch.
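For context, a raidz2 pool and filesystem like ours would be created along
these lines (the device names below are placeholders, not our actual layout):
# zpool create data raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0
# zfs create -o mountpoint=/cfs data/cfs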
Here's the config on our brick. The ZFS filesystem (zfs list output):
NAME       USED  AVAIL  REFER  MOUNTPOINT
data/cfs  23.5M  24.8T  54.7K  /cfs
glusterfs-server.vol
---
volume brick
type storage/posix          # serve files straight off the local filesystem
option directory /cfs       # the ZFS mountpoint shown above
end-volume

volume server
type protocol/server
subvolumes brick
option transport-type tcp
end-volume
glusterfs-client.vol
---
volume client
type protocol/client
option transport-type tcp
option remote-host 192.168.0.5      # the OpenSolaris brick
option remote-subvolume brick       # must match the brick volume name above
end-volume
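For completeness, the volume can be mounted on the client either from a local
copy of the volfile or by fetching the volfile from the server (the -s form is
the one that appears later in this mail; the volfile path is just a placeholder):
# glusterfs -f /path/to/glusterfs-client.vol /cfs
# glusterfs -s 192.168.0.5 /cfs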
The first problem is basic read/write. On the Ubuntu client, we start
something like:
# iozone -i 0 -r 1m -s 64g -f /cfs/64gbtest
This will hang and die after a while, usually within 10-15 minutes, with a
"fsync: Transport endpoint is not connected" error. Smaller tests of 1g
usually complete OK, but nothing above that.
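To narrow down where it starts failing, a simple size sweep can be run from
the client (just a sketch; same iozone options as above, file names are
arbitrary, and the loop stops at the first failure):
# for s in 1g 2g 4g 8g 16g 32g 64g; do iozone -i 0 -r 1m -s $s -f /cfs/test_$s || break; done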
The second problem concerns accessing hidden ZFS snapshot directories.
Let's say we take a snapshot on the brick, and start a simple write
operation on the client like this:
brick:
# zfs snapshot data/cfs@test
client:
# touch file{1..10000}
With this running, an 'ls /cfs/.zfs' causes immediate
"Transport endpoint is not connected" or "Function not implemented"
errors. The log on the brick shows errors like:
2009-05-06 22:48:55 W [posix.c:1351:posix_create] brick: open on
/.zfs/file1: Operation not applicable
2009-05-06 22:48:55 E [posix.c:751:posix_mknod] brick: mknod on
/.zfs/file1: Operation not applicable
2009-05-06 22:48:55 E [posix.c:751:posix_mknod] brick: mknod on
/.zfs/file2: Operation not applicable
2009-05-06 22:48:55 E [posix.c:751:posix_mknod] brick: mknod on
/.zfs/file3: Operation not applicable
...
It seems that once something tries to access the .zfs directory, the
client or brick starts treating /cfs as if it were /.zfs (note the /.zfs
prefix on every path in the log above). Once this happens, unmounting and
remounting the filesystem on the client doesn't fix it:
# umount /cfs
# glusterfs -s 192.168.0.5 /cfs
# cd /cfs; touch test
touch: cannot touch `test': Function not implemented
The brick log again shows:
2009-05-12 15:16:49 W [posix.c:1351:posix_create] brick: open on
/.zfs/test: Operation not applicable
2009-05-12 15:16:49 E [posix.c:751:posix_mknod] brick: mknod on
/.zfs/test: Operation not applicable
In one instance, performing this test caused a glusterfsd segfault:
2009-05-12 14:00:21 E [dict.c:2299:dict_unserialize] dict: undersized
buffer passsed
pending frames:
frame : type(1) op(LOOKUP)
patchset: 7b2e459db65edd302aa12476bc73b3b7a17b1410
signal received: 11
configuration details:backtrace 1
db.h 1
dlfcn 1
fdatasync 1
libpthread 1
spinlock 1
st_atim.tv_nsec 1
package-string: glusterfs 2.0.0
/lib/libc.so.1'__sighndlr+0xf [0xfed4cd5f]
/lib/libc.so.1'call_user_handler+0x2af [0xfed400bf]
/lib/libc.so.1'strlen+0x30 [0xfecc3ef0]
/lib/libc.so.1'vfprintf+0xa7 [0xfed12b8f]
/local/lib/libglusterfs.so.0.0.0'_gf_log+0x148 [0xfee11698]
/local/lib/glusterfs/2.0.0/xlator/protocol/server.so.0.0.0'server_lookup+0x3c3
[0xfe3f25d3]
/local/lib/glusterfs/2.0.0/xlator/protocol/server.so.0.0.0'protocol_server_interpret+0xc5
[0xfe3e6705]
/local/lib/glusterfs/2.0.0/xlator/protocol/server.so.0.0.0'protocol_server_pollin+0x97
[0xfe3e69a7]
/local/lib/glusterfs/2.0.0/xlator/protocol/server.so.0.0.0'notify+0x7f
[0xfe3e6a2f]
/local/lib/glusterfs/2.0.0/transport/socket.so.0.0.0'socket_event_poll_in+0x3b
[0xfe28416b]
/local/lib/glusterfs/2.0.0/transport/socket.so.0.0.0'socket_event_handler+0xa3
[0xfe284583]
/local/lib/libglusterfs.so.0.0.0'0x26c41 [0xfee26c41]
/local/lib/libglusterfs.so.0.0.0'event_dispatch+0x21 [0xfee26761]
/local/sbin/glusterfsd'0x3883 [0x804b883]
/local/sbin/glusterfsd'0x1f30 [0x8049f30]
---------
The only way to recover from this is to restart glusterfsd. I'm guessing
this is to be expected, since the .zfs snapshot directory is a special
case that GlusterFS has no knowledge of. The concern for us right now is
that even with the .zfs directory hidden, someone can still accidentally
access it and make the whole filesystem unavailable.
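(For what it's worth, "hidden" here is the ZFS snapdir property, which is
hidden by default; that only keeps .zfs out of directory listings, so an
explicit path lookup such as the 'ls /cfs/.zfs' above still reaches it.)
brick:
# zfs get snapdir data/cfs
# zfs set snapdir=hidden data/cfs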
Finally, it appears that glusterfsd does asynchronous writes. Is it also
possible to force synchronous writes? We are experimenting with SSDs and
the ZFS intent log (ZIL) and would like to see whether it makes a
difference in performance.
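One way to compare would be something like this on the client (sketch only;
the file name and sizes are arbitrary, and it assumes the O_DSYNC/fsync
semantics actually make it through the client and server translators to ZFS,
which is really what we're asking):
# dd if=/dev/zero of=/cfs/ziltest bs=1M count=4096
# dd if=/dev/zero of=/cfs/ziltest bs=1M count=4096 oflag=dsync
iozone's -o (O_SYNC writes) and -e (include fsync in the timings) options
would allow the same kind of A/B test.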
Thanks.
-phillip