[Gluster-users] booster unfs with cluster/distribute doesn't work...
Liam Slusser
lslusser at gmail.com
Thu Jul 23 10:19:02 UTC 2009
I've been playing with booster unfs and found that I cannot get it to work
with a gluster config that uses cluster/distribute. I am using Gluster 2.0.3.
[root@box01 /]# mount -t nfs store01:/intstore.booster -o wsize=65536,rsize=65536 /mnt/store
mount: Stale NFS file handle
(Trying it again; sometimes it does mount...)
[root@box01 /]# mount -t nfs store01:/store.booster -o wsize=65536,rsize=65536 /mnt/store
[root@box01 /]# ls /mnt/store
data
[root@box01 store]# cd /mnt/store/data
-bash: cd: /mnt/store/data/: Stale NFS file handle
[root@box01 /]# cd /mnt/store
[root@box01 store]# cd data
-bash: cd: data/: Stale NFS file handle
[root@box01 store]#
Sometimes I can get df to show the actual cluster, but most of the time it
gives me nothing:
[root@box01 /]# df -h
Filesystem Size Used Avail Use% Mounted on
<....>
store01:/store.booster
90T 49T 42T 54% /mnt/store
[root@box01 /]#
[root@box01 /]# df -h
Filesystem Size Used Avail Use% Mounted on
<...>
store01:/store.booster
- - - - /mnt/store
However, as soon as I remove cluster/distribute from my gluster client
configuration file, it works fine (missing 2/3 of the files, of course, since
my cluster distributes across three replicated volumes, each mirrored between
the two servers).
An strace of unfs during one of the cd commands above shows:
poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=21, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=22, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=23, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 2000) = 1 ([{fd=22, revents=POLLIN|POLLRDNORM}])
poll([{fd=22, events=POLLIN}], 1, 35000) = 1 ([{fd=22, revents=POLLIN}])
read(22, "\200\0\0\230B\307D\234\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\3\0\0\0\4\0\0\0\1"..., 4000) = 156
tgkill(4574, 4576, SIGRT_1) = 0
tgkill(4574, 4575, SIGRT_1) = 0
futex(0x7fff31c7cb20, FUTEX_WAIT_PRIVATE, 1, NULL) = 0
setresgid(-1, 0, -1) = 0
tgkill(4574, 4576, SIGRT_1) = 0
tgkill(4574, 4575, SIGRT_1) = 0
setresuid(-1, 0, -1) = 0
write(22, "\200\0\0 B\307D\234\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0F"..., 36) = 36
poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=21, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=22, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=23, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 2000) = 1 ([{fd=22, revents=POLLIN|POLLRDNORM}])
poll([{fd=22, events=POLLIN}], 1, 35000) = 1 ([{fd=22, revents=POLLIN}])
read(22, "\200\0\0\230C\307D\234\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\3\0\0\0\4\0\0\0\1"..., 4000) = 156
tgkill(4574, 4576, SIGRT_1) = 0
tgkill(4574, 4575, SIGRT_1) = 0
setresgid(-1, 0, -1) = 0
tgkill(4574, 4576, SIGRT_1) = 0
tgkill(4574, 4575, SIGRT_1) = 0
setresuid(-1, 0, -1) = 0
write(22, "\200\0\0 C\307D\234\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0F"..., 36) = 36
poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=21, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=22, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=23, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 2000 <unfinished ...>
With the booster.fstab debug level set to DEBUG, this is all that shows up in
the log file:
[2009-07-23 02:52:16] D [libglusterfsclient-dentry.c:381:libgf_client_path_lookup] libglusterfsclient: resolved path(/) to 1/1
[2009-07-23 02:52:17] D [libglusterfsclient.c:1340:libgf_vmp_search_entry] libglusterfsclient: VMP Entry found: /store.booster/: /store.booster/
my /etc/booster.conf:
/home/gluster/apps/glusterfs-2.0.3/etc/glusterfs/liam.conf /store.booster/ glusterfs subvolume=d,logfile=/home/gluster/apps/glusterfs-2.0.3/var/log/glusterfs/d.log,loglevel=DEBUG,attr_timeout=0
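For readability, the fields on that booster.conf line are, as I understand the format:

  <volume-spec-file>  <virtual-mount-point>  <fs-type>  <options>

so the VMP here is /store.booster/, the same path that shows up in the libglusterfsclient log above and that I export and mount over NFS.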
my /etc/exports:
/store.booster myclient(rw,no_root_squash)
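For reference, booster gets loaded into unfsd via LD_PRELOAD; the launch looks roughly like this (the library path is an assumption based on my install prefix above, so adjust for your build):

  # preload the booster client library so unfsd goes through libglusterfsclient
  LD_PRELOAD=/home/gluster/apps/glusterfs-2.0.3/lib/glusterfs/glusterfs-booster.so unfsd -e /etc/exports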
my client gluster config (liam.conf):
volume brick1a
type protocol/client
option transport-type tcp
option remote-host server1
option remote-subvolume brick1a
end-volume
volume brick1b
type protocol/client
option transport-type tcp
option remote-host server1
option remote-subvolume brick1b
end-volume
volume brick1c
type protocol/client
option transport-type tcp
option remote-host server1
option remote-subvolume brick1c
end-volume
volume brick2a
type protocol/client
option transport-type tcp
option remote-host server2
option remote-subvolume brick2a
end-volume
volume brick2b
type protocol/client
option transport-type tcp
option remote-host server2
option remote-subvolume brick2b
end-volume
volume brick2c
type protocol/client
option transport-type tcp
option remote-host server2
option remote-subvolume brick2c
end-volume
volume bricks1
type cluster/replicate
subvolumes brick1a brick2a
end-volume
volume bricks2
type cluster/replicate
subvolumes brick1b brick2b
end-volume
volume bricks3
type cluster/replicate
subvolumes brick1c brick2c
end-volume
volume distribute
type cluster/distribute
subvolumes bricks1 bricks2 bricks3
end-volume
volume readahead
type performance/read-ahead
option page-size 2MB # unit in bytes
option page-count 16 # cache per file = (page-count x page-size)
subvolumes distribute
end-volume
volume cache
type performance/io-cache
option cache-size 256MB
subvolumes readahead
end-volume
volume d
type performance/write-behind
option cache-size 16MB
option flush-behind on
subvolumes cache
end-volume
I've tried removing the performance translators with no change. Once I remove
distribute and connect to only one of the three bricks on a server, it works
perfectly.
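If it helps narrow things down, an equivalent quick test (without editing liam.conf at all) should be to point booster's subvolume option at a single protocol/client volume, i.e. the same booster.conf line with subvolume=brick1a instead of subvolume=d:

  /home/gluster/apps/glusterfs-2.0.3/etc/glusterfs/liam.conf /store.booster/ glusterfs subvolume=brick1a,logfile=/home/gluster/apps/glusterfs-2.0.3/var/log/glusterfs/d.log,loglevel=DEBUG,attr_timeout=0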
I also have a similar cluster that uses replicate but no distribute, and it
works fine.
Any ideas? Is this a bug?
thanks,
liam