[Gluster-users] booster unfs with cluster/distribute doesn't work...
Liam Slusser
lslusser at gmail.com
Thu Jul 23 11:25:59 UTC 2009
Thanks Shehjar. I'll give those a try.
liam
On Thu, Jul 23, 2009 at 4:03 AM, Shehjar Tikoo <shehjart at gluster.com> wrote:
> Liam Slusser wrote:
>
>> I've been playing with booster unfs and found that I cannot get it to work
>> with a gluster config that uses cluster/distribute. I am using Gluster
>> 2.0.3...
>>
>
> Thanks. I've seen the stale handle errors while using both
> replicate and distribute. The fixes are in the repo but
> not part of a release yet. Release 2.0.5 will contain those
> changes. In the meantime, if you're really interested, you can
> check out the repo with:
>
> $ git clone git://git.sv.gnu.org/gluster.git ./glusterfs
> $ cd glusterfs
> $ git checkout -b release2.0 origin/release-2.0
>
> Also, we've not yet announced it on the list but a customised version
> of unfs3 is available at:
>
> http://ftp.gluster.com/pub/gluster/glusterfs/misc/unfs3/0.5/unfs3-0.9.23booster0.5.tar.gz
>
> It has some bug fixes, performance enhancements and work-arounds
> to improve behaviour with booster.
>
> Some documentation is available at:
> http://www.gluster.org/docs/index.php/Unfs3boosterConfiguration
>
>
> Thanks
> Shehjar
>
>
>
>
>> [root@box01 /]# mount -t nfs store01:/intstore.booster -o wsize=65536,rsize=65536 /mnt/store
>> mount: Stale NFS file handle
>>
>> (just trying it again and sometimes it will mount...)
>>
>> [root@box01 /]# mount -t nfs store01:/store.booster -o wsize=65536,rsize=65536 /mnt/store
>> [root@box01 /]# ls /mnt/store
>> data
>> [root@box01 store]# cd /mnt/store/data
>> -bash: cd: /mnt/store/data/: Stale NFS file handle
>> [root@box01 /]# cd /mnt/store
>> [root@box01 store]# cd data
>> -bash: cd: data/: Stale NFS file handle
>> [root@box01 store]#
>>
>> Sometimes I can get df to show the actual cluster, but most of the time it
>> shows nothing.
>>
>> [root@box01 /]# df -h
>> Filesystem            Size  Used Avail Use% Mounted on
>> <....>
>> store01:/store.booster
>>                        90T   49T   42T  54% /mnt/store
>> [root@box01 /]#
>>
>> [root@box01 /]# df -h
>> Filesystem            Size  Used Avail Use% Mounted on
>> <...>
>> store01:/store.booster
>>                          -     -     -    - /mnt/store
>>
>>
>> However, as soon as I remove cluster/distribute from my gluster client
>> configuration file, it works fine (though 2/3 of the files are missing,
>> because my gluster cluster has a "distribute" of 3 volumes on each of the
>> two servers).
>>
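To illustrate why a single-brick mount shows only about a third of the files: cluster/distribute places each file on exactly one of its subvolumes, chosen by hashing the file name. The sketch below does not use GlusterFS's real hash or layout logic. It uses zlib.crc32 as a stand-in and made-up file names, only to show how files end up split across bricks1, bricks2 and bricks3.

```python
import zlib
from collections import Counter

# Subvolume names taken from liam.conf.
SUBVOLUMES = ["bricks1", "bricks2", "bricks3"]


def pick_subvolume(filename, subvolumes=SUBVOLUMES):
    """Map a file name to exactly one subvolume.

    Stand-in hash (CRC32 modulo the subvolume count), NOT the hash
    that cluster/distribute actually uses.
    """
    return subvolumes[zlib.crc32(filename.encode()) % len(subvolumes)]


# Hypothetical file names, for illustration only.
files = [f"file{i:05d}.dat" for i in range(9000)]
placement = Counter(pick_subvolume(name) for name in files)

for subvol in SUBVOLUMES:
    share = placement[subvol] / len(files)
    print(f"{subvol}: {placement[subvol]} files ({share:.0%})")
```

Each file lands on one subvolume only, so a client that talks to just one of the three sees only that subvolume's share of the names.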
>> A strace of unfs during one of the cd commands above outputs:
>>
>> poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=21, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=22, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=23, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 2000) = 1 ([{fd=22, revents=POLLIN|POLLRDNORM}])
>> poll([{fd=22, events=POLLIN}], 1, 35000) = 1 ([{fd=22, revents=POLLIN}])
>> read(22, "\200\0\0\230B\307D\234\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\3\0\0\0\4\0\0\0\1"..., 4000) = 156
>> tgkill(4574, 4576, SIGRT_1) = 0
>> tgkill(4574, 4575, SIGRT_1) = 0
>> futex(0x7fff31c7cb20, FUTEX_WAIT_PRIVATE, 1, NULL) = 0
>> setresgid(-1, 0, -1) = 0
>> tgkill(4574, 4576, SIGRT_1) = 0
>> tgkill(4574, 4575, SIGRT_1) = 0
>> setresuid(-1, 0, -1) = 0
>> write(22, "\200\0\0 B\307D\234\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0F"..., 36) = 36
>> poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=21, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=22, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=23, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 2000) = 1 ([{fd=22, revents=POLLIN|POLLRDNORM}])
>> poll([{fd=22, events=POLLIN}], 1, 35000) = 1 ([{fd=22, revents=POLLIN}])
>> read(22, "\200\0\0\230C\307D\234\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\3\0\0\0\4\0\0\0\1"..., 4000) = 156
>> tgkill(4574, 4576, SIGRT_1) = 0
>> tgkill(4574, 4575, SIGRT_1) = 0
>> setresgid(-1, 0, -1) = 0
>> tgkill(4574, 4576, SIGRT_1) = 0
>> tgkill(4574, 4575, SIGRT_1) = 0
>> setresuid(-1, 0, -1) = 0
>> write(22, "\200\0\0 C\307D\234\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0F"..., 36) = 36
>> poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=21, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=22, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=23, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 2000 <unfinished ...>
>>
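The byte strings in the strace output can be decoded by hand, since they are ONC RPC messages over TCP. The sketch below is a minimal decoder for the first read/write pair. It assumes the standard RPC record marking and header layout, and it only looks at the 32 bytes strace printed for each buffer. The function and field names are my own, and the reply decoder only handles the simple case seen here (accepted reply, empty verifier).

```python
import struct

# First read() on fd 22, octal escapes from strace rewritten as hex.
CALL = bytes.fromhex(
    "80000098"  # record mark: last-fragment bit set, 0x98 = 152 bytes follow
    "42c7449c"  # XID
    "00000000"  # message type (0 = CALL)
    "00000002"  # RPC version
    "000186a3"  # program number
    "00000003"  # program version
    "00000004"  # procedure number
    "00000001"  # credential flavor (1 = AUTH_UNIX)
)

# First write() on fd 22; the space in the strace string is byte 0x20.
REPLY = bytes.fromhex(
    "80000020"  # record mark: last fragment, 32 bytes follow
    "42c7449c"  # XID, same as the call
    "00000001"  # message type (1 = REPLY)
    "00000000"  # reply status (0 = MSG_ACCEPTED)
    "00000000"  # verifier flavor (0 = AUTH_NONE)
    "00000000"  # verifier length
    "00000000"  # accept status (0 = SUCCESS)
    "00000046"  # first word of the result: NFSv3 status
)

NFS_PROGRAM = 100003
NFS3_PROCEDURES = {4: "ACCESS"}       # only the one needed here
NFS3_STATUS = {0: "NFS3_OK", 70: "NFS3ERR_STALE"}


def decode_call(buf):
    """Decode the record mark and fixed RPC call header."""
    mark, xid, mtype, rpcvers, prog, vers, proc, cred = struct.unpack(">8I", buf[:32])
    return {
        "fragment_length": mark & 0x7FFFFFFF,
        "xid": xid,
        "message_type": mtype,
        "rpc_version": rpcvers,
        "program": prog,
        "version": vers,
        "procedure": proc,
        "cred_flavor": cred,
    }


def decode_reply(buf):
    """Decode an accepted reply with an empty verifier (the case shown here)."""
    mark, xid, mtype, reply_stat, _vflavor, vlen, accept_stat, status = struct.unpack(
        ">8I", buf[:32]
    )
    if mtype != 1 or reply_stat != 0 or vlen != 0 or accept_stat != 0:
        raise ValueError("not a plain accepted reply; layout assumption does not hold")
    return {"fragment_length": mark & 0x7FFFFFFF, "xid": xid, "nfs_status": status}


call = decode_call(CALL)
reply = decode_reply(REPLY)
proc_name = NFS3_PROCEDURES.get(call["procedure"], "?")
status_name = NFS3_STATUS.get(reply["nfs_status"], "?")
print(f"call:  program {call['program']} v{call['version']} proc {call['procedure']} ({proc_name})")
print(f"reply: xid match={call['xid'] == reply['xid']} status {reply['nfs_status']} ({status_name})")
```

Read this way, the client is sending NFSv3 ACCESS calls and unfs is answering with status 70, which is NFS3ERR_STALE. The fragment lengths also line up with the syscall return values (152 + 4 = 156 read, 32 + 4 = 36 written).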
>> With the log level in booster.fstab set to DEBUG, this is all that shows up
>> in the log file:
>>
>> [2009-07-23 02:52:16] D [libglusterfsclient-dentry.c:381:libgf_client_path_lookup] libglusterfsclient: resolved path(/) to 1/1
>> [2009-07-23 02:52:17] D [libglusterfsclient.c:1340:libgf_vmp_search_entry] libglusterfsclient: VMP Entry found: /store.booster/: /store.booster/
>>
>> my /etc/booster.conf
>>
>> /home/gluster/apps/glusterfs-2.0.3/etc/glusterfs/liam.conf /store.booster/ glusterfs subvolume=d,logfile=/home/gluster/apps/glusterfs-2.0.3/var/log/glusterfs/d.log,loglevel=DEBUG,attr_timeout=0
>>
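For anyone checking their own entry: the booster.conf line above reads as four whitespace-separated fields (volfile path, mount point, filesystem type, comma-separated key=value options). The sketch below splits it that way. The field layout is inferred only from the single line shown here, and the BoosterEntry and parse_booster_line names are hypothetical helpers, not part of booster.

```python
from typing import NamedTuple


class BoosterEntry(NamedTuple):
    volfile: str
    mount_point: str
    fs_type: str
    options: dict


def parse_booster_line(line):
    """Split one booster.conf entry into its four fields.

    Assumes exactly four whitespace-separated fields and that every
    option has the form key=value, as in the entry quoted above.
    """
    volfile, mount_point, fs_type, raw_options = line.split()
    options = dict(item.split("=", 1) for item in raw_options.split(","))
    return BoosterEntry(volfile, mount_point, fs_type, options)


LINE = (
    "/home/gluster/apps/glusterfs-2.0.3/etc/glusterfs/liam.conf /store.booster/ glusterfs "
    "subvolume=d,logfile=/home/gluster/apps/glusterfs-2.0.3/var/log/glusterfs/d.log,"
    "loglevel=DEBUG,attr_timeout=0"
)

entry = parse_booster_line(LINE)
print(entry.mount_point, entry.options["subvolume"], entry.options["loglevel"])
```

A quick check like this makes it easy to confirm that the subvolume option names the top volume of the volfile (d here) and that the mount point matches the path in /etc/exports.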
>> my /etc/exports
>>
>> /store.booster myclient(rw,no_root_squash)
>>
>> my client gluster config (liam.conf):
>>
>> volume brick1a
>> type protocol/client
>> option transport-type tcp
>> option remote-host server1
>> option remote-subvolume brick1a
>> end-volume
>>
>> volume brick1b
>> type protocol/client
>> option transport-type tcp
>> option remote-host server1
>> option remote-subvolume brick1b
>> end-volume
>>
>> volume brick1c
>> type protocol/client
>> option transport-type tcp
>> option remote-host server1
>> option remote-subvolume brick1c
>> end-volume
>>
>> volume brick2a
>> type protocol/client
>> option transport-type tcp
>> option remote-host server2
>> option remote-subvolume brick2a
>> end-volume
>>
>> volume brick2b
>> type protocol/client
>> option transport-type tcp
>> option remote-host server2
>> option remote-subvolume brick2b
>> end-volume
>>
>> volume brick2c
>> type protocol/client
>> option transport-type tcp
>> option remote-host server2
>> option remote-subvolume brick2c
>> end-volume
>>
>> volume bricks1
>> type cluster/replicate
>> subvolumes brick1a brick2a
>> end-volume
>>
>> volume bricks2
>> type cluster/replicate
>> subvolumes brick1b brick2b
>> end-volume
>>
>> volume bricks3
>> type cluster/replicate
>> subvolumes brick1c brick2c
>> end-volume
>>
>> volume distribute
>> type cluster/distribute
>> subvolumes bricks1 bricks2 bricks3
>> end-volume
>>
>> volume readahead
>> type performance/read-ahead
>> option page-size 2MB # unit in bytes
>> option page-count 16 # cache per file = (page-count x page-size)
>> subvolumes distribute
>> end-volume
>>
>> volume cache
>> type performance/io-cache
>> option cache-size 256MB
>> subvolumes readahead
>> end-volume
>>
>> volume d
>> type performance/write-behind
>> option cache-size 16MB
>> option flush-behind on
>> subvolumes cache
>> end-volume
>>
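To make the translator stack easier to see, here is a rough sketch that reads a volfile and prints the graph from the top volume (d) downwards. It is a simplified reader that only understands the volume, type, subvolumes and end-volume keywords used above; it is not GlusterFS's own parser. The embedded volfile is an abbreviated copy of liam.conf with the protocol/client bricks and option lines left out.

```python
# Abbreviated copy of liam.conf: protocol/client bricks and options omitted.
VOLFILE = """
volume bricks1
type cluster/replicate
subvolumes brick1a brick2a
end-volume

volume bricks2
type cluster/replicate
subvolumes brick1b brick2b
end-volume

volume bricks3
type cluster/replicate
subvolumes brick1c brick2c
end-volume

volume distribute
type cluster/distribute
subvolumes bricks1 bricks2 bricks3
end-volume

volume readahead
type performance/read-ahead
subvolumes distribute
end-volume

volume cache
type performance/io-cache
subvolumes readahead
end-volume

volume d
type performance/write-behind
subvolumes cache
end-volume
"""


def parse_volfile(text):
    """Return {volume name: {"type": ..., "subvolumes": [...]}}."""
    volumes = {}
    current = None
    for raw in text.splitlines():
        words = raw.split("#", 1)[0].split()  # drop trailing comments
        if not words:
            continue
        if words[0] == "volume":
            current = words[1]
            volumes[current] = {"type": None, "subvolumes": []}
        elif words[0] == "type" and current:
            volumes[current]["type"] = words[1]
        elif words[0] == "subvolumes" and current:
            volumes[current]["subvolumes"] = words[1:]
        elif words[0] == "end-volume":
            current = None
    return volumes


def describe(volumes, name, depth=0):
    """Return indented lines for the volume and everything beneath it."""
    info = volumes.get(name)
    vtype = info["type"] if info else "(not defined in this excerpt)"
    lines = ["  " * depth + f"{name}: {vtype}"]
    for sub in info["subvolumes"] if info else []:
        lines.extend(describe(volumes, sub, depth + 1))
    return lines


volumes = parse_volfile(VOLFILE)
print("\n".join(describe(volumes, "d")))
```

Printed this way, the layout is write-behind over io-cache over read-ahead over distribute, with distribute sitting on three replicate pairs.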
>> I've tried removing the performance translators with no change. Once I
>> remove distribute and connect to only one of the three bricks on a server,
>> it works perfectly.
>>
>> I do have a similar cluster that uses replicate but no distribute, and it
>> works fine.
>>
>> Ideas? Is this a bug?
>>
>> thanks,
>> liam
>>
>>
>>