[Gluster-users] error: "Transport endpoint is not connected" and "Stale NFS file handle"

Tue Mar 2 16:19:56 UTC 2010

José,
I guess my question is why?
If these are local drives, you can use software RAID (which will be MUCH faster) to do RAID 0 or RAID 5.
If these are not local drives, what benefit do you think you are getting by adding an additional layer of replication?
As I see it, you will just kill your performance, since it seems to me that network filesystems run at about 10-25% of the speed of local filesystems.
Adding the 2nd layer would mean roughly 1-6% of the speed of the original (10% of 10% up to 25% of 25%).
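
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes each network layer keeps the same fraction of the speed of the layer below it and that the slowdowns simply multiply, which is a crude model rather than a measurement. The function name is made up for illustration.

```python
def stacked_speed(fraction_per_layer: float, layers: int = 2) -> float:
    """Fraction of local-disk speed left after stacking network layers.

    Assumes every layer independently keeps `fraction_per_layer` of the
    speed of whatever sits below it (a rough model, not a benchmark).
    """
    return fraction_per_layer ** layers


# The 10-25% guess from above, applied twice.
for fraction in (0.10, 0.25):
    print(f"{fraction:.0%} per layer -> {stacked_speed(fraction):.2%} after two layers")
```

Under that assumption, the two ends of the range come out at 1.00% and 6.25% of local speed.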

^C

José Manuel Canelas wrote:
> Hi,
> 
> Since no one replied to this, I'll reply to myself :)
> 
> I just realized I assumed that it is possible to replicate distributed
> volumes. Am I wrong?
> 
> In my setup below I was trying to make "Replicated Distributed
> Storage", the inverse of what is described in
> http://www.gluster.com/community/documentation/index.php/Distributed_Replicated_Storage.
> 
> Trying to draw a picture:
> 
>          replicated
> -------------|------------  <----> 3 replicas presented as one volume
> replica1 replica2 replica3
> ---|---------|-------|----  <----> 4 volumes, distributed, to make up
> 4vols    4vols    4vols            each of the 3 volumes to be replicated
> 
> Is this dumb or is there a better way?
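
The page linked above describes the opposite stacking: replicate first, then distribute on top of the replica sets. As a rough illustration only, here is a Python sketch that groups the 16 bricks that way and prints the matching volfile fragment. The volume names (rep1, rep2, ..., distribute) and the grouping rule are my own assumptions, and I have not tested this layout on a real cluster. Note that 16 bricks do not divide evenly by 3, so one brick is left over.

```python
NODES = ["node01", "node02", "node03", "node04"]
DISKS_PER_NODE = 4
REPLICAS = 3

# Order bricks disk by disk so that neighbours in the list sit on different nodes.
bricks = [f"{node}-{disk}" for disk in range(1, DISKS_PER_NODE + 1) for node in NODES]

# Cut the list into consecutive groups of REPLICAS bricks; leftovers stay unused.
usable = len(bricks) - len(bricks) % REPLICAS
replica_sets = [bricks[i:i + REPLICAS] for i in range(0, usable, REPLICAS)]
unused = bricks[usable:]

# Emit one cluster/replicate volume per group (hypothetical names rep1..repN),
# then a single cluster/distribute volume on top of them.
lines = []
for index, members in enumerate(replica_sets, start=1):
    lines += [
        f"volume rep{index}",
        "    type cluster/replicate",
        "    subvolumes " + " ".join(members),
        "end-volume",
        "",
    ]
lines += [
    "volume distribute",
    "    type cluster/distribute",
    "    subvolumes " + " ".join(f"rep{i}" for i in range(1, len(replica_sets) + 1)),
    "end-volume",
]
volfile = "\n".join(lines)

print(volfile)
print("unused bricks:", unused)
```

With 4 nodes and 4 disks each this gives 5 replica sets of 3 bricks, each set spread over 3 different nodes, and leaves node04-4 unused.
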
> 
> thanks,
> José Canelas
> 
> On 02/26/2010 03:55 PM, José Manuel Canelas wrote:
>> Hello, everyone.
>>
>> We're setting up GlusterFS for some testing and having some trouble with
>> the configuration.
>>
>> We have 4 nodes as clients and servers, 4 disks each. I'm trying to
>> set up 3 replicas across all those 16 disks, configured at the client
>> side, for high availability and optimal performance, in a way that makes
>> it easy to add new disks and nodes.
>>
>> The best way I could think of was to put disks together from different
>> nodes into 3 distributed volumes and then use each of those as a replica
>> of the top volume. I'd like your input on this too, so if you look at
>> the configuration and something looks wrong or dumb, it probably is, so
>> please let me know :)
>>
>> Now the server config looks like this:
>>
>> volume posix1
>>     type storage/posix
>>     option directory /srv/gdisk01
>> end-volume
>>
>> volume locks1
>>     type features/locks
>>     subvolumes posix1
>> end-volume
>>
>> volume brick1
>>     type performance/io-threads
>>     option thread-count 8
>>     subvolumes locks1
>> end-volume
>>
>> [3 more identical bricks and...]
>>
>> volume server-tcp
>>     type protocol/server
>>     option transport-type tcp
>>     option auth.addr.brick1.allow *
>>     option auth.addr.brick2.allow *
>>     option auth.addr.brick3.allow *
>>     option auth.addr.brick4.allow *
>>     option transport.socket.listen-port 6996
>>     option transport.socket.nodelay on
>>     subvolumes brick1 brick2 brick3 brick4
>> end-volume
>>
>>
>> The client config:
>>
>> volume node01-1
>>     type protocol/client
>>     option transport-type tcp
>>     option remote-host node01
>>     option transport.socket.nodelay on
>>     option transport.remote-port 6996
>>     option remote-subvolume brick1
>> end-volume
>>
>> [repeated for every brick, until node04-4]
>>
>> ### Our 3 replicas
>> volume repstore1
>>     type cluster/distribute
>>     subvolumes node01-1 node02-1 node03-1 node04-1 node04-4
>> end-volume
>>
>> volume repstore2
>>     type cluster/distribute
>>     subvolumes node01-2 node02-2 node03-2 node04-2 node02-2
>> end-volume
>>
>> volume repstore3
>>     type cluster/distribute
>>     subvolumes node01-3 node02-3 node03-3 node04-3 node03-3
>> end-volume
>>
>> volume replicate
>>     type cluster/replicate
>>     subvolumes repstore1 repstore2 repstore3
>> end-volume
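
One thing that might be worth checking in the client volfile quoted above: repstore2 and repstore3 each appear to list a brick twice. A small Python sketch along these lines could flag that automatically. The parser is a simplification written for illustration (it only looks at "volume", "subvolumes" and "end-volume" lines) and is not a GlusterFS tool. I can't say whether the duplicates are related to the errors reported further down.

```python
from collections import Counter

# The three distribute volumes exactly as quoted in the client config.
VOLFILE = """
volume repstore1
    type cluster/distribute
    subvolumes node01-1 node02-1 node03-1 node04-1 node04-4
end-volume

volume repstore2
    type cluster/distribute
    subvolumes node01-2 node02-2 node03-2 node04-2 node02-2
end-volume

volume repstore3
    type cluster/distribute
    subvolumes node01-3 node02-3 node03-3 node04-3 node03-3
end-volume
"""


def duplicate_subvolumes(volfile_text):
    """Map each volume name to the subvolume names it lists more than once."""
    duplicates = {}
    current = None
    for raw_line in volfile_text.splitlines():
        words = raw_line.split()
        if not words or words[0].startswith("#"):
            continue  # skip blank lines and comments
        if words[0] == "volume" and len(words) > 1:
            current = words[1]
        elif words[0] == "subvolumes" and current is not None:
            counts = Counter(words[1:])
            repeated = sorted(name for name, n in counts.items() if n > 1)
            if repeated:
                duplicates[current] = repeated
        elif words[0] == "end-volume":
            current = None
    return duplicates


print(duplicate_subvolumes(VOLFILE))
```

Run against the three volumes above, it reports node02-2 for repstore2 and node03-3 for repstore3, and nothing for repstore1.
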
>>
>> [and then the performance bits]
>>
>>
>> When starting the glusterfs server, everything looks fine. I then mount
>> the filesystem with
>>
>> node01:~# glusterfs --debug -f /etc/glusterfs/glusterfs.vol /srv/gluster-export
>>
>> and it does not complain and shows up as properly mounted. When I access
>> the content, though, it returns the error "Transport endpoint is not
>> connected". The log has a "Stale NFS file handle" warning. See below:
>>
>> [...]
>> [2010-02-26 14:56:01] D [dht-common.c:274:dht_revalidate_cbk] repstore3: mismatching layouts for /
>> [2010-02-26 14:56:01] W [fuse-bridge.c:722:fuse_attr_cbk] glusterfs-fuse: 9: LOOKUP() / => -1 (Stale NFS file handle)
>>
>>
>> node01:~# mount
>> /dev/cciss/c0d0p1 on / type ext3 (rw,errors=remount-ro)
>> tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
>> proc on /proc type proc (rw,noexec,nosuid,nodev)
>> sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
>> procbususb on /proc/bus/usb type usbfs (rw)
>> udev on /dev type tmpfs (rw,mode=0755)
>> tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
>> devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
>> fusectl on /sys/fs/fuse/connections type fusectl (rw)
>> /dev/cciss/c0d1 on /srv/gdisk01 type ext3 (rw,errors=remount-ro)
>> /dev/cciss/c0d2 on /srv/gdisk02 type ext3 (rw,errors=remount-ro)
>> /dev/cciss/c0d3 on /srv/gdisk03 type ext3 (rw,errors=remount-ro)
>> /dev/cciss/c0d4 on /srv/gdisk04 type ext3 (rw,errors=remount-ro)
>> /etc/glusterfs/glusterfs.vol on /srv/gluster-export type fuse.glusterfs (rw,allow_other,default_permissions,max_read=131072)
>> node01:~# ls /srv/gluster-export
>> ls: cannot access /srv/gluster-export: Transport endpoint is not connected
>> node01:~#
>>
>>
>> The complete debug log and configuration files are attached.
>>
>> Thank you in advance,
>> José Canelas
>>
>>