[Gluster-devel] glusterfs-1.3.8pre1

Sat Feb 23 02:53:42 UTC 2008

>
> *Q1: *
>
> I've installed glusterfs-1.3.8pre1 on one node of a cluster running
> 1.3.7, but glusterfsd 1.3.8 drops incoming connections from 1.3.7
> clients. Is this by design? Do I need to upgrade everything to
> 1.3.8pre1 at once?

Yes, both client and server should be of the same version.

>
> *Q2: *
>
> The reasons for trying to upgrade to 1.3.8 are the following:
>
> Current configuration:
>
> - 16 clients / 16 servers (one client/server on each machine)
> - servers are dual opteron, some of them quad core, 8 or 12 gb ram
> - kernel 2.6.24-2, linux gentoo (can provide gluster ebuilds)
> - fuse 2.7.2glfs8, glusterfs 1.3.7 - see config files - basically a simple
> unify with no ra/wt cache
>
> Configs are here: http://gluster.pastebin.com/m7f61927f
> All servers are stable and the problems below are in normal running
> conditions.
>
> Inside the gluster filesystem we store ~3 million pictures, in a
> directory tree that guarantees up to 1k pictures or subdirectories per
> directory, with ~30 writes per second and ~300 reads per second. Files
> are relatively small, 4-5k/picture.
>
>
> 1. glusterfs (client) appears to leak memory in our configuration - 300
> MB of RAM eaten over 2 days.
>
> 2. frequent files with size 0, ctime 0 (1970) even if all servers are up
> and running.
>
> 3. occasional files with correct size/ctime that cannot be read, and
> sometimes they can be read from other servers.

This is related to the log entries and to the files that cannot be opened.
The error occurs because a given filename exists on more than one storage
node.
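
To check whether this is what is happening, one option is to compare file
listings taken from each server's backend export directory. The following is
only a minimal sketch, not a GlusterFS tool: the function name, the node names
and the sample paths are hypothetical, and it assumes you have already
collected one listing per server (for example with `find . -type f` run inside
each export directory) containing regular files only.

```python
from collections import defaultdict


def find_duplicate_paths(listings):
    """Return {relative_path: [node, ...]} for paths found on more than one node.

    `listings` maps a node name (hypothetical, e.g. "server01") to an iterable
    of file paths relative to that node's export directory.
    """
    owners = defaultdict(set)
    for node, paths in listings.items():
        for path in paths:
            # Strip trailing newlines so lines read from a file compare equal.
            owners[path.strip()].add(node)
    # Keep only the paths that are present on two or more nodes.
    return {path: sorted(nodes) for path, nodes in owners.items() if len(nodes) > 1}


if __name__ == "__main__":
    # Made-up sample data standing in for real per-server listings.
    listings = {
        "server01": ["a/1.jpg", "a/2.jpg"],
        "server02": ["a/2.jpg", "b/3.jpg"],
    }
    for path, nodes in sorted(find_duplicate_paths(listings).items()):
        print(path, "->", ", ".join(nodes))
```

Any path reported here lives on more than one storage node and is a candidate
for the unreadable files described above. Which copy to keep is a decision to
make by hand.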

> 4. back when I was using AFR for mirrored namespace (which I gave up,
> trying to alleviate the
> other errors), crash in AFR in glusterfs (client) when one of the
> servers was shutting down.

Was this with 1.3.8pre1? A lot of fixes have gone in since 1.3.7; please let
us know if you are facing issues in a 1.3.8pre1-only environment.

>
> *Q3:*
>
> Nice-to-haves:
>
> 1. Redundant namespace => no single point of failure.

Isn't an AFR'ed namespace already doing that?

> 2. A way to see a diagram of the cluster, its connected nodes, etc. for
> anyone running more than 2-3 servers - pulling live data from one of the
> servers/clients.

We are working on a web UI for graphical viewing/editing of volume files in
1.3.9

avati