[Gluster-users] GlusterFS v3.1.5 Stable Configuration

Mon Jul 18 18:59:12 UTC 2011

On Mon, Jul 18, 2011 at 10:53 AM, Remi Broemeling <remi at goclio.com> wrote:
> Hi,
>
> We've been using GlusterFS to manage shared files across a number of hosts
> in the past few months and have run into a few problems -- basically one
> every month, roughly.  The problems are occasionally extremely difficult to
> track down to GlusterFS, as they often masquerade as something else in the
> application log files that we have.  The problems have been one instance of
> split-brain and then a number of instances of "stuck" files (i.e. any stat
> calls would block for an hour and then time out with an error) as well as a
> couple instances of "ghost" files (remove the file, but GlusterFS continues
> to show it for a little while until the cache times out).
>
> We do not place a large amount of load on GlusterFS, and don't have any
> significant performance issues to deal with.  With that in mind, the core
> question of this e-mail is: "How can I modify our configuration to be the
> absolute most stable (problem free) that it can be, even if it means
> sacrificing performance?"  In sum, I don't have any particular performance

It depends on the kind of bugs or issues you are encountering. There might
be a solution for some bugs but not for others.

> concerns at this moment, but the GlusterFS bugs that we encounter are quite
> problematic -- so I'm willing to entertain any suggested stability
> improvement, even if it has a negative impact on performance (I suspect that
> the answer here is just "turn off all performance-enhancing gluster
> caching", but I wanted to validate that is actually true before going so
> far).  Thus please suggest anything that could be done to improve the
> stability of our setup -- as an aside, I think that this would be an
> advantageous thing to add to the FAQ.  Right now the FAQ contains
> information for performance tuning, but not for stability tuning.
>
> Thanks for any help that you can give/suggestions that you can make.
>
> Here are the details of our environment:
>
> OS: RHEL5
> GlusterFS Version: 3.1.5
> Mount method: glusterfsd/FUSE
> GlusterFS Servers: web01, web02
> GlusterFS Clients: web01, web02, dj01, dj02
>
> $ sudo gluster volume info
>
> Volume Name: shared-application-data
> Type: Replicate
> Status: Started
> Number of Bricks: 2
> Transport-type: tcp
> Bricks:
> Brick1: web01:/var/glusterfs/bricks/shared
> Brick2: web02:/var/glusterfs/bricks/shared
> Options Reconfigured:
> network.ping-timeout: 5
> nfs.disable: on
>
> Configuration File Contents:
> /etc/glusterd/vols/shared-application-data/shared-application-data-fuse.vol
> volume shared-application-data-client-0
>     type protocol/client
>     option remote-host web01
>     option remote-subvolume /var/glusterfs/bricks/shared
>     option transport-type tcp
>     option ping-timeout 5
> end-volume
>
> volume shared-application-data-client-1
>     type protocol/client
>     option remote-host web02
>     option remote-subvolume /var/glusterfs/bricks/shared
>     option transport-type tcp
>     option ping-timeout 5
> end-volume
>
> volume shared-application-data-replicate-0
>     type cluster/replicate
>     subvolumes shared-application-data-client-0 shared-application-data-client-1
> end-volume
>
> volume shared-application-data-write-behind
>     type performance/write-behind
>     subvolumes shared-application-data-replicate-0
> end-volume
>
> volume shared-application-data-read-ahead
>     type performance/read-ahead
>     subvolumes shared-application-data-write-behind
> end-volume
>
> volume shared-application-data-io-cache
>     type performance/io-cache
>     subvolumes shared-application-data-read-ahead
> end-volume
>
> volume shared-application-data-quick-read
>     type performance/quick-read
>     subvolumes shared-application-data-io-cache
> end-volume
>
> volume shared-application-data-stat-prefetch
>     type performance/stat-prefetch
>     subvolumes shared-application-data-quick-read
> end-volume
>
> volume shared-application-data
>     type debug/io-stats
>     subvolumes shared-application-data-stat-prefetch
> end-volume
>
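The client volfile quoted above stacks five performance translators
(write-behind, read-ahead, io-cache, quick-read, stat-prefetch) between
cluster/replicate and the top-level debug/io-stats volume. As a rough
illustration of what "turn off all performance-enhancing gluster caching"
would leave behind, here is a minimal Python sketch that parses an abbreviated
copy of that volfile (options omitted), lists the performance/* volumes, and
rewires the graph without them. The names parse_volfile,
performance_translators and strip_performance are made up for this example and
are not part of GlusterFS, and the sketch assumes each performance translator
has exactly one subvolume, as is the case here. It only models the graph and
does not touch a running volume.

```python
# Abbreviated copy of the quoted fuse volfile: only "type" and "subvolumes" kept.
P = "shared-application-data"
VOLFILE = f"""
volume {P}-client-0
    type protocol/client
end-volume
volume {P}-client-1
    type protocol/client
end-volume
volume {P}-replicate-0
    type cluster/replicate
    subvolumes {P}-client-0 {P}-client-1
end-volume
volume {P}-write-behind
    type performance/write-behind
    subvolumes {P}-replicate-0
end-volume
volume {P}-read-ahead
    type performance/read-ahead
    subvolumes {P}-write-behind
end-volume
volume {P}-io-cache
    type performance/io-cache
    subvolumes {P}-read-ahead
end-volume
volume {P}-quick-read
    type performance/quick-read
    subvolumes {P}-io-cache
end-volume
volume {P}-stat-prefetch
    type performance/stat-prefetch
    subvolumes {P}-quick-read
end-volume
volume {P}
    type debug/io-stats
    subvolumes {P}-stat-prefetch
end-volume
"""


def parse_volfile(text):
    """Return {volume name: {"type": str, "subvolumes": [names]}} in file order."""
    volumes, current = {}, None
    for raw in text.splitlines():
        words = raw.split()
        if not words:
            continue
        if words[0] == "volume":
            current = volumes.setdefault(words[1], {"type": None, "subvolumes": []})
        elif words[0] == "end-volume":
            current = None
        elif current is not None and words[0] == "type":
            current["type"] = words[1]
        elif current is not None and words[0] == "subvolumes":
            current["subvolumes"] = words[1:]
    return volumes


def is_perf(volume):
    return volume["type"].startswith("performance/")


def performance_translators(volumes):
    """Names of all performance/* volumes, in the order they appear in the file."""
    return [name for name, vol in volumes.items() if is_perf(vol)]


def strip_performance(volumes):
    """Return a new graph without performance/* volumes, with parents rewired."""
    def resolve(name):
        # Walk down through performance translators to the first other volume.
        # Assumes each performance translator has exactly one subvolume.
        while is_perf(volumes[name]):
            name = volumes[name]["subvolumes"][0]
        return name

    return {
        name: {"type": vol["type"],
               "subvolumes": [resolve(sub) for sub in vol["subvolumes"]]}
        for name, vol in volumes.items()
        if not is_perf(vol)
    }


volumes = parse_volfile(VOLFILE)
stripped = strip_performance(volumes)
print("performance translators:", performance_translators(volumes))
for name, vol in stripped.items():
    print(name, vol["type"], vol["subvolumes"])
```

In the stripped graph the debug/io-stats volume sits directly on top of
cluster/replicate, which is the shape a cache-free client stack would have
under the assumptions stated above.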
> /etc/glusterfs/glusterd.vol
> volume management
>     type mgmt/glusterd
>     option working-directory /etc/glusterd
>     option transport-type socket,rdma
>     option transport.socket.keepalive-time 10
>     option transport.socket.keepalive-interval 2
> end-volume
>
> --
> Remi Broemeling
> System Administrator
> Clio - Practice Management Simplified
> 1-888-858-2546 x(2^5) | remi at goclio.com
> www.goclio.com