[Gluster-devel] Questions

Anand Babu Periasamy ab at gnu.org.in
Fri Apr 6 01:10:22 UTC 2007

Gerry Reno writes:
> Hows do GlusterFS behave in the following scenarios:
> =================================
>   In a multi-brick cluster using AFR a node goes down and then later
> is brought back online
>     GlusterFS sees the node restart and then begins syncing it's
> bricks from transaction log, once it is synced it is put back into
> the cluster.
> =================================
This is what the self-heal functionality in 1.4 is supposed to do. Each
translator will contribute its piece of context-aware healing
functionality to the overall recovery process.

Self-heal will involve multiple techniques. The key ones are:
* journaled-recovery: It will maintain a journal of operations that
need to be performed on a failed brick. For example, dir-related
operations, all I/O operations for AFR ... (This is exactly what you
described above.)
* lazy-recovery: Certain errors will be extremely time-consuming to
detect. Instead of looking for them up front (when the brick is offline),
GlusterFS will resume normal operation immediately. If it finds any
fault at run-time, self-heal will heal on demand (say duplicate
files, a missing directory on a brick, ...). It is OK if a dir is missing
on one of the bricks, since it can be fixed at the time of access.
You can also initiate a forceful recovery just by triggering the
faults (say, "find /mnt/glusterfs -type f -exec file {} \;" will
navigate the entire dir tree and access each file. This should be
sufficient to convert many lazy checks to instant ones). A
glusterfs-fsck tool would then be just a matter of a shell script.
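
As a rough illustration of the "matter of a shell script" idea above,
here is a minimal sketch of such a walk. The function name heal_walk is
hypothetical and is not an existing GlusterFS tool. It assumes that
simply opening and reading from each file through the mount point is
enough to trigger the on-demand checks, and it swaps file(1) for
"head -c 1", which reads a single byte per file.

```shell
#!/bin/sh
# heal_walk: hypothetical helper, not part of GlusterFS.
# Walks a mounted tree and reads one byte from every regular file,
# so that any lazy (on-access) checks get a chance to run.
heal_walk() {
    mount_point=$1

    # Refuse to run on something that is not a directory.
    if [ ! -d "$mount_point" ]; then
        echo "heal_walk: not a directory: $mount_point" >&2
        return 1
    fi

    # find descends into every directory (touching each dir on the
    # way) and head opens each regular file and reads its first byte.
    find "$mount_point" -type f -exec head -c 1 {} \; > /dev/null
}
```

It would be invoked as "heal_walk /mnt/glusterfs". On a large tree the
walk takes time proportional to the number of files, so it is something
to run after a brick comes back rather than routinely.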

> =================================
>   Expand/Contract a GlusterFS cluster.
>     GlusterFS allows cluster members to be dynamically
> hot-added/hot-removed from a running cluster.
> =================================
As of now, adding bricks requires a restart of GlusterFS.

Hot-add/remove functionality is part of our roadmap. We are
introducing a server-notification framework in 1.4. With this feature,
implementing hot-add/remove will be a cakewalk.

Do you think this feature is important for 1.4? I want to have 1.4
released as soon as possible.

