[Gluster-devel] AFR and Selfheal
Krishna Srinivas
krishna at zresearch.com
Sun Jul 1 12:57:48 UTC 2007
Hi All,
Here is a brief writeup on AFR.
An AFR volume is specified as:
volume afr-example
type cluster/afr
subvolumes client1 client2 client3 client4
option replicate *pdf:4,*txt:3,*png:2,*:1
option self-heal on # by default this is on
option debug off # by default this is off; should be turned on
                 # only during debugging, as it produces huge logs
end-volume
As can be inferred from "option replicate", pdf files are replicated
across all 4 children, text files across the first 3 children
(client1 client2 client3), png files across the first 2 children
(client1 client2), and all other files are put on the 1st child only,
i.e. client1. Note that AFR does not schedule the creation of files
when the number of replicas 'n' is less than the number of children;
it simply creates the file on the first 'n' children.
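
To illustrate the mapping above, here is a minimal Python sketch of how a
filename could be matched against the "option replicate" string. It is not
AFR's actual code: the function names (parse_replicate, targets_for) are
hypothetical, and it assumes shell-style globbing in which the first
matching pattern wins.

```python
from fnmatch import fnmatchcase


def parse_replicate(option):
    """Parse 'pattern:count,...' into an ordered list of (pattern, count)."""
    rules = []
    for entry in option.split(","):
        pattern, count = entry.rsplit(":", 1)
        rules.append((pattern.strip(), int(count)))
    return rules


def targets_for(filename, rules, children):
    """Return the children that should hold a copy of filename."""
    # Assumption: patterns are tried in order and the first match wins.
    for pattern, count in rules:
        if fnmatchcase(filename, pattern):
            # No scheduling: always the first 'count' children.
            return children[:count]
    return []


children = ["client1", "client2", "client3", "client4"]
rules = parse_replicate("*pdf:4,*txt:3,*png:2,*:1")
```

With these rules, targets_for("notes.txt", rules, children) gives the first
three children, and a file such as main.c falls through to "*" and lands on
client1 only.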
The self-heal feature was introduced in the 1.3.0 pre5 release.
Self-heal tracks file updates using extended attributes (xattrs) on
the files, so the backend filesystem must support xattrs.
Most filesystems (ext3, xfs, reiserfs) support extended attributes.
You can check whether your filesystem supports xattrs (and has them
enabled) by running the following commands on the backend.
You can set an xattr like this:
setfattr -n user.test -v 123 file.c
You can read it back like this:
getfattr -n user.test file.c
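
The same check can be scripted. The sketch below is only an illustration
(supports_user_xattr is a hypothetical helper name): it uses Python's
os.setxattr/os.getxattr, which are available on Linux only, to set and read
back user.test on a temporary file in the given backend directory. Note that
it probes the user.* namespace; the trusted.* attributes AFR uses
additionally need root privileges.

```python
import os
import tempfile


def supports_user_xattr(directory):
    """Return True if a user.* xattr can be set and read back in directory."""
    # os.setxattr only exists on Linux builds of Python.
    if not hasattr(os, "setxattr"):
        return False
    with tempfile.NamedTemporaryFile(dir=directory) as probe:
        try:
            os.setxattr(probe.name, "user.test", b"123")
            return os.getxattr(probe.name, "user.test") == b"123"
        except OSError:
            # Raised when the filesystem lacks xattr support
            # or has it disabled.
            return False
```

Call it with the path of the backend export directory, for example
supports_user_xattr("/path/to/export") (the path here is a placeholder).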
Each file will have two xattrs:
trusted.afr.createtime - time when the file was created (number of
seconds since the epoch)
trusted.afr.version - number of times the file has been edited (the
version is 1 when the file is created)
Consider the volume specification above: a txt file should have
3 copies. When open() is called on a file, say test.txt, AFR makes
sure that all 3 copies are present on the first 3 children. If they
are not, it picks the latest version, determined by checking
createtime and version, and updates the children wherever needed.
* During self-heal, the copy with the latest createtime and the latest
version is considered the latest version.
* Missing files will not be created on more than the first 3 children
(*txt:3).
* If a file is also found on the 4th child and is outdated, it will be
updated too (this can happen when the 1st child was down at the time
the file was created, in which case the file is created on the 2nd,
3rd and 4th children).
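
The rules above can be sketched roughly as follows. This is an illustration
only, not the AFR implementation: plan_selfheal and its arguments are
hypothetical names, children are numbered from 0, and it assumes that
"latest" means comparing createtime first and version second.

```python
def plan_selfheal(copies, replica_count):
    """Decide which child is the source and which children need updating.

    copies maps child index -> (createtime, version), or None if the
    file is missing on that child.
    Returns (source_index, sorted list of child indexes to update).
    """
    present = {i: cv for i, cv in copies.items() if cv is not None}
    if not present:
        return None, []
    # Assumption: tuple comparison, createtime first, then version.
    latest = max(present.values())
    source = min(i for i, cv in present.items() if cv == latest)
    to_update = []
    for i in sorted(copies):
        cv = copies[i]
        if cv is None:
            # Missing copies are only created on the first
            # 'replica_count' children.
            if i < replica_count:
                to_update.append(i)
        elif cv != latest:
            # Outdated copies are updated wherever they are found.
            to_update.append(i)
    return source, to_update
```

For the case described in the last bullet, with the file missing on the 1st
child and an outdated copy on the 4th,
plan_selfheal({0: None, 1: (100, 3), 2: (100, 3), 3: (100, 2)}, 3)
picks child 1 as the source and schedules children 0 and 3 for update.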
Missing directories are detected and created in lookup calls. A lookup
call is used by kernel/fuse to find the inode number of a
directory or file.