[Gluster-devel] afr logic
Kevan Benson
kbenson at a-1networks.com
Wed Oct 17 16:38:25 UTC 2007
Alexey Filin wrote:
> Hi Kevan,
>
> consistency of afr'ed files is important question as of failures in
> backend fs too, afr is a medicine against node failures not backend fs
> ones (at least not directly), in the last case files can be changed
> "legally" in bypass glusterfs by fsck after a hw/sw failure and the
> changes have to be handled for corrupted replica, else reading of the
> same file can give different data (especially for forthcoming load
> balanced read of replicas). Fortunately rsync'ing of original must
> create consistent replica in the case too (if cluster/stripe under afr
> works equally with replicas), unfortunately extended attributes aren't
> rsync'ed (I tested it) what can be required during repairing.
>
> It seems glusterfs could try to handle hw/sw failures in backend fs
> with checksums in extended attributes and checksums are to be
> calculated for file chunks (because one checksum requires full
> recalculation after appending/changing of one byte to/in a gigabyte
> file) in the case glusterfs has to recalculate checksums of all files
> on corrupted fs (may be too long, it is the same case with
> rsync'ing) or get list of corrupted files from backend fs in some way
> (e.g. with a flag set by fsck in extended attributes). May be some
> kind of distributed raid is a better solution, first step in the
> direction was done already by cluster/stripe (unfortunately one of
> implementations, DDRaid http://sources.redhat.com/cluster/ddraid/ by
> Daniel Phillips seems to be suspended), perhaps it is too
> computational/network intensive and raid under backend fs is the best
> solution even taking into account disk space overhead.
>
> I'm very interested to hear thoughts about it from glusterfs
> developers to clear my misunderstanding.
The rsync case can probably be handled by a separate pass that finds the
relevant extended attributes on the source and sets them on the target.
A simple bash or Perl script could do this in a few lines.
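
As a rough illustration of that attribute-copy pass, here is a minimal
sketch in Python rather than bash/perl. It assumes Linux, that it runs as
root (trusted.* attributes are not visible otherwise), that rsync has
already created the same relative paths on the target, and that the
attributes of interest share the prefix "trusted.afr." (the prefix and the
function name copy_afr_xattrs are my assumptions, not anything glusterfs
defines).

```python
import os


def copy_afr_xattrs(src_root, dst_root, prefix="trusted.afr."):
    """Copy xattrs whose names start with `prefix` from every path under
    src_root to the same relative path under dst_root.

    Returns a list of (destination_path, attribute_name) pairs.
    The "trusted.afr." prefix is an assumption made for this sketch.
    """
    copied = []
    for dirpath, _dirs, files in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        # "." stands for the directory itself, so directories are covered too.
        for name in ["."] + files:
            src = os.path.normpath(os.path.join(dirpath, name))
            dst = os.path.normpath(os.path.join(dst_root, rel_dir, name))
            if not os.path.lexists(dst):
                continue  # nothing was synced to this path; skip it
            for attr in os.listxattr(src, follow_symlinks=False):
                if attr.startswith(prefix):
                    value = os.getxattr(src, attr, follow_symlinks=False)
                    os.setxattr(dst, attr, value, follow_symlinks=False)
                    copied.append((dst, attr))
    return copied
```

Something like copy_afr_xattrs("/export/a", "/export/b") would be run after
the rsync finishes; both paths here are placeholders.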
The fsck case is more interesting. If you could get fsck to report the
names of the files and directories that have problems without fixing
them, it would be easy to pipe that list to a script that removes the
trusted.afr.version attribute from those files, and AFR would then heal
them itself.
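
A minimal sketch of the attribute-removal script, again in Python. It
assumes the damaged paths arrive as a plain list, one per line, which is
hypothetical since I don't know of an fsck mode that emits exactly that.
It also assumes Linux and root privileges. The function name
clear_afr_version is made up for the example.

```python
import errno
import os

ATTR = "trusted.afr.version"


def clear_afr_version(paths, attr=ATTR):
    """Remove `attr` from each path so that AFR no longer trusts that copy.

    Returns (cleared, skipped): paths where the attribute was removed, and
    paths that were missing or did not carry the attribute.
    """
    cleared, skipped = [], []
    for path in paths:
        try:
            os.removexattr(path, attr)
            cleared.append(path)
        except OSError as exc:
            # ENODATA: attribute not set; ENOENT: file is gone.
            if exc.errno in (errno.ENODATA, errno.ENOENT):
                skipped.append(path)
            else:
                raise
    return cleared, skipped
```

Fed from a pipe, the call would look something like
clear_afr_version(line.rstrip("\n") for line in sys.stdin if line.strip()).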
Checksums would of course give you much better tracking of corrupted
files, but I imagine the CPU load and the loss of speed would make them
infeasible.
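
For reference, the per-chunk checksum idea from the quoted message could
look roughly like the sketch below. This is not anything glusterfs does;
the chunk size, the choice of SHA-1, and both function names are
assumptions picked for illustration. The point is only that a one-byte
change invalidates one chunk's checksum rather than the whole file's.

```python
import hashlib


def chunk_checksums(path, chunk_size=1 << 20):
    """Return one SHA-1 hex digest per chunk_size-byte chunk of the file."""
    sums = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            sums.append(hashlib.sha1(chunk).hexdigest())
    return sums


def changed_chunks(old, new):
    """Indexes of chunks whose checksums differ, including chunks that
    exist in only one of the two lists."""
    longest = max(len(old), len(new))
    return [i for i in range(longest)
            if i >= len(old) or i >= len(new) or old[i] != new[i]]
```

Every chunk still has to be read and hashed at least once, so the cost
concern above stands; chunking only limits how much is redone after a
small change.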
--
-Kevan Benson
-A-1 Networks