[Gluster-devel] Files erased while one brick was down in AFR returns from the afterlife

Daniel van Ham Colchete daniel.colchete at gmail.com
Wed Jul 4 14:52:49 UTC 2007


Avati,

that's very good to hear! Thank you...

I studied the function afr_selfheal_getxattr_cbk() from patch 268 a little
bit to see how the algorithm really works, and I don't think it's crappy
(there is a comment there saying so).

I thought about the 'ugly' part for the last hour, and for every problem I
could think of there is a solution, though it could be very complex.

If you erase a subdirectory while one of the bricks is down and then try to
open a file inside it, you would have to check the versions of two
directories before deciding whether to duplicate or erase the file. It would
have to be a recursive algorithm, which could be very costly.
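
A minimal sketch of that walk, under stated assumptions: each brick is
modelled as a plain dict mapping directory paths to a version counter, and
the brick with the higher version on the nearest directory both bricks share
is taken to hold the newer state. The function name decide() and the data
layout are hypothetical and are not taken from the AFR code.

```python
from pathlib import PurePosixPath


def decide(path, holder, other):
    """Decide what to do with a file present on `holder` but not on `other`.

    `holder` and `other` map directory paths to version counters
    (a toy model, not the real on-disk format).
    Returns (action, number_of_directories_checked).
    """
    checked = 0
    # Walk from the file's parent up towards the root.
    for ancestor in PurePosixPath(path).parents:
        directory = str(ancestor)
        checked += 1
        if directory in holder and directory in other:
            if other[directory] > holder[directory]:
                return "erase", checked   # the other brick saw a later delete
            if holder[directory] > other[directory]:
                return "copy", checked    # the holder saw a later create
            return "undecided", checked   # equal versions carry no information
    return "undecided", checked


# Brick B was down while /mail/old was removed on brick A.
brick_a = {"/": 1, "/mail": 2}
brick_b = {"/": 1, "/mail": 1, "/mail/old": 1}
print(decide("/mail/old/msg1", holder=brick_b, other=brick_a))
```

In this toy case two directories have to be inspected before the stale file
can be erased, and a deeper deleted tree would need more, which is where the
cost comes from.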

Creating a file doesn't open() the directory, so you would have to add
another call to change the version every time something changes inside a dir
(files, modes, permissions, ...).
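
As an illustration only, the extra call could look like the sketch below. It
uses an in-memory stand-in for a brick; the class VersionedNamespace and its
methods are hypothetical names, and a real implementation would have to
store the counter on the backend (for example in an extended attribute).

```python
import posixpath


class VersionedNamespace:
    """Toy namespace where every change bumps the parent directory's version."""

    def __init__(self):
        self.dir_version = {"/": 0}   # directory path -> version counter
        self.mode = {}                # file path -> permission bits

    def _bump_parent(self, path):
        # The extra call: runs on every operation that changes a directory.
        self.dir_version[posixpath.dirname(path)] += 1

    def mkdir(self, path):
        self.dir_version[path] = 0
        self._bump_parent(path)

    def create(self, path, mode=0o644):
        self.mode[path] = mode
        self._bump_parent(path)

    def chmod(self, path, mode):
        self.mode[path] = mode
        self._bump_parent(path)

    def unlink(self, path):
        del self.mode[path]
        self._bump_parent(path)


ns = VersionedNamespace()
ns.mkdir("/mail")
ns.create("/mail/msg1")
ns.chmod("/mail/msg1", 0o600)
ns.unlink("/mail/msg1")
print(ns.dir_version)
```

Only the immediate parent is bumped here, so a change deep in the tree does
not touch the versions of directories higher up.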

The self-heal feature would have to expand from 'open()' to other functions
as well because otherwise, unless you try to open the file that shouldn't
exist, you will still see it when listing the dir.
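
A rough sketch of what healing at listing time might do, assuming only one
brick was stale and that each brick reports a (version, names) pair for the
directory. The function heal_listing() is a hypothetical name, not an
existing GlusterFS call.

```python
def heal_listing(brick_a, brick_b):
    """Return the entries a directory listing should show.

    Each argument is a (version, set_of_names) pair for the same directory
    on one brick. Assumes at most one brick missed updates.
    """
    version_a, names_a = brick_a
    version_b, names_b = brick_b
    if version_a == version_b:
        # No way to tell which side is newer: hide nothing.
        return sorted(names_a | names_b)
    # The brick with the higher version is treated as authoritative, so
    # entries found only on the older brick are considered deleted.
    newer = names_a if version_a > version_b else names_b
    return sorted(newer)


# Brick B missed the deletion of "msg2" while it was down.
print(heal_listing((2, {"msg1"}), (1, {"msg1", "msg2"})))
```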

But this surely can be done and I trust it will get really good!

For now I'm thinking of AFRing before Gluster, at the device level, with
DRBD+Heartbeat. It's very messy, but I believe it will work.

Thanks for the news!

Best regards,
Daniel Colchete


On 7/4/07, Anand Avati <avati at zresearch.com> wrote:
>
> Daniel,
>   Nice suggestion! Versioning directories makes sense. Each subdirectory
> can be treated as a separate 'problem' and be solved nicely. A given
> directory's version will only imply existence or non-existence of its
> subdirectory, and not the contents of the subdirectory. I don't see why it
> would get messy. This will (if at all) get into the first release post
> 1.3-STABLE.
>
> thanks,
> avati
>
> 2007/7/3, Daniel van Ham Colchete <daniel.colchete at gmail.com>:
> >
> > Hi Krishna!
> >
> > On 7/2/07, Krishna Srinivas <krishna at zresearch.com> wrote:
> > >
> > > Hi Daniel,
> > >
> > > That case is not handled yet because of lack of time. Since it's
> > > not a risky thing to bring back a deleted file, it is in the TODO list
> > > and yet to be fixed.
> >
> >
> > I know I'm not doing anything to solve the problem, but this "not a
> > risky" unfortunately does not apply to me... One of the things I'll be
> > doing with GlusterFS is hosting e-mails. If I have to put one of the
> > bricks in maintenance (kernel/security update or something else) then
> > every e-mail downloaded or deleted by my users will reappear.
> >
> > But if this will delay the release of version 1.3 then I also think it's
> > in the best interest of the project to delay this fix...
> >
> > I'm a little bit worried because GlusterFS is, by far, the best option
> > for every storage application I need (email, a parallel database I'm
> > planning, web host, etc...), and soon (a few days) I'll be in
> > production...
> >
> > > Versioning the directory is an option. Other things we can do -
> > > journal the deletion or bring namespace awareness to AFR.
> >
> >
> > Thinking about my suggestion after sending the e-mail, I thought that
> > versioning the directory might not be the best option. When I started
> > thinking about directory/subdirectory deletion/recreation, things got
> > too ugly and the beautiful simplicity of GlusterFS was lost.
> >
> > IMO namespace awareness in the AFR might bring a "chicken and the egg"
> > problem when planning to have no single point of failure in the project.
> > Can we use AFR to mirror the AFR's namespace? If yes, wouldn't it bring
> > the same problem?
> >
> > Journaling seems nice, but who reads the journal? And when? Where would
> > it be written? For every answer to those questions I can think of
> > performance and other problems that would arise...
> >
> > This really requires a lot of thinking and also requires a lot more
> > understanding of everyone's storage needs than I have. I'm deeply sorry
> > I can't suggest a solution for you. Maybe I'm so tied to the problem
> > that I can't see the solution :).
> >
> > > Krishna
> >
> > When you start working on this, please let me know.
> >
> > Best regards,
> > Daniel
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel at nongnu.org
> > http://lists.nongnu.org/mailman/listinfo/gluster-devel
> >
>
>
>
> --
> Anand V. Avati


