[Gluster-devel] Rebalance data migration and corruption
Raghavendra Gowdappa
rgowdapp at redhat.com
Mon Feb 8 08:18:31 UTC 2016
----- Original Message -----
> From: "Joe Julian" <joe at julianfamily.org>
> To: gluster-devel at gluster.org
> Sent: Monday, February 8, 2016 12:20:27 PM
> Subject: Re: [Gluster-devel] Rebalance data migration and corruption
>
> Is this in current release versions?
Yes. This bug is present in currently released versions. However, it can occur only if an application is writing to a file while that file is being migrated, so the probability of hitting it is low.
>
> On 02/07/2016 07:43 PM, Shyam wrote:
> > On 02/06/2016 06:36 PM, Raghavendra Gowdappa wrote:
> >>
> >>
> >> ----- Original Message -----
> >>> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> >>> To: "Sakshi Bansal" <sabansal at redhat.com>, "Susant Palai"
> >>> <spalai at redhat.com>
> >>> Cc: "Gluster Devel" <gluster-devel at gluster.org>, "Nithya
> >>> Balachandran" <nbalacha at redhat.com>, "Shyamsundar
> >>> Ranganathan" <srangana at redhat.com>
> >>> Sent: Friday, February 5, 2016 4:32:40 PM
> >>> Subject: Re: Rebalance data migration and corruption
> >>>
> >>> +gluster-devel
> >>>
> >>>>
> >>>> Hi Sakshi/Susant,
> >>>>
> >>>> - There is a data corruption issue in the migration code. The
> >>>> rebalance process:
> >>>> 1. Reads data from src
> >>>> 2. Writes it (say w1) to dst
> >>>>
> >>>> However, 1 and 2 are not atomic, so another write (say w2) to the
> >>>> same region can happen between 1 and 2. These two writes can then
> >>>> reach dst in the order (w2, w1), so the stale data in w1 overwrites
> >>>> w2, resulting in a subtle corruption. This issue is not fixed yet.
> >>>> The fix is simple and involves the rebalance process acquiring a
> >>>> mandatory lock to make 1 and 2 atomic.
> >>>
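The ordering problem described above can be reproduced with a small in-memory model. This is only a sketch, not GlusterFS code: the function name and the byte-array "bricks" are hypothetical, and it assumes the application write is applied to both src and dst, which is what the thread implies when it says w2 reaches dst.

```python
def migrate_chunk_racy(src, dst, offset, length, concurrent_write=None):
    """Copy one region from src to dst with no locking (illustration only)."""
    # Step 1: rebalance reads the region from src.
    data = bytes(src[offset:offset + length])

    # An application write (w2) lands between step 1 and step 2.
    # Assumption of this model: the write is applied to both src and dst.
    if concurrent_write is not None:
        w_off, payload = concurrent_write
        src[w_off:w_off + len(payload)] = payload
        dst[w_off:w_off + len(payload)] = payload

    # Step 2: rebalance writes the data it read earlier (w1) to dst.
    # Because w1 arrives after w2, it overwrites the newer bytes.
    dst[offset:offset + length] = data


src = bytearray(b"AAAAAAAA")
dst = bytearray(8)
migrate_chunk_racy(src, dst, 0, 8, concurrent_write=(2, b"BB"))
```

After this run src holds the application's bytes while dst still holds the older data, which is the (w2, w1) ordering from the description.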
> >>> We can make use of the compound fop framework to make sure we don't
> >>> suffer a significant performance hit. The following will be the
> >>> sequence of operations done by the rebalance process:
> >>>
> >>> 1. issues a compound (mandatory lock, read) operation on src.
> >>> 2. writes this data to dst.
> >>> 3. issues unlock of lock acquired in 1.
> >>>
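The three steps above can be sketched with a toy model. Everything here is hypothetical (class and method names included): it uses a whole-file lock instead of a byte-range lock, treats the compound (mandatory lock, read) fop as a single method call, and assumes a conflicting application write is blocked and replayed on unlock rather than failed.

```python
class MigratingFile:
    """Toy model of a file under migration; all names are hypothetical."""

    def __init__(self, data):
        self.src = bytearray(data)
        self.dst = bytearray(len(data))
        self.locked = False
        self.blocked = []  # application writes waiting for the lock

    def app_write(self, offset, payload):
        # A mandatory lock blocks conflicting writes until it is released.
        if self.locked:
            self.blocked.append((offset, payload))
            return
        for target in (self.src, self.dst):
            target[offset:offset + len(payload)] = payload

    def lock_and_read(self, offset, length):
        # Stands in for the compound (mandatory lock, read) fop on src.
        self.locked = True
        return bytes(self.src[offset:offset + length])

    def unlock(self):
        # Release the lock, then let the blocked writes go through.
        self.locked = False
        pending, self.blocked = self.blocked, []
        for offset, payload in pending:
            self.app_write(offset, payload)

    def migrate_chunk(self, offset, length, during=None):
        data = self.lock_and_read(offset, length)   # step 1
        if during is not None:
            during()                                # racing application write
        self.dst[offset:offset + length] = data     # step 2
        self.unlock()                               # step 3


f = MigratingFile(b"AAAAAAAA")
f.migrate_chunk(0, 8, during=lambda: f.app_write(2, b"BB"))
```

In this model the racing write is held back until step 3, so it is applied after the migrated data and src and dst end up identical.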
> >>> Please co-ordinate with Anuradha for implementation of this compound
> >>> fop.
> >>>
> >>> Following are the issues I see with this approach:
> >>> 1. features/locks provides mandatory lock functionality only for
> >>> posix-locks (flock and fcntl based locks). So, mandatory locks will
> >>> be posix-locks, which will conflict with locks held by the
> >>> application. So, if an application holds an fcntl/flock lock,
> >>> migration cannot proceed.
> >>
> >> We can implement a "special" domain for mandatory internal locks.
> >> These locks will behave similar to posix mandatory locks in that
> >> conflicting fops (like write, read) are blocked/failed if they are
> >> done while a lock is held.
> >>
> >>> 2. data migration will be less efficient because of an extra unlock
> >>> (with compound lock + read) or an extra lock and unlock (for a
> >>> non-compound fop based implementation) for every read it does from src.
> >>
> >> Can we use delegations here? The rebalance process can acquire a
> >> mandatory-write-delegation (an exclusive lock with the added property
> >> that the delegation is recalled when a write operation happens). In
> >> that case the rebalance process can do something like:
> >>
> >> 1. Acquire a read delegation for entire file.
> >> 2. Migrate the entire file.
> >> 3. Remove/unlock/give-back the delegation it has acquired.
> >>
> >> If a recall is issued from the brick (when a write happens from a
> >> mount), the rebalance process completes the current write to dst (or
> >> throws away the read from src) to maintain atomicity. Before doing the
> >> next set of (read, src) and (write, dst), it tries to reacquire the lock.
> >
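The delegation flow proposed above might look roughly like the following toy model. It is a sketch under stated assumptions, not an existing GlusterFS or delegation API: all names are hypothetical, a recall is modelled as a held application write, and the rebalance side always finishes the current write to dst before giving the delegation back.

```python
class DelegatedFile:
    """Toy model of delegation-based migration; every name is hypothetical."""

    def __init__(self, data):
        self.src = bytearray(data)
        self.dst = bytearray(len(data))
        self.delegated = False
        self.recalls = 0
        self.pending = []  # application writes held until the delegation returns

    def acquire(self):
        self.delegated = True

    def release(self):
        self.delegated = False
        pending, self.pending = self.pending, []
        for offset, payload in pending:
            self.app_write(offset, payload)

    def app_write(self, offset, payload):
        if self.delegated:
            # The brick recalls the delegation and holds the write meanwhile.
            self.recalls += 1
            self.pending.append((offset, payload))
            return
        for target in (self.src, self.dst):
            target[offset:offset + len(payload)] = payload

    def migrate(self, chunk, writes_during=None):
        writes_during = writes_during or {}
        self.acquire()                              # 1. delegation for the file
        for offset in range(0, len(self.src), chunk):
            if not self.delegated:
                self.acquire()                      # reacquire after a recall
            data = bytes(self.src[offset:offset + chunk])
            for write in writes_during.get(offset, []):
                self.app_write(*write)              # write arrives mid-chunk
            self.dst[offset:offset + chunk] = data  # finish the current write
            if self.pending:
                self.release()                      # honour the recall
        if self.delegated:
            self.release()                          # 3. give the delegation back


f = DelegatedFile(b"AAAAAAAA")
f.migrate(4, writes_during={0: [(2, b"BB")]})
```

When no application write arrives, the model takes the delegation once and releases it once for the whole file, which is the cheaper normal path; the per-chunk reacquire happens only after a recall.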
> > With delegations this simplifies the normal path, when a file is
> > exclusively handled by rebalance. It also improves the case where a
> > client and rebalance are conflicting on a file, to degrade to
> > mandatory locks by either party.
> >
> > I would prefer we take the delegation route for such needs in the future.
> >
> >>
> >> @Soumyak, can something like this be done with delegations?
> >>
> >> @Pranith,
> >> Afr does transactions for writing to its subvols. Can you suggest any
> >> optimizations here so that rebalance process can have a transaction
> >> for (read, src) and (write, dst) with minimal performance overhead?
> >>
> >> regards,
> >> Raghavendra.
> >>
> >>>
> >>> Comments?
> >>>
> >>>>
> >>>> regards,
> >>>> Raghavendra.
> >>>
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel at gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
>
>