[Gluster-devel] Rebalance data migration and corruption

Joe Julian joe at julianfamily.org
Mon Feb 8 15:38:45 UTC 2016



On 02/08/2016 12:18 AM, Raghavendra Gowdappa wrote:
>
> ----- Original Message -----
>> From: "Joe Julian" <joe at julianfamily.org>
>> To: gluster-devel at gluster.org
>> Sent: Monday, February 8, 2016 12:20:27 PM
>> Subject: Re: [Gluster-devel] Rebalance data migration and corruption
>>
>> Is this in current release versions?
> Yes. This bug is present in currently released versions. However, it can happen only if the application is writing to a file while that file is being migrated, so roughly speaking the probability is low.

The probability is quite high when the volume is used for VM images, which 
many are.

>
>> On 02/07/2016 07:43 PM, Shyam wrote:
>>> On 02/06/2016 06:36 PM, Raghavendra Gowdappa wrote:
>>>>
>>>> ----- Original Message -----
>>>>> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>>>>> To: "Sakshi Bansal" <sabansal at redhat.com>, "Susant Palai"
>>>>> <spalai at redhat.com>
>>>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>, "Nithya
>>>>> Balachandran" <nbalacha at redhat.com>, "Shyamsundar
>>>>> Ranganathan" <srangana at redhat.com>
>>>>> Sent: Friday, February 5, 2016 4:32:40 PM
>>>>> Subject: Re: Rebalance data migration and corruption
>>>>>
>>>>> +gluster-devel
>>>>>
>>>>>> Hi Sakshi/Susant,
>>>>>>
>>>>>> - There is a data corruption issue in the migration code. The
>>>>>>   rebalance process,
>>>>>>     1. Reads data from src
>>>>>>     2. Writes (say w1) it to dst
>>>>>>
>>>>>>   However, 1 and 2 are not atomic, so another write (say w2) to the
>>>>>>   same region can happen between 1 and 2. These two writes can then
>>>>>>   reach dst in the order (w2, w1), resulting in a subtle data
>>>>>>   corruption. This issue is not fixed yet. The fix is simple and
>>>>>>   involves the rebalance process acquiring a mandatory lock to make
>>>>>>   1 and 2 atomic.
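>>>>>>
>>>>>>   A minimal sketch of the idea, modelled with plain POSIX byte-range
>>>>>>   locks purely for illustration (POSIX locks are advisory, and the
>>>>>>   real fix would use gluster's mandatory-lock fops; migrate_chunk()
>>>>>>   below is a made-up helper, not actual DHT code, and assumes src_fd
>>>>>>   is open O_RDWR):
>>>>>>
>>>>>>     #include <fcntl.h>
>>>>>>     #include <unistd.h>
>>>>>>
>>>>>>     static int
>>>>>>     migrate_chunk (int src_fd, int dst_fd, off_t off, size_t len,
>>>>>>                    char *buf)
>>>>>>     {
>>>>>>             struct flock lk = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
>>>>>>                                 .l_start = off, .l_len = (off_t)len };
>>>>>>             ssize_t nread;
>>>>>>
>>>>>>             /* block writers on this range while we copy it */
>>>>>>             if (fcntl (src_fd, F_SETLKW, &lk) < 0)
>>>>>>                     return -1;
>>>>>>
>>>>>>             /* 1. read data from src */
>>>>>>             nread = pread (src_fd, buf, len, off);
>>>>>>
>>>>>>             /* 2. write it to dst; holding the lock across 1 and 2
>>>>>>              *    keeps a concurrent write (w2) from landing between
>>>>>>              *    them and then being overwritten by stale data (w1) */
>>>>>>             if (nread > 0)
>>>>>>                     nread = pwrite (dst_fd, buf, (size_t)nread, off);
>>>>>>
>>>>>>             /* release the range only after both 1 and 2 are done */
>>>>>>             lk.l_type = F_UNLCK;
>>>>>>             fcntl (src_fd, F_SETLK, &lk);
>>>>>>
>>>>>>             return (nread < 0) ? -1 : 0;
>>>>>>     }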
>>>>> We can make use of the compound fop framework to make sure we don't
>>>>> suffer a significant performance hit. Following will be the sequence
>>>>> of operations done by the rebalance process:
>>>>>
>>>>> 1. issues a compound (mandatory lock, read) operation on src.
>>>>> 2. writes this data to dst.
>>>>> 3. issues an unlock of the lock acquired in 1.
>>>>>
>>>>> Please co-ordinate with Anuradha for the implementation of this
>>>>> compound fop.
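>>>>>
>>>>> A rough per-chunk sketch of that sequence (the three helpers below
>>>>> are invented stand-ins for the compound-fop and lock calls, not real
>>>>> gluster APIs; the point is that the per-chunk cost becomes three
>>>>> network operations instead of four):
>>>>>
>>>>>     #include <sys/types.h>
>>>>>
>>>>>     /* invented stand-ins, shown as prototypes to keep the sketch short */
>>>>>     static int compound_lock_and_read (int src, char *buf, size_t len,
>>>>>                                        off_t off);
>>>>>     static int write_dst (int dst, const char *buf, size_t len, off_t off);
>>>>>     static int unlock_src (int src, size_t len, off_t off);
>>>>>
>>>>>     static int
>>>>>     migrate_chunk_compound (int src, int dst, off_t off, size_t len,
>>>>>                             char *buf)
>>>>>     {
>>>>>             /* 1. one round trip: take the mandatory lock and read src */
>>>>>             if (compound_lock_and_read (src, buf, len, off) < 0)
>>>>>                     return -1;
>>>>>
>>>>>             /* 2. write the locked range to dst */
>>>>>             if (write_dst (dst, buf, len, off) < 0) {
>>>>>                     unlock_src (src, len, off);
>>>>>                     return -1;
>>>>>             }
>>>>>
>>>>>             /* 3. release the lock taken in 1 */
>>>>>             return unlock_src (src, len, off);
>>>>>     }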
>>>>>
>>>>> Following are the issues I see with this approach:
>>>>> 1. features/locks provides mandatory lock functionality only for
>>>>> posix locks (flock and fcntl based locks). So, mandatory locks will
>>>>> be posix locks, which will conflict with locks held by the
>>>>> application. So, if an application holds an fcntl/flock lock,
>>>>> migration cannot proceed.
>>>> We can implement a "special" domain for mandatory internal locks.
>>>> These locks will behave similarly to posix mandatory locks in that
>>>> conflicting fops (like write, read) are blocked/failed if they are
>>>> issued while such a lock is held.
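>>>>
>>>> Roughly what the conflict check in features/locks could look like for
>>>> such a domain (purely illustrative: the domain string, struct and
>>>> function below are made up, not existing locks-translator code):
>>>>
>>>>     #include <string.h>
>>>>
>>>>     /* made-up name for the reserved internal domain */
>>>>     #define REBALANCE_LOCK_DOMAIN "dht.file.migrate"
>>>>
>>>>     struct held_lock {
>>>>             long long   start;
>>>>             long long   len;
>>>>             const char *domain;   /* lock domain the lock was taken in */
>>>>     };
>>>>
>>>>     /* Checked for every incoming read/write on the brick.  Only locks
>>>>      * taken in the internal domain are treated as mandatory here;
>>>>      * application fcntl/flock locks live in other domains, so they do
>>>>      * not prevent the migration lock from being granted. */
>>>>     static int
>>>>     conflicts_with_migration (struct held_lock *locks, int n,
>>>>                               long long off, long long len)
>>>>     {
>>>>             int i;
>>>>
>>>>             for (i = 0; i < n; i++) {
>>>>                     if (strcmp (locks[i].domain, REBALANCE_LOCK_DOMAIN))
>>>>                             continue;
>>>>                     if (off < locks[i].start + locks[i].len &&
>>>>                         locks[i].start < off + len)
>>>>                             return 1;   /* block or fail the fop */
>>>>             }
>>>>             return 0;
>>>>     }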
>>>>
>>>>> 2. data migration will be less efficient because of an extra unlock
>>>>> (with compound lock + read) or an extra lock and unlock (for a
>>>>> non-compound-fop based implementation) for every read it does from
>>>>> src.
>>>> Can we use delegations here? The rebalance process can acquire a
>>>> mandatory write delegation (an exclusive lock that is recalled when a
>>>> write operation happens). In that case the rebalance process can do
>>>> something like:
>>>>
>>>> 1. Acquire a read delegation for the entire file.
>>>> 2. Migrate the entire file.
>>>> 3. Remove/unlock/give back the delegation it has acquired.
>>>>
>>>> If a recall is issued from the brick (when a write happens from a
>>>> mount), the rebalance process completes the current write to dst (or
>>>> throws away the data read from src) to maintain atomicity. Before
>>>> doing the next (read, src) and (write, dst) pair, it tries to
>>>> reacquire the delegation.
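>>>>
>>>> In pseudo-C that loop could look roughly like this (acquire_delegation,
>>>> delegation_recalled and return_delegation are invented names standing
>>>> in for whatever the delegation interface ends up being):
>>>>
>>>>     #include <unistd.h>
>>>>
>>>>     /* invented stand-ins for the delegation interface */
>>>>     static int  acquire_delegation (int src_fd);   /* blocks until granted */
>>>>     static int  delegation_recalled (int src_fd);  /* a client write arrived */
>>>>     static void return_delegation (int src_fd);
>>>>
>>>>     static int
>>>>     migrate_file (int src, int dst, long long size, long long chunk,
>>>>                   char *buf)
>>>>     {
>>>>             long long off = 0;
>>>>
>>>>             while (off < size) {
>>>>                     /* 1. hold a whole-file delegation while copying */
>>>>                     if (acquire_delegation (src) < 0)
>>>>                             return -1;
>>>>
>>>>                     /* 2. migrate until done, or until a write from a
>>>>                      *    mount recalls the delegation */
>>>>                     while (off < size && !delegation_recalled (src)) {
>>>>                             long long len = size - off < chunk ?
>>>>                                             size - off : chunk;
>>>>                             if (pread (src, buf, len, off) != len ||
>>>>                                 pwrite (dst, buf, len, off) != len) {
>>>>                                     return_delegation (src);
>>>>                                     return -1;
>>>>                             }
>>>>                             /* the in-flight (read, write) pair always
>>>>                              * completes before the delegation is given
>>>>                              * back, so the pair stays atomic */
>>>>                             off += len;
>>>>                     }
>>>>
>>>>                     /* 3. give the delegation back; the client write goes
>>>>                      *    through, and we reacquire before the next chunk */
>>>>                     return_delegation (src);
>>>>             }
>>>>             return 0;
>>>>     }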
>>> Delegations simplify the normal path, where a file is handled
>>> exclusively by rebalance. They also improve the case where a client
>>> and rebalance conflict on a file, by degrading to mandatory locks for
>>> both parties.
>>>
>>> I would prefer we take the delegation route for such needs in the future.
>>>
>>>> @Soumyak, can something like this be done with delegations?
>>>>
>>>> @Pranith,
>>>> AFR does transactions when writing to its subvols. Can you suggest any
>>>> optimizations here so that the rebalance process can have a transaction
>>>> for (read, src) and (write, dst) with minimal performance overhead?
>>>>
>>>> regards,
>>>> Raghavendra.
>>>>
>>>>> Comments?
>>>>>
>>>>>> regards,
>>>>>> Raghavendra.
>>> _______________________________________________
>>> Gluster-devel mailing list
>>> Gluster-devel at gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-devel


