[Gluster-devel] Rebalance data migration and corruption

Joe Julian joe at julianfamily.org
Mon Feb 8 06:50:27 UTC 2016


Is this in current release versions?

On 02/07/2016 07:43 PM, Shyam wrote:
> On 02/06/2016 06:36 PM, Raghavendra Gowdappa wrote:
>>
>>
>> ----- Original Message -----
>>> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>>> To: "Sakshi Bansal" <sabansal at redhat.com>, "Susant Palai" 
>>> <spalai at redhat.com>
>>> Cc: "Gluster Devel" <gluster-devel at gluster.org>, "Nithya 
>>> Balachandran" <nbalacha at redhat.com>, "Shyamsundar
>>> Ranganathan" <srangana at redhat.com>
>>> Sent: Friday, February 5, 2016 4:32:40 PM
>>> Subject: Re: Rebalance data migration and corruption
>>>
>>> +gluster-devel
>>>
>>>>
>>>> Hi Sakshi/Susant,
>>>>
>>>> - There is a data corruption issue in the migration code. The
>>>>   rebalance process:
>>>>    1. Reads data from src
>>>>    2. Writes (say w1) it to dst
>>>>
>>>>    However, 1 and 2 are not atomic, so another write (say w2) to the
>>>>    same region can happen between 1 and 2, and the two writes can
>>>>    reach dst in the order (w2, w1), resulting in subtle data
>>>>    corruption. This issue is not fixed yet. The fix is simple and
>>>>    involves the rebalance process acquiring a mandatory lock to make
>>>>    1 and 2 atomic.
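>>>>
>>>>    As a rough illustration of what making 1 and 2 atomic means, here
>>>>    is a minimal sketch using plain POSIX region locks. This is not
>>>>    the actual rebalance code (and fcntl locks are advisory unless
>>>>    mandatory locking is enabled); the function is made up:
>>>>
>>>> #include <sys/types.h>
>>>> #include <fcntl.h>
>>>> #include <unistd.h>
>>>>
>>>> /* Migrate one chunk of the file while holding an exclusive lock on
>>>>  * that region, so the read from src and the write to dst appear
>>>>  * atomic to any writer that honours the lock. */
>>>> static int
>>>> migrate_chunk (int src_fd, int dst_fd, off_t off, size_t len)
>>>> {
>>>>         char         buf[64 * 1024];
>>>>         struct flock lk = {0};
>>>>         ssize_t      rd;
>>>>
>>>>         if (len > sizeof (buf))
>>>>                 len = sizeof (buf);
>>>>
>>>>         lk.l_type   = F_WRLCK;                  /* exclusive lock   */
>>>>         lk.l_whence = SEEK_SET;
>>>>         lk.l_start  = off;
>>>>         lk.l_len    = len;
>>>>         if (fcntl (src_fd, F_SETLKW, &lk) < 0)  /* lock the region  */
>>>>                 return -1;
>>>>
>>>>         rd = pread (src_fd, buf, len, off);     /* 1. read from src */
>>>>         if (rd > 0)
>>>>                 rd = pwrite (dst_fd, buf, rd, off); /* 2. write dst */
>>>>
>>>>         lk.l_type = F_UNLCK;                    /* drop the lock    */
>>>>         fcntl (src_fd, F_SETLK, &lk);
>>>>
>>>>         return (rd < 0) ? -1 : 0;
>>>> }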
>>>
>>> We can make use of the compound fop framework to make sure we don't
>>> suffer a significant performance hit. The following will be the
>>> sequence of operations done by the rebalance process:
>>>
>>> 1. issues a compound (mandatory lock, read) operation on src.
>>> 2. writes this data to dst.
>>> 3. issues an unlock of the lock acquired in 1.
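>>>
>>> Roughly, per chunk (the helper names below are placeholders for the
>>> eventual compound-fop and unlock calls, not an existing API):
>>>
>>> #include <sys/types.h>
>>>
>>> /* Placeholder signatures, illustrative only */
>>> extern int lock_and_read (int src, off_t off, size_t len, char *buf);
>>> extern int write_to_dst  (int dst, off_t off, size_t len, char *buf);
>>> extern int unlock_src    (int src, off_t off, size_t len);
>>>
>>> static int
>>> migrate_chunk_compound (int src, int dst, off_t off, size_t len,
>>>                         char *buf)
>>> {
>>>         if (lock_and_read (src, off, len, buf) < 0) /* 1. lock+read  */
>>>                 return -1;
>>>         if (write_to_dst (dst, off, len, buf) < 0)  /* 2. write dst  */
>>>                 return -1;
>>>         return unlock_src (src, off, len);          /* 3. unlock     */
>>> }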
>>>
>>> Please co-ordinate with Anuradha for implementation of this compound 
>>> fop.
>>>
>>> Following are the issues I see with this approach:
>>> 1. features/locks provides mandatory-lock functionality only for
>>> posix locks (flock- and fcntl-based locks). So the mandatory locks
>>> will be posix locks, which will conflict with locks held by the
>>> application; if an application holds an fcntl/flock lock, migration
>>> cannot proceed.
>>
>> We can implement a "special" domain for mandatory internal locks.
>> These locks will behave similarly to posix mandatory locks in that
>> conflicting fops (like write, read) are blocked/failed if they are
>> issued while such a lock is held.
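>>
>> A toy model of the intent (this is not the features/locks data
>> structures; the domain string is made up):
>>
>> #include <string.h>
>> #include <sys/types.h>
>>
>> struct lk {
>>         const char *domain;   /* e.g. "dht.rebalance", made-up name */
>>         off_t       start;
>>         off_t       len;
>>         struct lk  *next;
>> };
>>
>> /* A client read/write on [off, off+len) is blocked or failed while a
>>  * lock in the internal domain covers an overlapping region. */
>> static int
>> conflicts_with_internal_lock (struct lk *held, off_t off, off_t len)
>> {
>>         for (; held; held = held->next) {
>>                 int overlaps = off < held->start + held->len &&
>>                                held->start < off + len;
>>                 if (overlaps && !strcmp (held->domain, "dht.rebalance"))
>>                         return 1;
>>         }
>>         return 0;
>> }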
>>
>>> 2. data migration will be less efficient because of an extra unlock
>>> (with a compound lock + read) or an extra lock and unlock (for a
>>> non-compound-fop-based implementation) for every read it does from
>>> src.
>>
>> Can we use delegations here? The rebalance process can acquire a
>> mandatory-write-delegation (an exclusive lock with the property that
>> the delegation is recalled when a write operation happens). In that
>> case the rebalance process can do something like:
>>
>> 1. Acquire a read delegation for the entire file.
>> 2. Migrate the entire file.
>> 3. Remove/unlock/give-back the delegation it has acquired.
>>
>> If a recall is issued from the brick (when a write happens from a
>> mount), it completes the current write to dst (or throws away the read
>> from src) to maintain atomicity. Before doing the next set of (read,
>> src) and (write, dst) operations, it tries to reacquire the lock.
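>>
>> Linux file leases behave roughly like this read delegation (the holder
>> gets SIGIO when another opener wants to write and must give the lease
>> back), so, purely as an illustration (not the Gluster lease/delegation
>> API), the loop could look like:
>>
>> #define _GNU_SOURCE
>> #include <fcntl.h>
>> #include <signal.h>
>> #include <unistd.h>
>>
>> static volatile sig_atomic_t recalled;
>>
>> static void
>> on_recall (int sig)
>> {
>>         (void) sig;
>>         recalled = 1;   /* a writer showed up; back off after chunk */
>> }
>>
>> static int
>> migrate_file (int src_fd, int dst_fd, off_t size)
>> {
>>         char  buf[64 * 1024];
>>         off_t off = 0;
>>
>>         signal (SIGIO, on_recall);  /* lease-break (recall) signal */
>>
>>         while (off < size) {
>>                 recalled = 0;
>>                 /* 1. acquire a read delegation (lease) on the file;
>>                  *    retry while a writer currently holds it open  */
>>                 while (fcntl (src_fd, F_SETLEASE, F_RDLCK) < 0)
>>                         usleep (100000);
>>
>>                 /* 2. migrate until done or the lease is recalled */
>>                 while (off < size && !recalled) {
>>                         ssize_t n = pread (src_fd, buf, sizeof (buf),
>>                                            off);
>>                         if (n <= 0)
>>                                 return -1;
>>                         if (pwrite (dst_fd, buf, n, off) != n)
>>                                 return -1; /* finish in-flight chunk */
>>                         off += n;
>>                 }
>>
>>                 /* 3. give the delegation back so the writer can go */
>>                 fcntl (src_fd, F_SETLEASE, F_UNLCK);
>>         }
>>         return 0;
>> }
>>
>> The point is that in the common case (no client writers) the inner
>> loop runs with no per-chunk locking at all.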
>
> With delegations this simplifies the normal path, where a file is 
> handled exclusively by rebalance. It also improves the case where a 
> client and rebalance conflict on a file, by degrading to mandatory 
> locks for either party.
>
> I would prefer we take the delegation route for such needs in the future.
>
>>
>> @Soumyak, can something like this be done with delegations?
>>
>> @Pranith,
>> AFR does transactions for writes to its subvols. Can you suggest any 
>> optimizations here so that the rebalance process can have a 
>> transaction for (read, src) and (write, dst) with minimal performance 
>> overhead?
>>
>> regards,
>> Raghavendra.
>>
>>>
>>> Comments?
>>>
>>>>
>>>> regards,
>>>> Raghavendra.
>>>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel


