[Gluster-devel] Gluster Sharding and Geo-replication

Aravinda avishwan at redhat.com
Wed Sep 2 15:20:24 UTC 2015


On 09/02/2015 08:22 PM, Venky Shankar wrote:
> On Wed, Sep 2, 2015 at 12:42 PM, Aravinda <avishwan at redhat.com> wrote:
>> The Geo-replication and Sharding teams today discussed the approach
>> to make Geo-replication Sharding-aware. Details are below.
>>
>> Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur
>>
>> - Both Master and Slave Volumes should be Sharded Volumes with the same
>>    configuration.
>> - Changes related to Sharded files should also be recorded in the
>>    Changelog, just like for any regular file.
>> - Sharding should allow Geo-rep to list/read/write Sharding internal
>>    Xattrs if the Client PID is gsyncd (-1)
>> - Sharding should allow read/write of Sharded files (that is, files in
>>    the .shards directory) if the Client PID is GSYNCD
>> - Sharding should return the actual file instead of the aggregated
>>    content when the Main file is requested (Client PID GSYNCD)
>>
>> For example, a file f1 is created with GFID G1.
>>
>> When the file grows, it gets sharded into chunks (say 5 chunks).
>>
>>      f1   G1
>>      .shards/G1.1   G2
>>      .shards/G1.2   G3
>>      .shards/G1.3   G4
>>      .shards/G1.4   G5
>>
>> In the Changelog, this is recorded as 5 different files, as below:
>>
>>      CREATE G1 f1
>>      DATA G1
>>      META G1
>>      CREATE G2 PGS/G1.1
>>      DATA G2
>>      META G1
>>      CREATE G3 PGS/G1.2
>>      DATA G3
>>      META G1
>>      CREATE G4 PGS/G1.3
>>      DATA G4
>>      META G1
>>      CREATE G5 PGS/G1.4
>>      DATA G5
>>      META G1
>>
>> where PGS is the GFID of the .shards directory.
>>
>> Geo-rep will create these files independently in the Slave Volume and
>> sync the Xattrs of G1. The complete data can be read only when all the
>> chunks are synced to the Slave Volume. Data can be read partially if the
>> main/first file and some of the chunks are synced to the Slave.
> So, before replicating data to the slave, do all shards need to be created there?
No, each file will be synced independently. But to read the complete
file, all the shards must be present; otherwise only partial data is read.
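
To illustrate the point, here is a minimal sketch. It assumes the simplified record notation from the example in the quoted mail (CREATE/DATA/META lines, with PGS standing for the GFID of .shards), not the real on-disk Changelog format. The function names and the "unreadable"/"partial"/"complete" labels are hypothetical and exist only for this illustration; they are not part of gsyncd or the sharding translator.

```python
# Hypothetical sketch: nothing here is a Gluster or gsyncd API.
# Records use the simplified notation from the example in this thread.
EXAMPLE_CHANGELOG = """\
CREATE G1 f1
DATA G1
META G1
CREATE G2 PGS/G1.1
DATA G2
META G1
CREATE G3 PGS/G1.2
DATA G3
META G1
CREATE G4 PGS/G1.3
DATA G4
META G1
CREATE G5 PGS/G1.4
DATA G5
META G1
"""


def created_entries(changelog_text):
    """Map each created GFID to its entry path, e.g. {'G2': 'PGS/G1.1'}."""
    entries = {}
    for line in changelog_text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == "CREATE":
            entries[fields[1]] = fields[2]
    return entries


def read_state(main_gfid, entries, synced_gfids):
    """Classify what a reader on the Slave would see for one main file.

    Each file is synced independently, so any subset of GFIDs may be
    present on the Slave at a given moment.
    """
    if main_gfid not in synced_gfids:
        # Assumption: without the main/first file nothing can be read.
        return "unreadable"
    # Shards of the main file are the entries named PGS/<main GFID>.<n>.
    prefix = "PGS/" + main_gfid + "."
    shard_gfids = {g for g, path in entries.items() if path.startswith(prefix)}
    if shard_gfids <= set(synced_gfids):
        return "complete"
    return "partial"
```

For example, with only G1 and G2 synced, read_state("G1", created_entries(EXAMPLE_CHANGELOG), {"G1", "G2"}) gives "partial"; once G1 through G5 are all on the Slave it gives "complete".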
>
>> Please add if I missed anything. C & S Welcome.
>>
>> regards
>> Aravinda
>>
>> On 08/11/2015 04:36 PM, Aravinda wrote:
>>
>> Hi,
>>
>> We are considering different approaches to add support in Geo-replication
>> for Sharded Gluster Volumes [1]
>>
>> Approach 1: Geo-rep: Sync Full file
>>     - In the Changelog, record only the main file's details, in the same
>>       brick where it is created
>>     - Record as DATA in the Changelog whenever there is any addition/change
>>       to the sharded file
>>     - Geo-rep rsync will checksum the full file from the mount and sync it
>>       as a new file
>>     - Slave side sharding is managed by the Slave Volume
>>
>> Approach 2: Geo-rep: Sync sharded file separately
>>     - Geo-rep rsync will checksum the sharded files only
>>     - Geo-rep syncs each sharded file independently as a new file
>>     - [UNKNOWN] Sync internal xattrs (file size and block count) of the main
>>       sharded file to the Slave Volume to maintain the same state as in Master
>>     - Sharding translator to allow file creation under the .shards dir for
>>       gsyncd, that is, when the Parent GFID is that of the .shards directory
>>     - If sharded files are modified during a Geo-rep run, the Slave may end
>>       up with stale data
>>     - Files on the Slave Volume may not be readable until all sharded files
>>       are synced to the Slave (each brick in Master syncs files to the Slave
>>       independently)
>>
>> The first approach looks cleaner, but we have to analyze the rsync checksum
>> performance on big files (sharded in the backend, accessed as one big file
>> by rsync)
>>
>> Let us know your thoughts. Thanks
>>
>> Ref:
>> [1]
>> http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator
>>
>> --
>> regards
>> Aravinda
>>
>>
>>
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>

regards
Aravinda


