[Gluster-devel] Gluster Sharding and Geo-replication

Shyam srangana at redhat.com
Wed Sep 2 14:39:55 UTC 2015


On 09/02/2015 03:12 AM, Aravinda wrote:
> The Geo-replication and Sharding teams met today to discuss how to
> make Geo-replication sharding-aware. Details are below.
>
> Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur
>
> - Both Master and Slave Volumes should be Sharded Volumes with the same
>    configuration.

If I am not mistaken, geo-rep supports replicating to a non-gluster 
local FS at the slave end. Is this correct? If so, would this limitation 
not make that problematic?

When you state *same configuration*, I assume you mean the sharding 
configuration, not the volume graph, right?

> - Record changes to sharded files in the Changelog as well, just like
>    any regular file.
> - Sharding should allow Geo-rep to list/read/write Sharding-internal
>    xattrs if the client PID is gsyncd (-1).
> - Sharding should allow read/write of shard files (those in the .shards
>    directory) if the client PID is gsyncd.
> - Sharding should return the actual main file instead of the
>    aggregated content when the main file is requested (client PID
>    gsyncd).
>
> For example, a file f1 is created with GFID G1.
>
> As the file grows, it is sharded into chunks (say 5 chunks).
>
>      f1   G1
>      .shards/G1.1   G2
>      .shards/G1.2   G3
>      .shards/G1.3   G4
>      .shards/G1.4   G5
>
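To make the layout above concrete, here is a minimal Python sketch of the naming scheme as I read it from the example: block 0 stays in the main file, and block n (n >= 1) is stored as .shards/<main GFID>.<n>. The function name and the block_size parameter are assumptions for illustration only, not the shard translator's actual code.

```python
def shard_paths(main_gfid, file_size, block_size):
    """Return backend paths of the shards that live outside the main file.

    Block 0 is the main file itself; block n (n >= 1) is stored as
    .shards/<main_gfid>.<n>, matching the layout in the example.
    block_size is a hypothetical stand-in for the volume's shard size.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # Integer ceiling division; an empty file still occupies block 0.
    total_blocks = max(1, -(-file_size // block_size))
    return [".shards/{}.{}".format(main_gfid, n) for n in range(1, total_blocks)]


if __name__ == "__main__":
    # A file spanning 5 blocks yields the main file plus 4 shard files.
    for path in shard_paths("G1", file_size=20, block_size=4):
        print(path)
```

With 5 blocks this gives the main file plus G1.1 through G1.4, the same five files as in the listing above.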
> In the Changelog, this is recorded as 5 separate files, as below
>
>      CREATE G1 f1
>      DATA G1
>      META G1
>      CREATE G2 PGS/G1.1
>      DATA G2
>      META G1
>      CREATE G3 PGS/G1.2
>      DATA G3
>      META G1
>      CREATE G4 PGS/G1.3
>      DATA G4
>      META G1
>      CREATE G5 PGS/G1.4
>      DATA G5
>      META G1
>
> where PGS is the GFID of the .shards directory.
>
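The record sequence above follows a regular pattern, which can be sketched in Python as below. This mirrors the simplified CREATE/DATA/META notation used in this mail, not the on-disk Changelog format; the function name and the "PGS" default are hypothetical.

```python
def changelog_records(main_gfid, name, shard_gfids, shards_dir_gfid="PGS"):
    """Build the simplified record list for one sharded file.

    Each shard gets its own CREATE and DATA records, while every META
    record refers to the main file's GFID, as in the example above.
    """
    records = [
        "CREATE {} {}".format(main_gfid, name),
        "DATA {}".format(main_gfid),
        "META {}".format(main_gfid),
    ]
    for n, gfid in enumerate(shard_gfids, start=1):
        # The shard's parent is the .shards directory (PGS).
        records.append("CREATE {} {}/{}.{}".format(gfid, shards_dir_gfid, main_gfid, n))
        records.append("DATA {}".format(gfid))
        # Size and block-count xattrs live on the main file.
        records.append("META {}".format(main_gfid))
    return records


if __name__ == "__main__":
    print("\n".join(changelog_records("G1", "f1", ["G2", "G3", "G4", "G5"])))
```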
> Geo-rep will create these files independently in the Slave Volume and
> sync the xattrs of G1. The data can be read in full only when all the
> chunks are synced to the Slave Volume. It can be read partially if the
> main/first file and some of the chunks are synced to the Slave.
>
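The readability rule in the last paragraph can be expressed as a small Python check. This is only my reading of the statement above: the names are hypothetical, and treating "main file synced but no shards yet" as partially readable is an assumption the mail does not spell out.

```python
def slave_read_state(main_synced, synced_shards, total_shards):
    """Classify how much of a sharded file is readable on the Slave.

    synced_shards is the set of shard indices (1..total_shards) that
    have already been synced; the main file holds the first block.
    """
    if not main_synced:
        # Without the main file there is no entry point to the data.
        return "unreadable"
    expected = set(range(1, total_shards + 1))
    if expected <= set(synced_shards):
        return "full"
    # Assumption: the main file alone already counts as partial.
    return "partial"


if __name__ == "__main__":
    print(slave_read_state(True, {1, 2}, 4))
    print(slave_read_state(True, {1, 2, 3, 4}, 4))
    print(slave_read_state(False, {1, 2, 3, 4}, 4))
```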
> Please add anything I missed. Comments and suggestions welcome.
>
> regards
> Aravinda
>
> On 08/11/2015 04:36 PM, Aravinda wrote:
>> Hi,
>>
>> We are considering different approaches to add support in Geo-replication
>> for Sharded Gluster Volumes [1]
>>
>> *Approach 1: Geo-rep: Sync Full file*
>>    - In the Changelog, record only the main file's details, on the same
>> brick where it is created
>>    - Record DATA in the Changelog whenever the sharded file is added to
>> or changed
>>    - Geo-rep rsync will checksum the full file from the mount and
>> sync it as a new file
>>    - Slave-side sharding is managed by the Slave Volume
>> *Approach 2: Geo-rep: Sync sharded file separately*
>>    - Geo-rep rsync will checksum the shard files only
>>    - Geo-rep syncs each shard file independently as a new file
>>    - [UNKNOWN] Sync internal xattrs (file size and block count) of the
>> main sharded file to the Slave Volume to maintain the same state as in
>> Master.
>>    - Sharding translator to allow file creation under the .shards dir
>> for gsyncd, that is, when the parent GFID is the .shards directory
>>    - If shard files are modified during a Geo-rep run, the Slave may end
>> up with stale data.
>>    - Files on the Slave Volume may not be readable until all shard
>> files are synced to the Slave (each brick in the Master syncs files to
>> the Slave independently)
>>
>> The first approach looks cleaner, but we have to analyze rsync
>> checksum performance on big files (sharded in the backend, accessed as
>> one big file by rsync)
>>
>> Let us know your thoughts. Thanks
>>
>> Ref:
>> [1]
>> http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator
>> --
>> regards
>> Aravinda
>>
>>
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel