[Gluster-devel] Gluster Sharding and Geo-replication

Shyam srangana at redhat.com
Wed Sep 2 17:43:55 UTC 2015


On 09/02/2015 10:47 AM, Krutika Dhananjay wrote:
>
>
> ------------------------------------------------------------------------
>
>     *From: *"Shyam" <srangana at redhat.com>
>     *To: *"Aravinda" <avishwan at redhat.com>, "Gluster Devel"
>     <gluster-devel at gluster.org>
>     *Sent: *Wednesday, September 2, 2015 8:09:55 PM
>     *Subject: *Re: [Gluster-devel] Gluster Sharding and Geo-replication
>
>     On 09/02/2015 03:12 AM, Aravinda wrote:
>      > The Geo-replication and Sharding teams today discussed the approach
>      > to make Geo-replication Sharding-aware. Details are below.
>      >
>      > Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur
>      >
>      > - Both Master and Slave Volumes should be Sharded Volumes with the
>      >    same configuration.
>
>     If I am not mistaken, geo-rep supports replicating to a non-gluster
>     local FS at the slave end. Is this correct? If so, would this
>     limitation not make that problematic?
>
>     When you state *same configuration*, I assume you mean the sharding
>     configuration, not the volume graph, right?
>
> That is correct. The only requirement is for the slave to have the shard
> translator (since something needs to present the aggregated view of the
> file to readers on the slave).
> Also, the shard-block-size needs to be kept the same between master and
> slave. The rest of the configuration (like the number of subvols of
> DHT/AFR) can vary across master and slave.

Do we need the shard block size to be the same? I assume the file 
carries an xattr that records the size it was sharded with 
(trusted.glusterfs.shard.block-size), so if this xattr is synced 
across, that should suffice. If so, what it would mean is that "a 
sharded volume needs a shard-supporting slave to geo-rep to".
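
As a concrete (hypothetical) check, reading that xattr from a mount in 
Python could look like the sketch below; the helper name and the 
assumption that the value is stored as a 64-bit big-endian integer are 
mine, not gsyncd's actual code path:

    import os

    def shard_block_size(path):
        # Hypothetical helper: read the shard block size recorded on
        # the main file. trusted.* xattrs are visible only to
        # privileged clients, so this must run as root.
        raw = os.getxattr(path, "trusted.glusterfs.shard.block-size")
        # Assumption: the value is a 64-bit big-endian integer.
        return int.from_bytes(raw, "big")

    # e.g. shard_block_size("/mnt/master/f1") -> 4194304 for 4 MB shards

If the synced xattr alone decides the shard size on the slave, the 
block-size equality requirement between volumes would fall away.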

>
> -Krutika
>
>
>
>      > - In the Changelog, record changes related to Sharded files too,
>      >    just like any regular files.
>      > - Sharding should allow Geo-rep to list/read/write Sharding-internal
>      >    xattrs if the Client PID is gsyncd's (-1)
>      > - Sharding should allow read/write of Sharded files (that is, files
>      >    in the .shards directory) if the Client PID is GSYNCD
>      > - Sharding should return the actual file instead of the aggregated
>      >    content when the Main file is requested (Client PID GSYNCD); this
>      >    gating is sketched below
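
A sketch of that PID-based gating, in Python purely for illustration 
(the real check would live in the shard translator in C; the constant 
name below is an assumption based on the "-1" above):

    GF_CLIENT_PID_GSYNCD = -1  # assumed name for geo-rep's client PID

    def shard_allows(client_pid, path, xattr=None):
        # gsyncd (PID -1) gets raw access to .shards/ entries and
        # shard-internal xattrs; regular clients are blocked from both
        # and instead see the aggregated file.
        if client_pid == GF_CLIENT_PID_GSYNCD:
            return True
        if path.startswith(".shards/"):
            return False
        if xattr is not None and xattr.startswith("trusted.glusterfs.shard."):
            return False
        return True
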
>      >
>      > For example, a file f1 is created with GFID G1.
>      >
>      > When the file grows, it gets sharded into chunks (say, 5 chunks).
>      >
>      >      f1   G1
>      >      .shards/G1.1   G2
>      >      .shards/G1.2   G3
>      >      .shards/G1.3   G4
>      >      .shards/G1.4   G5
>      >
>      > In the Changelog, this is recorded as 5 different files, as below:
>      >
>      >      CREATE G1 f1
>      >      DATA G1
>      >      META G1
>      >      CREATE G2 PGS/G1.1
>      >      DATA G2
>      >      META G1
>      >      CREATE G3 PGS/G1.2
>      >      DATA G3
>      >      META G1
>      >      CREATE G4 PGS/G1.3
>      >      DATA G4
>      >      META G1
>      >      CREATE G5 PGS/G1.4
>      >      DATA G5
>      >      META G1
>      >
>      > Where PGS is the GFID of the .shards directory.
>      >
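Grouping the shard entries of such a record stream back under their 
main file follows from the <main-gfid>.<index> naming; a sketch over 
the simplified notation above (not the actual on-disk changelog 
encoding):

    from collections import defaultdict

    RECORDS = [
        "CREATE G1 f1", "DATA G1",
        "CREATE G2 PGS/G1.1", "DATA G2",
        "CREATE G3 PGS/G1.2", "DATA G3",
    ]

    def shards_by_main_gfid(records):
        # Collect shard CREATEs under the main file's GFID using the
        # .shards naming convention <main-gfid>.<index>.
        groups = defaultdict(list)
        for rec in records:
            parts = rec.split()
            if parts[0] == "CREATE" and parts[2].startswith("PGS/"):
                main, idx = parts[2][len("PGS/"):].rsplit(".", 1)
                groups[main].append((int(idx), parts[1]))
        return dict(groups)

    print(shards_by_main_gfid(RECORDS))  # {'G1': [(1, 'G2'), (2, 'G3')]}
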
>      > Geo-rep will create these files independently in the Slave Volume
>      > and sync the xattrs of G1. Data can be read fully only when all the
>      > chunks are synced to the Slave Volume; it can be read partially once
>      > the main/first file and some of the chunks have been synced.
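
The "fully readable" condition is simple arithmetic over the main 
file's synced size xattrs; a hypothetical check, assuming the logical 
size and block size have already been decoded:

    import math

    def all_chunks_synced(logical_size, block_size, shards_on_slave):
        # A file of logical_size bytes sharded at block_size spans
        # ceil(size / block_size) pieces; piece 0 is the main file,
        # the rest live under .shards/ as <gfid>.1 .. <gfid>.N.
        total = max(1, math.ceil(logical_size / block_size))
        return shards_on_slave >= total - 1

    # e.g. a 20 MB file with 4 MB shards: main file + 4 chunks in .shards
    print(all_chunks_synced(20 * 2**20, 4 * 2**20, 4))  # True
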
>      >
>      > Please add anything I missed. Comments & Suggestions welcome.
>      >
>      > regards
>      > Aravinda
>      >
>      > On 08/11/2015 04:36 PM, Aravinda wrote:
>      >> Hi,
>      >>
>      >> We are considering different approaches to add support in
>      >> Geo-replication for Sharded Gluster Volumes [1].
>      >>
>      >> *Approach 1: Geo-rep: Sync Full file*
>      >>    - In the Changelog, record only the main file's details, in the
>      >> same brick where it is created
>      >>    - Record a DATA entry in the Changelog on any addition/change to
>      >> the sharded file
>      >>    - Geo-rep rsync will checksum the full file from the mount and
>      >> sync it as a new file
>      >>    - Slave-side sharding is managed by the Slave Volume
>      >> *Approach 2: Geo-rep: Sync sharded files separately*
>      >>    - Geo-rep rsync will checksum the sharded files only
>      >>    - Geo-rep syncs each sharded file independently as a new file
>      >>    - [UNKNOWN] Sync the internal xattrs (file size and block count)
>      >> of the main sharded file to the Slave Volume to maintain the same
>      >> state as in Master
>      >>    - Sharding translator to allow file creation under the .shards
>      >> dir for gsyncd, that is, where the Parent GFID is the .shards
>      >> directory
>      >>    - If sharded files are modified during a Geo-rep run, the Slave
>      >> may end up with stale data
>      >>    - Files on the Slave Volume may not be readable until all sharded
>      >> files are synced to the Slave (each brick in Master independently
>      >> syncs files to the slave)
>      >>
>      >> The first approach looks cleaner, but we have to analyze the rsync
>      >> checksum performance on big files (sharded in the backend, accessed
>      >> as one big file by rsync); a rough sketch of such a measurement
>      >> follows below.
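
One rough way to get a feel for that cost (hypothetical paths; assumes 
rsync is installed and the slave is reachable over ssh):

    import subprocess, time

    # Time a checksum-based sync of one large aggregated file, the
    # pattern Approach 1 would exercise via the mount.
    src = "/mnt/master-vol/bigfile"
    dst = "slavehost:/mnt/slave-vol/bigfile"

    t0 = time.monotonic()
    subprocess.run(["rsync", "--checksum", "--inplace", src, dst],
                   check=True)
    print("checksum sync took %.1fs" % (time.monotonic() - t0))
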
>      >>
>      >> Let us know your thoughts. Thanks
>      >>
>      >> Ref:
>      >> [1]
>      >>
>     http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator
>      >> --
>      >> regards
>      >> Aravinda

