[Gluster-devel] Gluster Sharding and Geo-replication
Shyam
srangana at redhat.com
Thu Sep 3 14:44:07 UTC 2015
On 09/03/2015 02:43 AM, Krutika Dhananjay wrote:
>
>
> ------------------------------------------------------------------------
>
> *From: *"Shyam" <srangana at redhat.com>
> *To: *"Krutika Dhananjay" <kdhananj at redhat.com>
> *Cc: *"Aravinda" <avishwan at redhat.com>, "Gluster Devel"
> <gluster-devel at gluster.org>
> *Sent: *Wednesday, September 2, 2015 11:13:55 PM
> *Subject: *Re: [Gluster-devel] Gluster Sharding and Geo-replication
>
> On 09/02/2015 10:47 AM, Krutika Dhananjay wrote:
> >
> >
> >
> ------------------------------------------------------------------------
> >
> > *From: *"Shyam" <srangana at redhat.com>
> > *To: *"Aravinda" <avishwan at redhat.com>, "Gluster Devel"
> > <gluster-devel at gluster.org>
> > *Sent: *Wednesday, September 2, 2015 8:09:55 PM
> > *Subject: *Re: [Gluster-devel] Gluster Sharding and Geo-replication
> >
> > On 09/02/2015 03:12 AM, Aravinda wrote:
> > > The Geo-replication and Sharding teams today discussed the approach
> > > to making Geo-replication Sharding-aware. Details are below.
> > >
> > > Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur
> > >
> > > - Both Master and Slave Volumes should be Sharded Volumes with the
> > > same configuration.
> >
> > If I am not mistaken, geo-rep supports replicating to a non-gluster
> > local FS at the slave end. Is this correct? If so, would this
> > limitation not make that problematic?
> >
> > When you state *same configuration*, I assume you mean the sharding
> > configuration, not the volume graph, right?
> >
> > That is correct. The only requirement is for the slave to have the
> > shard translator (since someone needs to present the aggregated view
> > of the file to readers on the slave).
> > Also, the shard-block-size needs to be kept the same between master and
> > slave. The rest of the configuration (like the number of subvols of
> > DHT/AFR) can vary across master and slave.
>
> Do we need to have the shard block size the same? I assume the file
> carries an xattr that records the size it is sharded with
> (trusted.glusterfs.shard.block-size), so if this is synced across, it
> would do. If this is true, what it would mean is that "a sharded volume
> needs a shard-supporting slave to geo-rep to".
>
> Yep. I too feel it should probably not be necessary to enforce the
> same shard size everywhere, as long as the shard translator on the
> slave takes care not to further "shard" the individual shards gsyncd
> would write on the slave volume.
> This is especially true if different files/images/vdisks on the master
> volume are associated with different block sizes.
> This logic has to be built into the shard translator based on parameters
> (client-pid, parent directory of the file being written to).
> What this means is that the shard-block-size attribute on the slave
> would essentially be a don't-care parameter. I need to give all this
> some more thought though.
Understood, thanks.
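As a side note on that xattr: a minimal sketch (Python) of how one might
compare the recorded block size on master and slave copies. It assumes
trusted.glusterfs.shard.block-size is readable at the layer being
inspected (e.g. the brick backend; internal xattrs are normally filtered
on a regular client mount) and that its value is an 8-byte big-endian
integer; both are assumptions here, not something confirmed in this
thread.

    #!/usr/bin/env python3
    # Sketch: read the shard block size recorded on a file's xattr.
    # Assumption: the value is an 8-byte big-endian integer and the xattr
    # is visible at the path being inspected (e.g. the brick backend).
    import os
    import struct
    import sys

    SHARD_BLOCK_SIZE_XATTR = "trusted.glusterfs.shard.block-size"

    def shard_block_size(path):
        """Return the shard block size (in bytes) recorded on path, or None."""
        try:
            raw = os.getxattr(path, SHARD_BLOCK_SIZE_XATTR)
        except OSError:
            return None  # xattr missing, or filtered at this layer
        return struct.unpack(">Q", raw)[0]

    if __name__ == "__main__":
        for p in sys.argv[1:]:
            print(p, shard_block_size(p))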
>
> -Krutika
>
> >
> > -Krutika
> >
> >
> >
> > > - In Changelog, record changes related to Sharded files too, just
> > > like any regular file.
> > > - Sharding should allow Geo-rep to list/read/write Sharding-internal
> > > Xattrs if the Client PID is gsyncd's (-1)
> > > - Sharding should allow read/write of Sharded files (that is, files
> > > in the .shards directory) if the Client PID is GSYNCD's
> > > - Sharding should return the actual file instead of returning the
> > > aggregated content when the Main file is requested (Client PID is
> > > GSYNCD's)
> > >
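To make that client-PID gating concrete, here is a small illustrative
model (Python pseudocode, not the actual shard translator, which is a C
xlator); the helper names are hypothetical, and the only fact taken from
the thread is that gsyncd is identified by client PID -1.

    # Illustrative model of the gating described above; GSYNCD_CLIENT_PID
    # and handle_read() are hypothetical names, not GlusterFS APIs.
    GSYNCD_CLIENT_PID = -1  # per the thread, gsyncd's client PID

    def is_gsyncd(client_pid):
        return client_pid == GSYNCD_CLIENT_PID

    def handle_read(client_pid, path):
        """Decide what the shard translator should present for path."""
        if path.startswith(".shards/"):
            # Raw shard pieces are internal; only gsyncd may access them.
            if not is_gsyncd(client_pid):
                raise PermissionError("internal shard, access denied")
            return "raw-shard"
        if is_gsyncd(client_pid):
            # gsyncd gets the main file as-is (first chunk plus internal
            # xattrs), not the aggregated content, so each piece can be
            # synced independently.
            return "main-file-unaggregated"
        # Regular clients see the stitched-together (aggregated) file.
        return "aggregated-view"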
> > > For example, a file f1 is created with GFID G1.
> > >
> > > When the file grows, it gets sharded into chunks (say, 5 chunks).
> > >
> > > f1 G1
> > > .shards/G1.1 G2
> > > .shards/G1.2 G3
> > > .shards/G1.3 G4
> > > .shards/G1.4 G5
> > >
> > > In Changelog, this is recorded as 5 different files as below
> > >
> > > CREATE G1 f1
> > > DATA G1
> > > META G1
> > > CREATE G2 PGS/G1.1
> > > DATA G2
> > > META G1
> > > CREATE G3 PGS/G1.2
> > > DATA G3
> > > META G1
> > > CREATE G4 PGS/G1.3
> > > DATA G4
> > > META G1
> > > CREATE G5 PGS/G1.4
> > > DATA G5
> > > META G1
> > >
> > > Here, PGS is the GFID of the .shards directory.
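Just to illustrate that recording the shards as independent files keeps
them correlatable, a small sketch (Python) that recovers which base GFID
each shard belongs to from its name under PGS. The record tuples below
use the simplified format listed above, not the actual on-disk CHANGELOG
encoding, and the <base-GFID>.<n> naming is taken from the example.

    from collections import defaultdict

    # Simplified records in the form listed above: (op, gfid, name);
    # name is empty for DATA/META entries. "PGS" stands in for the
    # GFID of the .shards directory.
    records = [
        ("CREATE", "G1", "f1"),       ("DATA", "G1", ""),
        ("CREATE", "G2", "PGS/G1.1"), ("DATA", "G2", ""),
        ("CREATE", "G3", "PGS/G1.2"), ("DATA", "G3", ""),
        ("CREATE", "G4", "PGS/G1.3"), ("DATA", "G4", ""),
        ("CREATE", "G5", "PGS/G1.4"), ("DATA", "G5", ""),
    ]

    def base_gfid(name):
        """Recover the base file's GFID from a shard name like 'PGS/G1.3'."""
        if name.startswith("PGS/"):
            return name.split("/", 1)[1].rsplit(".", 1)[0]  # "G1.3" -> "G1"
        return None

    def shards_per_base(recs):
        """Group shard GFIDs under the GFID of the file they belong to."""
        groups = defaultdict(set)
        for op, gfid, name in recs:
            if op == "CREATE" and base_gfid(name):
                groups[base_gfid(name)].add(gfid)
        return dict(groups)

    print(shards_per_base(records))  # {'G1': {'G2', 'G3', 'G4', 'G5'}}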
> > >
> > > Geo-rep will create these files independently in the Slave Volume
> > > and sync the Xattrs of G1. Data can be read only when all the
> > > chunks are synced to the Slave Volume. Data can be read partially if
> > > the main/first file and some of the chunks are synced to the Slave.
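A minimal sketch (Python) of that readability condition on the slave,
assuming the expected size and block size of the file are known (for
instance from the synced xattrs of G1) and that shards follow the
<base-GFID>.<n> naming shown above; the path layout and helper names here
are assumptions for illustration, not gluster internals.

    import math
    import os

    def expected_extra_chunks(file_size, block_size):
        """Chunks beyond the main file: the first chunk is the main file
        itself, the rest live under .shards/ as <base-GFID>.<n>."""
        return max(math.ceil(file_size / block_size) - 1, 0)

    def all_chunks_present(root, base_gfid, file_size, block_size):
        """True if every expected .shards/<base_gfid>.<n> exists under
        root (a path where the .shards directory is visible)."""
        shards_dir = os.path.join(root, ".shards")
        for n in range(1, expected_extra_chunks(file_size, block_size) + 1):
            if not os.path.exists(os.path.join(shards_dir,
                                               "%s.%d" % (base_gfid, n))):
                return False
        return True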
> > >
> > > Please add anything I missed. Comments and suggestions welcome.
> > >
> > > regards
> > > Aravinda
> > >
> > > On 08/11/2015 04:36 PM, Aravinda wrote:
> > >> Hi,
> > >>
> > >> We are considering different approaches to add support in
> > >> Geo-replication for Sharded Gluster Volumes [1].
> > >>
> > >> *Approach 1: Geo-rep: Sync full file*
> > >> - In Changelog, record only the main file's details, in the same
> > >> brick where it is created
> > >> - Record a DATA entry in Changelog whenever there is any
> > >> addition/change to the sharded file
> > >> - Geo-rep rsync will checksum the file as a full file from the
> > >> mount and sync it as a new file
> > >> - Slave-side sharding is managed by the Slave Volume
> > >> *Approach 2: Geo-rep: Sync sharded files separately*
> > >> - Geo-rep rsync will do the checksum for the sharded files only
> > >> - Geo-rep syncs each sharded file independently as a new file
> > >> - [UNKNOWN] Sync the internal xattrs (file size and block count) of
> > >> the main sharded file to the Slave Volume to maintain the same
> > >> state as in the Master.
> > >> - Sharding translator to allow file creation under the .shards dir
> > >> for gsyncd, that is, where the Parent GFID is the .shards directory
> > >> - If sharded files are modified during a Geo-rep run, we may end up
> > >> with stale data on the Slave.
> > >> - Files on the Slave Volume may not be readable unless all sharded
> > >> files sync to the Slave (each brick in the Master independently
> > >> syncs files to the slave)
> > >>
> > >> The first approach looks cleaner, but we have to analyze the rsync
> > >> checksum performance on big files (sharded in the backend, accessed
> > >> as one big file by rsync).
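For a first rough feel of that checksum cost, one could simply time a
checksum-based dry run of rsync over a large file on the mount; this is
only a crude probe, not how geo-rep actually drives rsync, and the paths
in the example are made up.

    import subprocess
    import time

    def time_rsync_checksum(src, dest):
        """Time a checksum-only dry run of rsync over a single file."""
        start = time.monotonic()
        subprocess.run(["rsync", "--checksum", "--dry-run", src, dest],
                       check=True)
        return time.monotonic() - start

    # Hypothetical paths: a big sharded image seen via the master mount,
    # compared against an existing copy.
    # print(time_rsync_checksum("/mnt/master-vol/vm1.img", "/backup/vm1.img"))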
> > >>
> > >> Let us know your thoughts. Thanks
> > >>
> > >> Ref:
> > >> [1] http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator
> > >> --
> > >> regards
> > >> Aravinda
> > >>
> > >>
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel at gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
> >
> >
>
>