[Gluster-users] Per-client prefered server?

Krutika Dhananjay kdhananj at redhat.com
Fri Mar 4 09:56:37 UTC 2016


Hi, 

Up to and including 3.5.x, there was a read-child selection mode called 'first responder', in which the brick that responds first to a particular client becomes that client's read child.
This mode was removed when the replication module was largely rewritten for 3.6.0.

There is a workaround, though. Could you share the output of `gluster volume info <VOL>`?

-Krutika 

----- Original Message -----

> From: "Yannick Perret" <yannick.perret at liris.cnrs.fr>
> To: "Saravanakumar Arumugam" <sarumuga at redhat.com>, gluster-users at gluster.org
> Sent: Friday, March 4, 2016 2:43:16 PM
> Subject: Re: [Gluster-users] Per-client prefered server?

> On 03/03/2016 15:17, Saravanakumar Arumugam wrote:

> > On 03/03/2016 05:38 PM, Yannick Perret wrote:
> 

> > > Hello,
> > 
> 

> > > I can't find out whether it is possible to set a preferred server on a
> > > per-client basis for replica volumes, so I am asking here.
> > 
> 

> > > The context: we have 2 storage servers, one in each building. We also
> > > have several virtual machines in each building, and they can migrate from
> > > one building to another (depending on load, maintenance…).
> > 
> 

> > > So (for testing at this time) I set up a 2-way replica volume, with one
> > > replica on each storage server of course. As most of our volumes are "many
> > > reads - few writes", it would be better for bandwidth if each client used
> > > the "nearest" storage server (local building switch), for reading of
> > > course. The 2 buildings have a good network link, but we prefer to
> > > minimize data transfers between them when they are not needed (this link
> > > is shared).
> > 
> 

> > > Can you see a solution for this kind of tuning? As far as I understand,
> > > geo-replication is not really what I need, is it?
> > 
> 

> > Yes, geo-replication cannot be used here, as you wish to carry out "write"
> > operations on the slave side.
> 

> Ok, thanks. I was pretty sure that was the case, but I preferred to ask.

> > > The "cluster.read-subvolume" option exists, of course, but we can have
> > > clients in both buildings, so a per-volume option is not what we need. A
> > > per-client equivalent of this option would be nice.
> > 
> 

> > > I tested a small patch of my own to do this, and it seems to work fine as
> > > far as I can see, but 1. before continuing this way I would first like to
> > > check whether another way exists, and 2. I'm not familiar with the whole
> > > code base, so I'm not sure that my tests follow the "state of the art"
> > > for glusterfs.
> > 
> 

> > maybe you should share that interesting patch :) and get better feedback
> > about your test case.
> 

> My "patch" is quite simple: I added in afr_read_subvol_select_by_policy()
> (xlators/cluster/afr/src/afr-common.c) a target selection similar to the one
> managed by "read_child" configuration (see patches at the end).

> Of course I also added the definition of this "forced_child" in afr.h, in the
> same way favorite_child or read_child is defined.

> My real problem here is how to tell the client to change its "forced-child"
> value.

> I did this by reading it from a local file (/var/lib/glusterd/forced-child)
> in init() and reconfigure() (from xlators/cluster/afr/src/afr.c). This is
> fine at startup and when the volume configuration changes, but I found that
> sending a SIGHUP is not enough, because the client detects that no change
> occurred and does not call reconfigure(). So for my tests I modified
> glusterfsd/src/glusterfsd-mgmt.c so that if /var/lib/glusterd/forced-child
> exists it behaves as if a configuration change occurred (and so calls
> reconfigure(), which reloads the forced-child value).

> At this point it works as I expected, but I think it should be possible to
> handle a new forced-child value without going through the whole
> reconfigure().
> Moreover this file contains a raw number; it would be better to use a server
> name, which would then be converted into an index.
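
The name-to-index conversion mentioned above could look roughly like the sketch below. It assumes the client already holds an array of server names in the same order as the child indices; that array, the helper name forced_child_from_name() and the example host names are all hypothetical, not existing glusterfs code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: return the index of `server` in `names`,
   or -1 if it is absent (meaning "no forced child"). */
static int forced_child_from_name(const char *server,
                                  const char *const names[],
                                  int child_count)
{
    assert(names != NULL);
    if (server == NULL)
        return -1;
    for (int i = 0; i < child_count; i++) {
        /* exact match only; no DNS or alias resolution is attempted */
        if (names[i] != NULL && strcmp(names[i], server) == 0)
            return i;
    }
    return -1;
}
```

An unknown name maps to -1, which matches the "unset" value the patch already uses for forced_child, so a stale or mistyped name would simply fall back to the normal read policy.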

> If possible, might it be better to send the new value using the 'gluster'
> command on the client? I.e. something like
> 'gluster volume set client.prefered-server SERVER-NAME'?

> Any advice is welcome.

> Regards,
> --
> Y.

> <<<
> --- glusterfs-3.6.7/xlators/cluster/afr/src/afr-common.c 2015-11-25 12:55:58.000000000 +0100
> +++ glusterfs-3.6.7b/xlators/cluster/afr/src/afr-common.c 2015-12-10 14:59:18.898580772 +0100
> @@ -764,10 +764,18 @@
> int i = 0;
> int read_subvol = -1;
> afr_private_t *priv = NULL;
> - afr_read_subvol_args_t local_args = {0,};
> + afr_read_subvol_args_t local_args = {0,};

> priv = this->private;

> +
> + /* if forced-child use it */
> + if ((priv->forced_child >= 0)
> + && (priv->forced_child < priv->child_count)
> + && (readable[priv->forced_child])) {
> + return priv->forced_child;
> + }
> +
> /* first preference - explicitly specified or local subvolume */
> if (priv->read_child >= 0 && readable[priv->read_child])
> return priv->read_child;
> >>>
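
The order of preference that this hunk introduces can be modelled outside of glusterfs as follows. This is a simplified sketch, not the real afr_read_subvol_select_by_policy(): the read_policy_t struct and select_read_child() are hypothetical names, and the final "first readable child" fallback is only a stand-in for whatever policies the real function applies after read_child.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for the relevant afr_private_t fields. */
typedef struct {
    int child_count;
    int forced_child; /* per-client override, -1 if unset */
    int read_child;   /* per-volume preference, -1 if unset */
} read_policy_t;

/* readable[i] is non-zero when child i holds a good copy. */
static int select_read_child(const read_policy_t *p,
                             const unsigned char *readable)
{
    assert(p != NULL && readable != NULL);

    /* 1. per-client forced child, if valid and readable */
    if (p->forced_child >= 0 && p->forced_child < p->child_count
        && readable[p->forced_child])
        return p->forced_child;

    /* 2. per-volume read_child, if valid and readable */
    if (p->read_child >= 0 && p->read_child < p->child_count
        && readable[p->read_child])
        return p->read_child;

    /* 3. placeholder fallback: first readable child */
    for (int i = 0; i < p->child_count; i++)
        if (readable[i])
            return i;

    return -1; /* nothing readable */
}
```

The point of the model is that the forced child is only honoured while it is readable; if the local brick is down or stale, the client falls through to the existing choices.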

> <<<
> @@ -83,6 +83,7 @@
> unsigned int hash_mode; /* for when read_child is not set */
> int favorite_child; /* subvolume to be preferred in resolving
> split-brain cases */
> + int forced_child; /* child to use (if possible) */

> gf_boolean_t inodelk_trace;
> gf_boolean_t entrylk_trace;
> >>>

> <<<
> --- glusterfs-3.6.7/xlators/cluster/afr/src/afr.c 2015-11-25 12:55:58.000000000 +0100
> +++ glusterfs-3.6.7b/xlators/cluster/afr/src/afr.c 2015-12-10 16:34:55.530790442 +0100
> @@ -23,6 +23,7 @@

> struct volume_options options[];

> +
> int32_t
> notify (xlator_t *this, int32_t event,
> void *data, ...)
> @@ -106,9 +107,26 @@
> int ret = -1;
> int index = -1;
> char *qtype = NULL;
> + FILE *prefer = NULL;
> + int i = -1;

> priv = this->private;

> + /* if /var/lib/glusterd/forced-child exists read the content
> + and use it as prefered target for read */
> + priv->forced_child = -1;
> + prefer = fopen("/var/lib/glusterd/forced-child", "r");
> + if (prefer) {
> + if (fscanf(prefer, "%d", &i) == 1) {
> + if ((i >= 0) && (i < priv->child_count)) {
> + priv->forced_child = i;
> + gf_log (this->name, GF_LOG_INFO,
> + "using %d as forced-child", i);
> + }
> + }
> + fclose(prefer);
> + }
> +
> GF_OPTION_RECONF ("afr-dirty-xattr",
> priv->afr_dirty, options, str,
> out);
> @@ -234,6 +252,7 @@
> int read_subvol_index = -1;
> xlator_t *fav_child = NULL;
> char *qtype = NULL;
> + FILE *prefer = NULL;

> if (!this->children) {
> gf_log (this->name, GF_LOG_ERROR,
> @@ -261,6 +280,21 @@

> priv->read_child = -1;

> + /* if /var/lib/glusterd/forced-child exists read the content
> + and use it as prefered target for read */
> + priv->forced_child = -1;
> + prefer = fopen("/var/lib/glusterd/forced-child", "r");
> + if (prefer) {
> + if (fscanf(prefer, "%d", &i) == 1) {
> + if ((i >= 0) && (i < priv->child_count)) {
> + priv->forced_child = i;
> + gf_log (this->name, GF_LOG_INFO,
> + "using %d as forced-child", i);
> + }
> + }
> + fclose(prefer);
> + }
> +
> GF_OPTION_INIT ("afr-dirty-xattr", priv->afr_dirty, str, out);

> GF_OPTION_INIT ("metadata-splitbrain-forced-heal",
> >>>

> <<<
> --- glusterfs-3.6.7/glusterfsd/src/glusterfsd-mgmt.c 2015-11-25 12:55:58.000000000 +0100
> +++ glusterfs-3.6.7b/glusterfsd/src/glusterfsd-mgmt.c 2015-12-10 16:34:20.530789162 +0100
> @@ -1502,7 +1502,9 @@
> if (size == oldvollen && (memcmp (oldvolfile, rsp.spec, size) == 0)) {
> gf_log (frame->this->name, GF_LOG_INFO,
> "No change in volfile, continuing");
> - goto out;
> + if (access("/var/lib/glusterd/forced-child", R_OK) != 0) {
> + goto out; /* don't skip if exists to re-read forced-child */
> + }
> }

> tmpfp = tmpfile ();
> >>>
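
The decision made by this last hunk can be isolated into a small predicate, shown here as a sketch. needs_reconfigure() and its parameters are hypothetical names for illustration; the only real calls are memcmp() and POSIX access(), so this assumes a POSIX system.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical predicate: should the client go on to reconfigure?
   True when the volfile changed, or when it is unchanged but the
   override file at `override_path` exists and is readable. */
static bool needs_reconfigure(const char *old_spec, size_t old_len,
                              const char *new_spec, size_t new_len,
                              const char *override_path)
{
    assert(old_spec != NULL && new_spec != NULL);

    bool unchanged = (old_len == new_len)
                     && memcmp(old_spec, new_spec, new_len) == 0;
    if (!unchanged)
        return true;

    /* volfile identical: only continue if the override file is readable */
    return override_path != NULL && access(override_path, R_OK) == 0;
}
```

One consequence worth noting: as long as the override file exists, every volfile fetch is treated as a change, which is why handling the forced-child value without a full reconfigure() would be cleaner.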

> --
> Y.
