[Gluster-devel] Throttling xlator on the bricks
Richard Wareing
rwareing at fb.com
Fri Feb 12 18:43:01 UTC 2016
Hey Ravi,
I'll ping Shreyas about this today. There's also a patch we'll need for multi-threaded SHD to fix the least-pri queuing: the PID of the process wasn't tagged correctly via the call frame in my original patch. The patch below fixes this (for 3.6.3). I didn't see multi-threaded self-heal on github/master yet, so let me know which branch you need this patch on and I can come up with a clean patch.
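For context on why the frame matters: io-threads classifies a fop as least-pri when the frame's root PID is one of the reserved negative client PIDs, and frames created by AFR for the self-heal daemon carry such a PID. Below is a minimal standalone sketch of that check; the types and values are stand-ins for illustration (the real definitions live in libglusterfs and the io-threads xlator), not the actual headers.

/* Stand-in types: the real ones are in glusterfs.h and io-threads.h. */
#include <stdio.h>

typedef enum {
        GF_CLIENT_PID_MAX        = 0,    /* reserved client PIDs are negative */
        GF_CLIENT_PID_SELF_HEALD = -6    /* illustrative value */
} gf_client_pid_t;

typedef enum { IOT_PRI_HI = 0, IOT_PRI_NORMAL, IOT_PRI_LO, IOT_PRI_LEAST } iot_pri_t;

typedef struct { int pid; } call_stack_t;            /* stands in for frame->root */
typedef struct { call_stack_t *root; } call_frame_t;

/* Mirrors the test iot_schedule() performs: a negative (reserved)
 * client PID sends the fop to the least-pri queue. A synctask created
 * with a NULL frame issues fops on an untagged frame, so they bypass
 * this check and land in a normal-priority queue -- hence the fix below. */
static iot_pri_t
classify_fop (call_frame_t *frame)
{
        if (frame->root->pid < GF_CLIENT_PID_MAX)
                return IOT_PRI_LEAST;
        return IOT_PRI_NORMAL;
}

int
main (void)
{
        call_stack_t shd_root  = { .pid = GF_CLIENT_PID_SELF_HEALD };
        call_stack_t untagged  = { .pid = 1234 };
        call_frame_t shd_frame = { .root = &shd_root };
        call_frame_t def_frame = { .root = &untagged };

        printf ("SHD frame      -> %s\n",
                classify_fop (&shd_frame) == IOT_PRI_LEAST ? "least-pri" : "normal");
        printf ("untagged frame -> %s\n",
                classify_fop (&def_frame) == IOT_PRI_LEAST ? "least-pri" : "normal");
        return 0;
}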
Richard
=====
diff --git a/xlators/cluster/afr/src/afr-self-heald.c b/xlators/cluster/afr/src/afr-self-heald.c
index 028010d..b0f6248 100644
--- a/xlators/cluster/afr/src/afr-self-heald.c
+++ b/xlators/cluster/afr/src/afr-self-heald.c
@@ -532,6 +532,9 @@ afr_mt_process_entries_done (int ret, call_frame_t *sync_frame,
pthread_cond_signal (&mt_data->task_done);
}
pthread_mutex_unlock (&mt_data->lock);
+
+ if (task_ctx->frame)
+ AFR_STACK_DESTROY (task_ctx->frame);
GF_FREE (task_ctx);
return 0;
}
@@ -787,6 +790,7 @@ _afr_mt_create_process_entries_task (xlator_t *this,
int ret = -1;
afr_mt_process_entries_task_ctx_t *task_ctx;
afr_mt_data_t *mt_data;
+ call_frame_t *frame = NULL;
mt_data = &healer->mt_data;
@@ -799,6 +803,8 @@ _afr_mt_create_process_entries_task (xlator_t *this,
if (!task_ctx)
goto err;
+ task_ctx->frame = afr_frame_create (this);
+
INIT_LIST_HEAD (&task_ctx->list);
task_ctx->readdir_xl = this;
task_ctx->healer = healer;
@@ -812,7 +818,7 @@ _afr_mt_create_process_entries_task (xlator_t *this,
// This returns immediately, and afr_mt_process_entries_done will
// be called when the task is completed e.g. our queue is empty
ret = synctask_new (this->ctx->env, afr_mt_process_entries_task,
- afr_mt_process_entries_done, NULL,
+ afr_mt_process_entries_done, task_ctx->frame,
(void *)task_ctx);
if (!ret) {
diff --git a/xlators/cluster/afr/src/afr-self-heald.h b/xlators/cluster/afr/src/afr-self-heald.h
index 817e712..1588fc8 100644
--- a/xlators/cluster/afr/src/afr-self-heald.h
+++ b/xlators/cluster/afr/src/afr-self-heald.h
@@ -74,6 +74,7 @@ typedef struct afr_mt_process_entries_task_ctx_ {
subvol_healer_t *healer;
xlator_t *readdir_xl;
inode_t *idx_inode; /* inode ref for xattrop dir */
+ call_frame_t *frame;
unsigned int entries_healed;
unsigned int entries_processed;
unsigned int already_healed;
Richard
________________________________________
From: Ravishankar N [ravishankar at redhat.com]
Sent: Sunday, February 07, 2016 11:15 PM
To: Shreyas Siravara
Cc: Richard Wareing; Vijay Bellur; Gluster Devel
Subject: Re: [Gluster-devel] Throttling xlator on the bricks
Hello,
On 01/29/2016 06:51 AM, Shreyas Siravara wrote:
> So the way our throttling works is (intentionally) very simplistic.
>
> (1) When someone mounts an NFS share, we tag the frame with a 32 bit hash of the export name they were authorized to mount.
> (2) io-stats keeps track of the "current rate" of fops we're seeing for that particular mount, using a sampling of fops and a moving average over a short period of time.
> (3) Based on whether the share violated its allowed rate (which is defined in a config file), we tag the FOP as "least-pri". Of course this assumes that all NFS endpoints are receiving roughly the same # of FOPs. The rate defined in the config file is a *per-NFS-endpoint* number. So if your cluster has 10 NFS endpoints, and you've pre-computed that it can do roughly 1000 FOPs per second, the rate in the config file would be 100.
> (4) IO-Threads then shoves the FOP into the least-pri queue, rather than its default. The value is honored all the way down to the bricks.
>
> The code is actually complete, and I'll put it up for review after we iron out a few minor issues.
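For reference while the patch is pending, here's a rough standalone sketch of what steps (2)-(4) above could look like; every name below is invented for illustration and none of it is from Shreyas's actual patch.

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical per-mount throttle state, keyed by the 32-bit hash of
 * the export name from step (1). */
typedef struct {
        uint32_t export_hash;   /* tag applied to frames at mount time */
        double   avg_fops;      /* moving average of sampled fops/sec */
        double   allowed_rate;  /* per-NFS-endpoint rate from the config file */
} throttle_state_t;

/* Step (2): fold a sampled fop rate into a moving average; an
 * exponentially weighted average is one plausible choice. */
static void
throttle_sample (throttle_state_t *ts, double observed_fops, double alpha)
{
        ts->avg_fops = alpha * observed_fops + (1.0 - alpha) * ts->avg_fops;
}

/* Step (3): if the share violated its allowed rate, the caller would
 * tag the fop "least-pri"; step (4) is io-threads honoring that tag
 * all the way down to the bricks. */
static bool
throttle_should_demote (const throttle_state_t *ts)
{
        return ts->avg_fops > ts->allowed_rate;
}

int
main (void)
{
        /* Per the example above: ~1000 FOPs/sec across 10 NFS endpoints
         * gives a configured per-endpoint rate of 100. */
        throttle_state_t ts = { .export_hash  = 0xdeadbeefu,
                                .avg_fops     = 0.0,
                                .allowed_rate = 1000.0 / 10 };

        throttle_sample (&ts, 180.0, 0.5);   /* avg = 90, under the limit */
        printf ("demote? %s\n", throttle_should_demote (&ts) ? "yes" : "no");
        throttle_sample (&ts, 180.0, 0.5);   /* avg = 135, over the limit */
        printf ("demote? %s\n", throttle_should_demote (&ts) ? "yes" : "no");
        return 0;
}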
Did you get a chance to send the patch? Just wanted to run some tests
and see if this is all we need at the moment to regulate shd traffic,
especially with Richard's multi-threaded heal patch
http://review.gluster.org/#/c/13329/ being revived and made ready for 3.8.
-Ravi
>
>> On Jan 27, 2016, at 9:48 PM, Ravishankar N <ravishankar at redhat.com> wrote:
>>
>> On 01/26/2016 08:41 AM, Richard Wareing wrote:
>>> In any event, it might be worth having Shreyas detail his throttling feature (that can throttle any directory hierarchy, no less) to illustrate how a simpler design can achieve similar results to these more complicated (and, it follows, bug-prone) approaches.
>>>
>>> Richard
>> Hi Shreyas,
>>
>> Wondering if you can share the details of the throttling feature you're working on. Even if there's no code, a description of what it is trying to achieve and how will be great.
>>
>> Thanks,
>> Ravi