[Gluster-devel] Implementing multiplexing for self heal client.
rkavunga at redhat.com
Tue Jan 8 06:53:04 UTC 2019
I have completed the patches and pushed for reviews. Please feel free to
raise your review concerns/suggestions.
On 12/24/18 3:58 PM, RAFI KC wrote:
> On 12/21/18 6:56 PM, Sankarshan Mukhopadhyay wrote:
>> On Fri, Dec 21, 2018 at 6:30 PM RAFI KC <rkavunga at redhat.com> wrote:
>>> Hi All,
>>> What is the problem?
>>> As of now the self-heal client runs as one daemon per node; even if
>>> there are multiple volumes, there will only be one self-heal daemon.
>>> So, for each configuration change in the cluster to take effect, the
>>> self-heal daemon has to be reconfigured, but it doesn't have the
>>> ability to reconfigure dynamically. This means that when you have a
>>> lot of volumes in the cluster, every management operation that
>>> involves configuration changes, like volume start/stop, add/remove
>>> brick etc., will result in a self-heal daemon restart. If such
>>> operations are executed often, they not only slow down self-heal for
>>> a volume but also bloat the self-heal logs.
>> What is the value of the number of volumes when you write "lot of
>> volumes"? 1000 volumes, more etc
> Yes, more than 1000 volumes. It also depends on how often you execute
> the glusterd management operations mentioned above. Each time the
> self-heal daemon is restarted, it prints the entire graph, and these
> graph traces contribute the majority of the log's size.
>>> How to fix it?
>>> We are planning to follow a procedure similar to attaching/detaching
>>> graphs dynamically, as is done for brick multiplexing. The detailed
>>> steps are as below,
>>> 1) First step is to make shd a per-volume daemon, to
>>> generate/reconfigure volfiles on a per-volume basis.
>>> 1.1) This will help to attach the volfiles easily to existing
>>> shd daemon
>>> 1.2) This will help to send notifications to the shd daemon, as
>>> each volinfo keeps the daemon object
>>> 1.3) reconfiguring a particular subvolume is easier as we can check
>>> the topology better
>>> 1.4) With this change the volfiles will be moved to workdir/vols/
>>> 2) Writing new RPC requests like attach/detach_client_graph to
>>> support client attach/detach
>>> 2.1) Also, functions like graph reconfigure and mgmt_getspec_cbk
>>> have to be modified
>>> 3) Safely detaching a subvolume when there are pending frames to
>>> unwind
>>> 3.1) We can mark the client disconnected and make all the frames
>>> unwind with ENOTCONN
>>> 3.2) Or we can wait for all the i/o to unwind before switching to
>>> the new updated subvol
>>> 4) Handle scenarios like glusterd restart, node reboot, etc
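The attach/detach flow in steps 2 and 3 can be sketched as a toy model. This is a minimal illustration in Python; the class and method names (ShdProcess, HealGraph, attach, detach) are hypothetical stand-ins for the proposed RPCs, not Gluster internals:

```python
# Toy model of a single multiplexed shd process holding one heal graph
# per volume. detach() follows step 3.1: mark the client disconnected
# and unwind any pending frames with ENOTCONN.
ENOTCONN = "ENOTCONN"

class HealGraph:
    def __init__(self, volume):
        self.volume = volume
        self.connected = True
        self.pending_frames = []

class ShdProcess:
    def __init__(self):
        self.graphs = {}  # volume name -> HealGraph

    def attach(self, volume):
        # analogous to the proposed attach_client_graph RPC
        self.graphs[volume] = HealGraph(volume)

    def detach(self, volume):
        # mark disconnected, then unwind pending frames with ENOTCONN
        graph = self.graphs.pop(volume)
        graph.connected = False
        return [(frame, ENOTCONN) for frame in graph.pending_frames]

shd = ShdProcess()
shd.attach("vol1")
shd.attach("vol2")
shd.graphs["vol1"].pending_frames.append("frame-1")
unwound = shd.detach("vol1")
print(unwound)           # [('frame-1', 'ENOTCONN')]
print(list(shd.graphs))  # ['vol2']
```

The point of the model is that attaching or detaching one volume's graph leaves the other volumes' graphs in the same process untouched, so no daemon restart is needed.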
>>> At the moment we are not planning to limit the number of heal
>>> subvolumes per process, because with the current approach healing for
>>> every volume was already done from a single process, and we have not
>>> heard any major complaints about this.
>> Is the plan to not ever limit or, have a throttle set to a default
>> high(er) value? How would system resources be impacted if the proposed
>> design is implemented?
> The plan is to implement in a way that it can support more than one
> multiplexed self-heal daemon. The throttling function as of now
> returns the same process to multiplex, but it can be easily modified
> to create a new process.
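That throttling hook could look roughly like the following. This is an illustrative sketch only; the names (Process, pick_shd_process, max_volumes) are assumptions, not actual Gluster code:

```python
# Hypothetical throttling hook: today it effectively always returns the
# one existing shd process; with a per-process cap it would spawn new
# processes once existing ones are full.
class Process:
    def __init__(self, pid):
        self.pid = pid
        self.volumes = []

def pick_shd_process(processes, max_volumes=None):
    # current behaviour: multiplex everything into the first process
    # that still has room (no cap means the first process always wins)
    for proc in processes:
        if max_volumes is None or len(proc.volumes) < max_volumes:
            return proc
    # the easy modification: create a new process when all are full
    proc = Process(pid=len(processes) + 1)
    processes.append(proc)
    return proc

procs = [Process(pid=1)]
for vol in ["a", "b", "c"]:
    p = pick_shd_process(procs, max_volumes=2)
    p.volumes.append(vol)
print([(p.pid, p.volumes) for p in procs])  # [(1, ['a', 'b']), (2, ['c'])]
```

With max_volumes=None the function reduces to the current behaviour of returning the same single process for every volume.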
> This multiplexing logic won't utilize any additional resources beyond
> what it currently does.
> Rafi KC
>> Gluster-devel mailing list
>> Gluster-devel at gluster.org