[Gluster-devel] Sick but still "alive" nodes

Jeff Darcy jdarcy at redhat.com
Fri Jan 25 13:43:41 UTC 2013


On 01/25/2013 07:47 AM, jayunit100 at gmail.com wrote:
> Hi guys: I just saw an issue on the HDFS mailing list that might be a
> potential problem in gluster clusters.  It kind of reminds me of
> Jeff's idea of bricks as first-class objects in the API.
>
> What happens if a gluster brick is on a machine which, although still
> alive, performs poorly?
>
> Would such scenarios be detected, and if so, can the brick be
> decommissioned/ignored/moved?  If not, it would be a cool feature to
> have, because I'm sure it happens from time to time.

There's nothing currently in place to detect such a condition, and of
course if we can't detect it, we can't do anything about it.  There are
also several cases where we might actually make things worse by trying
to handle it ourselves.  For example, consider the case where the
slowness is caused by a short-lived contending activity.  We might well
react just as that activity subsides, suspending the brick just as
another brick is "going bad" due to similar transient activity there.
Similarly, if the system as a whole is truly overloaded, suspending
bricks is a bit like squeezing a water balloon: the "bulge" just
reappears elsewhere, and all we've done is diminish the total resources
available.
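
To make the timing problem concrete, here's a rough Python sketch (the
smoothing factor, threshold, and latency numbers are all invented).  A
smoothed per-brick latency with a fixed cutoff only crosses the line at
the tail of a short contention burst, so anything acting on it would
suspend a brick that's about to recover on its own:

    class BrickLatencyMonitor:
        """Toy slow-brick detector: EWMA of request latency against a
        fixed threshold.  Both parameters are made up for the demo."""
        def __init__(self, alpha=0.1, slow_threshold_ms=40.0):
            self.alpha = alpha
            self.threshold = slow_threshold_ms
            self.ewma = None

        def record(self, latency_ms):
            # Fold the newest sample into the smoothed latency.
            if self.ewma is None:
                self.ewma = latency_ms
            else:
                self.ewma = (self.alpha * latency_ms
                             + (1 - self.alpha) * self.ewma)

        def is_slow(self):
            return self.ewma is not None and self.ewma > self.threshold

    # Four samples of contention (110-130ms) in a stream of ~5ms samples.
    mon = BrickLatencyMonitor()
    for i, ms in enumerate([5, 5, 5, 120, 130, 125, 110, 6, 5, 5]):
        mon.record(ms)
        print(i, round(mon.ewma, 1), mon.is_slow())

The monitor first reports "slow" at sample 6, the last sample of the
burst, and is still reporting it at sample 7, after the contention has
already ended.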

I've seen problems like this with other parallel filesystems, and I'm
pretty sure I've read papers about them too.  IMO the right place to
deal with such issues is at the job-scheduler or similar level, where
more of the total system state is known.  What we can do is provide
more information about our part of the system state, plus levers that
those schedulers can pull when they decide that preparation or
correction for a higher-level event (one we probably don't even know
about) is appropriate.



