[Gluster-users] gluster & split-brain mess

Johan Moreels johan.moreels at meteo.be
Tue May 31 14:49:46 UTC 2016


Dear developers,


I would like to thank you for the great job in developing gluster for the 
open-source community and I value the various improvements in the successive 
versions from 3.4 up to 3.7 that I used.

However, as I already mentioned in my earlier posts, gluster has trouble
handling high loads, i.e. normal use, when multiple clients access its
volume. On top of this, in the past I used versions suffering from serious
memory leaks (for example 3.6.9, whose use was never explicitly discouraged
even though the issues were known), which resulted in brick processes being
killed by the kernel OOM killer. As a result, about 40,000 files are
currently in split-brain, even though gluster was used normally.

The current advice from the developers for solving split-brain cases is
to fix them manually, which in my case is simply not possible. This is why I
plead that you urgently decide to include CLI tools that can list files
according to the type of split-brain (file content differences and/or
attribute differences and, for attributes, which kind, e.g. quota) and an
automatic way to fix a specific type of split-brain. As an example, if
split-brain is detected due to a mismatch between quota attributes for many
files, the CLI should allow invalidating these attributes and refreshing them
at once for all these files. As I already said, it is nice to add new
features, but it is rather problematic if some basic management/repair tools
are lacking.
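
To illustrate the kind of per-type listing meant here, below is a minimal
sketch, not an existing gluster tool. It assumes that the replication
changelog attribute on a brick (trusted.afr.<volume>-client-N, as printed by
`getfattr -e hex`) is a 12-byte value holding three big-endian 32-bit
counters for pending data, metadata, and entry operations, in that order.
The function names are hypothetical, and the sketch does not distinguish
which metadata attribute (e.g. quota) differs.

```python
import struct


def decode_afr_changelog(hex_value):
    """Decode a changelog xattr value given as a hex string.

    Assumption: the value is 12 bytes, i.e. three big-endian unsigned
    32-bit counters in the order data, metadata, entry.
    """
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError("expected a 12-byte changelog value")
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def pending_types(hex_value):
    """Return the operation types that have a non-zero pending counter."""
    counters = decode_afr_changelog(hex_value)
    return [kind for kind, count in counters.items() if count]
```

Grouping the affected files by the result of pending_types() would give the
per-type list asked for above, at least at the data/metadata/entry level.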

I must really stress that it is not the task of users to develop
strategies/scripts to fix split-brain issues (this adds to our normal
workload) but that of the developers, or am I missing something here?

And finally, could you stop printing bare gfids in the logs and in the heal
CLI output and give the file or directory path instead? The gfid alone is
simply useless for finding the file/dir on a multi-terabyte volume/bricks in
a decent amount of time.
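
For reference, this is roughly what a user has to do today to turn a gfid
into a path. It is a minimal sketch under these assumptions: each brick keeps
an entry at .glusterfs/<first 2 hex chars>/<next 2 hex chars>/<gfid>, which
is a hard link for regular files and a symlink for directories. The function
names are hypothetical. For regular files the sketch has to walk the whole
brick to find the matching inode, which is exactly the slow part.

```python
import os
import uuid


def gfid_backend_path(brick_root, gfid):
    """Build the assumed .glusterfs entry path for a gfid on one brick."""
    g = str(uuid.UUID(gfid))  # validates and normalizes to lowercase
    return os.path.join(brick_root, ".glusterfs", g[0:2], g[2:4], g)


def resolve_gfid(brick_root, gfid):
    """Return the brick path(s) that correspond to a gfid."""
    backend = gfid_backend_path(brick_root, gfid)
    if os.path.islink(backend):
        # Assumed directory case: follow the symlink chain to the real path.
        return [os.path.realpath(backend)]
    target = os.stat(backend)
    matches = []
    for dirpath, dirnames, filenames in os.walk(brick_root):
        if dirpath == brick_root and ".glusterfs" in dirnames:
            dirnames.remove(".glusterfs")  # skip the internal directory
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # Same device and inode means a hard link to the same file.
            if (st.st_dev, st.st_ino) == (target.st_dev, target.st_ino):
                matches.append(path)
    return matches
```

On a brick with millions of files, the os.walk() pass alone can take hours,
which is why a path printed directly by the CLI would help so much.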


Thanks,


Alessandro.


