[Gluster-users] Is rebalance completely broken on 3.5.3 ?
Alessandro Ipe
Alessandro.Ipe at meteo.be
Mon Mar 23 10:41:23 UTC 2015
Hi Olav,
Thanks for the info. I read the whole thread that you sent me... and I am more scared
than ever... The fact that the developers have no clue what is causing this
issue is just frightening.
Concerning my issue, apparently after two days (a full heal is ongoing on the volume),
I no longer got any error messages from the client when trying to list the incriminated
files, but the file .forward was listed twice, with the same content, size, permissions
and date... which is consistent with what you got previously... I simply removed the file
TWICE with rm on the client and copied back a sane version. The million-dollar
question is: are there more files in a similar state on my 90 TB volume? I am
delaying a find on the whole volume to find out...
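
For what it is worth, such a scan could be sketched roughly as below. This is only a
minimal sketch, assuming the duplicates show up as repeated names in the directory
listing returned by the client mount (as .forward did here); the function names
duplicate_names and scan_for_duplicates are made up for illustration and are not part
of any Gluster tool.

```python
import os
from collections import Counter


def duplicate_names(names):
    """Return the sorted names that occur more than once in one directory listing."""
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


def scan_for_duplicates(root, listdir=os.listdir):
    """Walk root and yield (directory, duplicated_names) for each affected directory.

    listdir is injectable so the logic can be tried without a real volume;
    by default it is os.listdir, which returns entries as the mount reports them.
    """
    stack = [root]
    while stack:
        directory = stack.pop()
        try:
            names = listdir(directory)
        except OSError:
            continue  # unreadable directory: skip it rather than abort the scan
        dups = duplicate_names(names)
        if dups:
            yield directory, dups
        # Descend into each subdirectory once, even if its name is listed twice.
        for name in set(names):
            path = os.path.join(directory, name)
            if os.path.isdir(path) and not os.path.islink(path):
                stack.append(path)
```

Looping over scan_for_duplicates("/path/to/client/mount") (a placeholder path) and
printing each result would give the list of directories to inspect. It only reads
directory listings, so it changes nothing on the volume, but on 90 TB it will still
generate a lot of lookups.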
What also concerns me is the absence of acknowledgement or reply from the
developers concerning this severe issue... The fact that only end-users on production
setups hit this issue while it cannot be reproduced in labs should be a clear signal that
this should be addressed in priority, from my point of view. And lab testing should also try
to mimic real-life use, with brick servers under heavy load (> 10) and several tens
of clients accessing the gluster volume, to track down all possible issues resulting from
network, I/O, ... timeouts.
Thanks for your help,
Alessandro.