[Gluster-users] Is rebalance completely broken on 3.5.3 ?

Mon Mar 23 10:41:23 UTC 2015

Hi Olav,

Thanks for the info. I read the whole thread that you sent me, and I am more scared 
than ever. The fact that the developers do not have a clue about what is causing this 
issue is just frightening.

Concerning my issue: after two days (a full heal is ongoing on the volume), I did not 
get any error messages from the client when trying to list the affected files, but 
the file .forward showed up twice, with the same content, size, permissions and date, 
which is consistent with what you saw previously. I simply removed the file TWICE 
with rm on the client and copied back a sane version. The million-dollar question 
is: are there more files in a similar state on my 90 TB volume? I am putting off 
running a find on the whole volume to find out...
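
For what it is worth, the volume-wide check could be sketched roughly as below. 
This is only a minimal sketch, not a tested tool: it assumes that a duplicated 
entry shows up twice in a plain directory listing on the client mount (as .forward 
did for me), and the function names and the /mnt/gluster path are made up for 
illustration.

```python
import os
from collections import Counter


def duplicated_names(names):
    """Return {name: count} for every name that appears more than once."""
    counts = Counter(names)
    return {name: n for name, n in counts.items() if n > 1}


def scan_for_duplicates(root, listdir=os.listdir):
    """Walk the tree under root and yield (directory, name, count) for each
    name that one directory listing returns more than once.

    listdir is injectable so the logic can be exercised without a real
    volume; by default the client mount is listed with os.listdir.
    """
    stack = [root]
    while stack:
        directory = stack.pop()
        try:
            names = listdir(directory)
        except OSError as exc:
            # Assumption: unreadable directories are reported and skipped.
            print(f"cannot list {directory}: {exc}")
            continue
        for name, n in sorted(duplicated_names(names).items()):
            yield directory, name, n
        # Descend into each real subdirectory once; do not follow symlinks.
        for name in set(names):
            path = os.path.join(directory, name)
            if os.path.isdir(path) and not os.path.islink(path):
                stack.append(path)


# Hypothetical usage on a client mount point:
# for directory, name, n in scan_for_duplicates("/mnt/gluster"):
#     print(f"{directory}: {name!r} listed {n} times")
```

On a 90 TB volume this would still be a full crawl through the client, so it puts 
the same load on the bricks as the find I am putting off.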

What also concerns me is the absence of any acknowledgement or reply from the 
developers concerning this severe issue. The fact that only end-users on production 
setups hit this issue, while it cannot be reproduced in labs, should be a clear signal 
that it should be addressed as a priority, from my point of view. Lab testing should 
also try to mimic real-life use, with brick servers under heavy load (> 10) and several 
tens of clients accessing the gluster volume, to track down all possible issues 
resulting from network, I/O, or other timeouts.

Thanks for your help,

Alessandro.
