[Gluster-users] Is it possible to experience data loss while rebalancing a volume?

DUCARROZ Birgit birgit.ducarroz at unifr.ch
Fri Jan 3 14:55:04 UTC 2020


Hi list, it's me again; sorry to bother you once more.

Last week, I added 2 new servers to my existing cluster.
Everything worked fine until I began to rebalance some volumes that 
contain a very large number of files.

Rebalancing failed on some servers and I experienced a lot of data loss, 
which was replicated to all servers.

I had no time to analyze the log files, but it happened at the same time 
as another "Transport endpoint not connected" error.

I had to restore this data from the last backup.

This is the situation:
From time to time I get a "Transport endpoint not connected" error. I 
posted this error, with a lot of log files, in an earlier thread 
("Transport Endpoint Not Connected When Writing a Lot of Files") started 
on October 11, 2019.

We did not find the definitive cause of these errors, but Amar 
suggested that I update to Gluster version 7, which I have now done on 
the two additional servers.

Currently, these two servers are attached again to the existing cluster, 
and I would like to try rebalancing again and then remove the old servers, 
which cause the transport endpoint errors. But I am hesitating, because 
people will start working again on Monday and another data loss would be 
catastrophic.
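
The planned sequence (rebalance first, then remove the old bricks) can be 
sketched as the dry run below. This is only a sketch: the volume name 
`myvol` and the brick `oldserver1:/data/brick1` are hypothetical 
placeholders, and the script prints the gluster commands instead of 
running them, so each one can be checked before it is executed by hand.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands, does not execute them.
# VOLNAME and OLD_BRICK are hypothetical placeholders.
VOLNAME="myvol"
OLD_BRICK="oldserver1:/data/brick1"

plan() {
  # Check that all peers are connected before touching any data.
  echo "gluster peer status"
  # On replicated volumes, check for pending heals first.
  echo "gluster volume heal $VOLNAME info"
  # Start the rebalance and watch its progress.
  echo "gluster volume rebalance $VOLNAME start"
  echo "gluster volume rebalance $VOLNAME status"
  # Migrate data off the old brick; commit only after status reports
  # completion. On a replicated volume, all bricks of the replica set
  # have to be listed together.
  echo "gluster volume remove-brick $VOLNAME $OLD_BRICK start"
  echo "gluster volume remove-brick $VOLNAME $OLD_BRICK status"
  echo "gluster volume remove-brick $VOLNAME $OLD_BRICK commit"
}

plan
```

Running the commands one at a time and checking the status output between 
steps leaves room to stop if a "Transport endpoint not connected" error 
appears again.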

My questions:
a) Is it really possible to experience data loss when rebalancing?
b) Does it matter from which server I start the rebalance?
c) If there is another data loss, how could I restore the files 
directly from a brick?

My alternative would be to create a second cluster using the two new 
servers plus a virtual server as arbiter and then migrate the data from 
backups, but I would prefer to keep the existing Gluster setup and 
replicate the data.

I would be interested to hear whether other people have experienced data 
loss while rebalancing.

Thank you for any suggestions.

Kind regards,
Birgit






