[Gluster-users] Is it possible to experience data loss while rebalancing a volume?
DUCARROZ Birgit
birgit.ducarroz at unifr.ch
Fri Jan 3 14:55:04 UTC 2020
Hi list, me again, sorry to bother you again.
Last week, I added 2 new servers to my existing cluster.
Everything worked fine until I began to rebalance some volumes
containing a very large number of files.
Rebalancing failed on some servers, and I experienced substantial data
loss, which was replicated to all servers.
I had no time to analyze the logfiles, but it happened while there was
another "transport endpoint not connected" error.
I had to restore the data from the last backup.
This is the situation:
From time to time I get a "Transport endpoint not connected" error. I
reported this error, with a lot of log files, in an earlier thread
("Transport Endpoint Not Connected When Writing a Lot of Files") started
on October 11, 2019.
We did not find a definitive cause for these errors, but Amar
suggested that I update to Gluster version 7, which I have now done on
the two additional servers.
Currently, these two servers are attached to the existing cluster again.
I would like to try rebalancing again and then remove the old servers,
which cause the transport endpoint errors, but I am hesitant because
people will start working again on Monday and another data loss would be
catastrophic.
My questions:
a) Is it really possible to experience data loss when rebalancing?
b) Does it matter from which server I start the rebalance?
c) If there is another data loss, how could I restore the files
directly from a brick?
My alternative would be to create a second cluster using the two new
servers plus a virtual server as arbiter and then migrate the data from
backups, but I would prefer to use Gluster as it is and replicate the data.
I would be interested to hear whether other people have experienced
data loss while rebalancing.
Thank you for any suggestions.
Kind regards,
Birgit