[Gluster-users] gluster fails under heavy array job load
Alex Chekholko
chekh at stanford.edu
Fri Dec 13 22:00:19 UTC 2013
Hi Harry,
My best guess is that you overloaded your interconnect. Do you have
metrics showing whether and when your network was saturated? Sustained
saturation would cause Gluster clients to time out.
In terms of the USE method (Utilization, Saturation, Errors), I suspect
you went all the way into the "E" state.
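If you don't have those metrics yet, here is a minimal sketch of how
one might sample them on a Linux client or server. It assumes the
standard /proc/net/dev counter layout; the function names and the link
speed you pass in are my own placeholders, not anything Gluster
provides, so treat it as a starting point rather than a monitoring tool.

```python
import time


def parse_net_dev(text):
    """Parse /proc/net/dev text into per-interface counters.

    Returns {iface: (rx_bytes, tx_bytes, rx_errs, tx_errs, rx_drop, tx_drop)}.
    """
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        f = data.split()
        if len(f) < 16:
            continue  # unexpected layout, skip rather than guess
        # Receive columns are f[0..7], transmit columns are f[8..15].
        counters[name.strip()] = (
            int(f[0]), int(f[8]),   # bytes
            int(f[2]), int(f[10]),  # errs
            int(f[3]), int(f[11]),  # drop
        )
    return counters


def link_report(before, after, interval_s, link_bits_per_s):
    """Compare two samples and report utilization, new errors, new drops."""
    report = {}
    for iface, new in after.items():
        old = before.get(iface)
        if old is None:
            continue  # interface appeared between samples
        report[iface] = {
            "rx_util": (new[0] - old[0]) * 8 / interval_s / link_bits_per_s,
            "tx_util": (new[1] - old[1]) * 8 / interval_s / link_bits_per_s,
            "new_errors": (new[2] - old[2]) + (new[3] - old[3]),
            "new_drops": (new[4] - old[4]) + (new[5] - old[5]),
        }
    return report


def measure(interval_s, link_bits_per_s, path="/proc/net/dev"):
    """Take two samples interval_s apart and return the report."""
    with open(path) as fh:
        before = parse_net_dev(fh.read())
    time.sleep(interval_s)
    with open(path) as fh:
        after = parse_net_dev(fh.read())
    return link_report(before, after, interval_s, link_bits_per_s)
```

Calling something like measure(10, 10e9) on a node with a 10 Gbit/s
link (an assumed speed, substitute your own) during one of the array
jobs would show whether utilization sits near 1.0 and whether the
error or drop counters are climbing.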
IME, that is a common pattern for our Lustre/GPFS clients: you get all
kinds of weird error states if you manage to saturate your I/O for an
extended period of time and fill all of the buffers everywhere.
Regards,
Alex
On 12/12/2013 05:03 PM, harry mangalam wrote:
> Short version: Our gluster fs (~340TB) provides scratch space for a
> ~5000core academic compute cluster.
>
> Much of our load is streaming IO, doing a lot of genomics work, and that
> is the load under which we saw this latest failure.
>
--
Alex Chekholko chekh at stanford.edu