[Gluster-users] single problematic node (brick)
Franco Broi
franco.broi at iongeo.com
Wed May 21 03:37:46 UTC 2014
Are you running out of memory? How much memory are the gluster daemons
using?
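
If it helps, here is a minimal sketch for checking that, assuming a Linux node where `ps -eo pid,rss,comm` is available (RSS is reported in KiB) and that the Gluster processes have command names starting with "gluster" (glusterd, glusterfsd, glusterfs). The function names are made up for illustration.

```python
import subprocess


def gluster_memory_kib(ps_output):
    """Map (pid, command) -> resident memory in KiB for Gluster processes.

    Expects the text produced by `ps -eo pid,rss,comm` (header line first).
    """
    usage = {}
    for line in ps_output.strip().splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        pid, rss, comm = parts
        comm = comm.strip()
        # Assumption: all Gluster processes have names starting with "gluster".
        if comm.startswith("gluster"):
            usage[(int(pid), comm)] = int(rss)
    return usage


def report():
    """Print current Gluster memory usage, largest first."""
    out = subprocess.run(
        ["ps", "-eo", "pid,rss,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    usage = gluster_memory_kib(out)
    for (pid, comm), rss in sorted(usage.items(), key=lambda kv: -kv[1]):
        print(f"{pid:>7} {comm:<12} {rss / 1024:10.1f} MiB")
```

Calling `report()` from cron every few minutes on the problematic node and on a healthy one, and comparing the numbers over a week, should show whether memory grows steadily before each failure.
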
On Tue, 2014-05-20 at 11:16 -0700, Doug Schouten wrote:
> Hello,
>
> I have a rather simple Gluster configuration that consists of 85TB
> distributed across six nodes. There is one particular node that seems to
> fail on a roughly weekly basis, and I can't figure out why.
>
> I have attached my Gluster configuration and a recent log file from the
> problematic node. For a user, the symptom when the failure occurs is
> that any attempt to access the Gluster volume from the problematic node
> fails with a "transport endpoint not connected" error.
>
> Restarting the Gluster daemons and remounting the volume on the failed
> node always fixes the problem. But usually by that point some number of
> jobs in our batch queue have already failed because of this issue, and
> it's becoming a headache.
>
> It could be a fuse issue, since I see many related error messages in the
> Gluster log, but I can't disentangle the various errors. The relevant
> line in my /etc/fstab file is
>
> server:global /global glusterfs defaults,direct-io-mode=disable,log-level=WARNING,log-file=/var/log/gluster.log 0 0
>
> Any ideas on the source of the problem? Could it be a hardware (network)
> glitch? The fact that it only happens on one node, which is configured
> identically to the other nodes (with the same hardware), points to
> something like that.
>
> thanks! Doug
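
To help disentangle the errors in the client log, a rough sketch like the one below could tally log lines by severity and count the disconnect messages per day. It assumes each log line starts with a bracketed timestamp followed by a one-letter severity (E, W, I, ...), which may not match your log exactly; the function name and the search string are illustrative only.

```python
import re
from collections import Counter

# Assumed line layout: "[YYYY-MM-DD HH:MM:SS.ffffff] L [source] message",
# where L is a single-letter severity. Lines that do not match are skipped.
LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2}) [^\]]*\]\s+([A-Z])\s")


def summarize_log(lines, needle="transport endpoint"):
    """Return (count per severity, matching-line count per day).

    `needle` is matched case-insensitively anywhere in the line.
    """
    severities = Counter()
    hits_per_day = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        day, severity = match.groups()
        severities[severity] += 1
        if needle in line.lower():
            hits_per_day[day] += 1
    return severities, hits_per_day
```

Passing an open file object for /var/log/gluster.log (the path from the fstab entry above) as `lines` works, since the function only iterates over it. The per-day counts show when the disconnects start, which can then be compared against kernel and network logs for the same period.
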