[Gluster-users] [Gluster-devel] On Gluster resiliency
raghavendra at gluster.com
Mon Jan 2 05:38:46 UTC 2017
That's good to hear. Thanks for posting :)
On Fri, Dec 23, 2016 at 10:10 PM, Ivan Rossi <rouge2507 at gmail.com> wrote:
> The last few days have been tense because an R3 3.8.5 Gluster cluster that I
> built has been plagued by problems.
> The first symptom has been a continuous stream in the client logs of:
> [2016-12-17 15:55:02.047508] E [MSGID: 108009]
> 0-hisap-prod-1-replicate-0: Failed to open
> /home/galaxy/HISAP/java/lib/java/jre1.7.0_51/jre/lib/rt.jar on subvolume
> hisap-prod-1-client-2 [Transport endpoint is not connected]
> followed by very frequent peer disconnections/reconnections and a
> continuous stream of files to be healed on several volumes.
> The problem has been traced back to a flaky X540-T2 10GbE NIC embedded
> in the motherboard of one of the peers, which was unable to keep the
> correct 10 Gbit speed negotiation with the switch.
> The motherboard has been replaced on the peer, and then the volumes
> healed quickly to complete health. All of this happened while the users kept
> running some heavy-duty bioinformatics applications (NGS data
> analysis) on top of Gluster. No user noticed ANYTHING despite a major
> hardware problem and the off-lining of a peer.
> This is a RESILIENT system, in my book.
> Gluster people, despite the constant stream of problems and requests
> for help that you see on the ML and IRC, rest assured that you are
> building a nice piece of software, at least IMHO.
> Keep up the good work and Merry Christmas.
> Ivan Rossi