[Gluster-users] One node goes offline, the other node can't see the replicated volume anymore

Well…OK. The “on my own” comment came from me after a long time and a lot of work trying to figure this out. I just went back and checked: I posted the original question on 7/8 at 8:18 PM. I asked a follow-up question on 7/9 at 12:27 PM, roughly 16 hours later. And then the “on my own” post was on 7/10 at 12:27 AM, around 28 hours after my original question. I was feeling kind of lonely and, well, on my own at the time. All times are USA Central time. I do work weird hours.

I certainly don’t mean to be a troll, and even after all these years I still don’t know what a troll is. All I know is that I need help with this issue, and I appreciate the advice so far. And, frankly, without community help solving or mitigating this problem, I can’t use Gluster for my HA application, because the behavior I observed creates 2 single points of failure instead of eliminating a single point of failure with redundancy. That creates a serious headache, and I would think it is a problem the whole community would want to overcome.

I gave the best info I know how to give, and I did a bunch of work to try to characterize the problem and take my application out of the mix.  If I can do more or provide more info, just tell me what I can provide while I have everything sitting here in a testbed.
My challenge now is that this project was supposed to be delivered several days ago. I’ll be out of town tomorrow, so I may not be able to get back to it until Saturday. I just don’t feel good about delivering this system until I can test it some more and understand what’s going on with the issue I stumbled upon.

