[Gluster-devel] [ovirt-users] Can we debug some truths/myths/facts about hosted-engine and gluster?
Vijay Bellur
vbellur at redhat.com
Wed Jul 23 22:07:27 UTC 2014
On 07/22/2014 07:21 AM, Itamar Heim wrote:
> On 07/22/2014 04:28 AM, Vijay Bellur wrote:
>> On 07/21/2014 05:09 AM, Pranith Kumar Karampuri wrote:
>>>
>>> On 07/21/2014 02:08 PM, Jiri Moskovcak wrote:
>>>> On 07/19/2014 08:58 AM, Pranith Kumar Karampuri wrote:
>>>>>
>>>>> On 07/19/2014 11:25 AM, Andrew Lau wrote:
>>>>>>
>>>>>>
>>>>>> On Sat, Jul 19, 2014 at 12:03 AM, Pranith Kumar Karampuri
>>>>>> <pkarampu at redhat.com <mailto:pkarampu at redhat.com>> wrote:
>>>>>>
>>>>>>
>>>>>> On 07/18/2014 05:43 PM, Andrew Lau wrote:
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Jul 18, 2014 at 10:06 PM, Vijay Bellur
>>>>>>> <vbellur at redhat.com <mailto:vbellur at redhat.com>> wrote:
>>>>>>>
>>>>>>> [Adding gluster-devel]
>>>>>>>
>>>>>>>
>>>>>>> On 07/18/2014 05:20 PM, Andrew Lau wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> As most of you have gathered from previous messages, hosted
>>>>>>> engine won't work on gluster. A quote from BZ1097639:
>>>>>>>
>>>>>>> "Using hosted engine with Gluster backed storage is
>>>>>>> currently something
>>>>>>> we really warn against.
>>>>>>>
>>>>>>>
>>>>>>> I think this bug should be closed or re-targeted at
>>>>>>> documentation, because there is nothing we can do here.
>>>>>>> Hosted engine assumes that all writes are atomic and
>>>>>>> (immediately) available for all hosts in the cluster.
>>>>>>> Gluster violates those assumptions.
>>>>>>> "
>>>>>>>
>>>>>>> I tried going through BZ1097639 but could not find much
>>>>>>> detail with respect to gluster there.
>>>>>>>
>>>>>>> A few questions around the problem:
>>>>>>>
>>>>>>> 1. Can somebody please explain in detail the scenario that
>>>>>>> causes the problem?
>>>>>>>
>>>>>>> 2. Is hosted engine performing synchronous writes to ensure
>>>>>>> that writes are durable?
>>>>>>>
>>>>>>> Also, any documentation that details the hosted engine
>>>>>>> architecture would help in enhancing our understanding of its
>>>>>>> interactions with gluster.
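For illustration, a minimal sketch (not the actual hosted-engine broker
code; durable_write is a hypothetical name) of what a synchronous,
durable write from Python would look like, in the sense question 2 asks
about:

    import os

    def durable_write(path, data):
        # O_SYNC makes each write reach stable storage before the call
        # returns; the explicit fsync() also flushes file metadata.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o644)
        try:
            os.write(fd, data)
            os.fsync(fd)
        finally:
            os.close(fd)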
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Now my question: does this theory rule out a scenario where a
>>>>>>> gluster replicated volume is mounted as a glusterfs filesystem
>>>>>>> and then re-exported as a native kernel NFS share for the
>>>>>>> hosted-engine to consume? It would then be possible to add ctdb
>>>>>>> on top to provide a last-resort failover solution. I have tried
>>>>>>> this myself and suggested it to two people who are running a
>>>>>>> similar setup; they are now using the native kernel NFS server
>>>>>>> for hosted-engine and haven't reported as many issues. Could
>>>>>>> anyone validate my theory on this?
>>>>>>>
>>>>>>>
>>>>>>> If we obtain more details on the use case and gluster logs from
>>>>>>> the failed scenarios, we should be able to understand the
>>>>>>> problem better. That could be the first step in validating your
>>>>>>> theory or evolving further recommendations :).
>>>>>>>
>>>>>>>
>>>>>>> I'm not sure how useful this is, but Jiri Moskovcak tracked
>>>>>>> this down in an off-list message.
>>>>>>>
>>>>>>> Message Quote:
>>>>>>>
>>>>>>> ==
>>>>>>>
>>>>>>> We were able to track it down to this (thanks Andrew for
>>>>>>> providing the testing setup):
>>>>>>>
>>>>>>> -b686-4363-bb7e-dba99e5789b6/ha_agent service_type=hosted-engine'
>>>>>>> Traceback (most recent call last):
>>>>>>>   File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 165, in handle
>>>>>>>     response = "success " + self._dispatch(data)
>>>>>>>   File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 261, in _dispatch
>>>>>>>     .get_all_stats_for_service_type(**options)
>>>>>>>   File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 41, in get_all_stats_for_service_type
>>>>>>>     d = self.get_raw_stats_for_service_type(storage_dir, service_type)
>>>>>>>   File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 74, in get_raw_stats_for_service_type
>>>>>>>     f = os.open(path, direct_flag | os.O_RDONLY)
>>>>>>> OSError: [Errno 116] Stale file handle: '/rhev/data-center/mnt/localhost:_mnt_hosted-engine/c898fd2a-b686-4363-bb7e-dba99e5789b6/ha_agent/hosted-engine.metadata'
>>>>>>>
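For reference, the failing call can be reproduced outside the broker
with a few lines; a minimal sketch, with the path and the
direct_flag | os.O_RDONLY combination taken from the traceback above,
and O_DIRECT guarded because it is Linux-specific:

    import errno
    import os

    # Path as reported in the traceback above.
    METADATA = ("/rhev/data-center/mnt/localhost:_mnt_hosted-engine/"
                "c898fd2a-b686-4363-bb7e-dba99e5789b6/ha_agent/"
                "hosted-engine.metadata")

    direct_flag = getattr(os, "O_DIRECT", 0)

    try:
        fd = os.open(METADATA, direct_flag | os.O_RDONLY)
        os.close(fd)
    except OSError as e:
        if e.errno == errno.ESTALE:  # errno 116 on Linux
            print("stale file handle: %s" % e)
        else:
            raise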
>>>>>> Andrew/Jiri,
>>>>>> Would it be possible to post the gluster logs of both the
>>>>>> mount and the bricks on the BZ? I can take a look at them. If I
>>>>>> can't gather anything from them, I will probably ask for your
>>>>>> help in re-creating the issue.
>>>>>>
>>>>>> Pranith
>>>>>>
>>>>>>
>>>>>> Unfortunately, I don't have the logs for that setup anymore. I'll
>>>>>> try to replicate it when I get a chance. If I understand the comment
>>>>>> from the BZ, I don't think it's a gluster bug per se, more just how
>>>>>> gluster does its replication.
>>>>> Hi Andrew,
>>>>> Thanks for that. I couldn't come to any conclusions
>>>>> because no
>>>>> logs were available. It is unlikely that self-heal is involved because
>>>>> there were no bricks going down/up according to the bug description.
>>>>>
>>>>
>>>> Hi,
>>>> I've never had such a setup; I guessed it was a problem with gluster
>>>> based on the "OSError: [Errno 116] Stale file handle:" error, which
>>>> happens when a file opened by an application on the client gets
>>>> removed on the server. I'm pretty sure we (hosted-engine) don't remove
>>>> that file, so I think it's some gluster magic moving the data around...
>>> Hi,
>>> Unless bricks go down/up or new bricks are added, gluster does not move
>>> data around, and even then only when a file operation reaches gluster.
>>> So I am still not sure why this happened.
>>>
>>
>> Does hosted engine perform deletion & re-creation of file
>> <uuid>/ha_agent/hosted-engine.metadata in some operational sequence? In
>> such a case, if this file is accessed by a stale gfid, ESTALE is
>> possible.
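If delete-and-recreate is indeed the sequence, one way to illustrate the
effect on the reading side is a hypothetical retry helper (not part of
ovirt-hosted-engine-ha; open_retrying_estale is an invented name) that
re-resolves the path when the cached handle has gone stale:

    import errno
    import os

    def open_retrying_estale(path, flags, attempts=3):
        # Hypothetical helper: if another host deleted and re-created the
        # file, the first lookup may hit a stale gfid/handle; re-opening
        # by path picks up the newly created file.
        last_err = None
        for _ in range(attempts):
            try:
                return os.open(path, flags)
            except OSError as err:
                if err.errno != errno.ESTALE:
                    raise
                last_err = err
        raise last_err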
>>
>> I see references to 2 hosted engines being operational in the bug
>> report, which makes me wonder if this is a likely scenario.
>>
>> I am also curious to understand why NFS was chosen as the access method
>> to the gluster volume. Isn't FUSE-based access a possibility here?
>>
>
> It is, but it wasn't enabled in the setup due to multiple reports at the
> time about gluster robustness with sanlock.
> IIUC, with replica 3 we should be in a much better place and can
> re-enable it (and probably also validate that it is replica 3?).
>
Yes, replica 3 would provide better protection against split-brain.
Thanks,
Vijay