[Gluster-devel] [RFC] What if the client fuse process crashes?

Changwei Ge chge at linux.alibaba.com
Tue Aug 6 07:14:46 UTC 2019


On 2019/8/6 2:57 PM, Ravishankar N wrote:
>
> On 06/08/19 11:44 AM, Changwei Ge wrote:
>> Hi Ravishankar,
>>
>>
>> Thanks for sharing; it's very useful to me.
>>
>> I have been setting up a glusterfs storage cluster recently, and the 
>> umount/mount recovery process has been bothering me.
> Hi Changwei,
> Why do you need to do frequent remounts? If your gluster fuse 
> client is crashing frequently, that should be investigated and fixed. 
> If you have a reproducer, please raise a bug with all the details like 
> the glusterfs version, core files and log files.


Hi Ravi,

Actually, the glusterfs client fuse process runs well in my environment, but 
high availability and fault tolerance are also big concerns for me.

So I killed the fuse process to see what would happen. AFAIK, userspace 
processes can be killed, or crash, for reasons that are not under our 
control. :-(
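
A minimal sketch of that experiment, assuming a fuse mount at /mnt/glusterfs 
(just a placeholder path), could look roughly like this:

    import errno, os, signal, subprocess

    MOUNT = "/mnt/glusterfs"   # placeholder mount point

    # Find the glusterfs fuse client serving this mount and kill it
    # (assumes a single client whose command line mentions the mount path).
    out = subprocess.run(["pgrep", "-f", "glusterfs.*" + MOUNT],
                         capture_output=True, text=True)
    for pid in out.stdout.split():
        os.kill(int(pid), signal.SIGKILL)

    # Any later I/O on the mount fails until it is unmounted and remounted.
    try:
        os.listdir(MOUNT)
    except OSError as e:
        # typically ENOTCONN, i.e. "Transport endpoint is not connected"
        print("I/O failed:", errno.errorcode.get(e.errno), e)

After that, only unmounting and mounting the volume again restores access for 
applications, as discussed below.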

Another scenario is *software upgrade*, since we have to upgrade the 
glusterfs client version to gain new features and bug fixes. It would be 
friendly to applications if the upgrade were transparent.


Thanks,

Changwei


> Regards,
> Ravi
>>
>>
>> I happened to find some patches[1] on the internet aiming to address 
>> such a problem, but I have no idea why they never got merged into the 
>> glusterfs mainline.
>>
>> Do you know why?
>>
>>
>> Thanks,
>>
>> Changwei
>>
>>
>> [1]:
>>
>> https://review.gluster.org/#/c/glusterfs/+/16843/
>>
>> https://github.com/gluster/glusterfs/issues/242
>>
>>
>> On 2019/8/6 1:12 PM, Ravishankar N wrote:
>>> On 05/08/19 3:31 PM, Changwei Ge wrote:
>>>> Hi list,
>>>>
>>>> If somehow the glusterfs client fuse process dies, all subsequent file 
>>>> operations fail with a 'no connection' error.
>>>>
>>>> I am curious whether the only way to recover is to umount and mount again?
>>> Yes, this is pretty much the case with all fuse-based file systems. 
>>> You can use -o auto_unmount (https://review.gluster.org/#/c/17230/) 
>>> to clean up automatically without having to unmount manually.
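
If I understand the option correctly, mounting with it would look roughly 
like the sketch below (volume name and mount point are just placeholders):

    import subprocess

    # Mount a gluster volume with -o auto_unmount so a dead fuse client does
    # not leave a stale mount point behind; a remount is still needed to
    # restore access. Requires root and mount.glusterfs installed.
    subprocess.run(["mount", "-t", "glusterfs", "-o", "auto_unmount",
                    "server1:/gv0", "/mnt/glusterfs"], check=True)

That only cleans up the stale mount point automatically, though; applications 
still lose access until the volume is mounted again.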
>>>>
>>>> If so, that means all processes working on top of glusterfs have to 
>>>> close their files, which is sometimes hard to accept.
>>>
>>> There is 
>>> https://research.cs.wisc.edu/wind/Publications/refuse-eurosys11.html, 
>>> which claims to provide a framework for transparent failovers. I 
>>> can't find any publicly available code though.
>>>
>>> Regards,
>>> Ravi
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Changwei
>>>>
>>>>

