[Gluster-users] blocking process on FUSE mount in directory which is using quota

Raghavendra Gowdappa rgowdapp at redhat.com
Wed Aug 15 03:21:43 UTC 2018


On Tue, Aug 14, 2018 at 7:23 PM, mabi <mabi at protonmail.ch> wrote:

> Bad news: the blocked process happened again, this time with a directory of
> another user which is NOT over its quota but which also has
> quota enabled.
>
> The symptoms on the Linux side are the same:
>
> [Tue Aug 14 15:30:33 2018] INFO: task php5-fpm:14773 blocked for more than 120 seconds.
> [Tue Aug 14 15:30:33 2018]       Not tainted 3.16.0-4-amd64 #1
> [Tue Aug 14 15:30:33 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [Tue Aug 14 15:30:33 2018] php5-fpm        D ffff8801fea13200     0 14773    729 0x00000000
> [Tue Aug 14 15:30:33 2018]  ffff880100bbe0d0 0000000000000282 0000000000013200 ffff880129bcffd8
> [Tue Aug 14 15:30:33 2018]  0000000000013200 ffff880100bbe0d0 ffff880153ed0d68 ffff880129bcfee0
> [Tue Aug 14 15:30:33 2018]  ffff880153ed0d6c ffff880100bbe0d0 00000000ffffffff ffff880153ed0d70
> [Tue Aug 14 15:30:33 2018] Call Trace:
> [Tue Aug 14 15:30:33 2018]  [<ffffffff81522935>] ? schedule_preempt_disabled+0x25/0x70
> [Tue Aug 14 15:30:33 2018]  [<ffffffff815243d3>] ? __mutex_lock_slowpath+0xd3/0x1d0
> [Tue Aug 14 15:30:33 2018]  [<ffffffff811d7a93>] ? write_inode_now+0x93/0xc0
> [Tue Aug 14 15:30:33 2018]  [<ffffffff815244eb>] ? mutex_lock+0x1b/0x2a
> [Tue Aug 14 15:30:33 2018]  [<ffffffffa00df69f>] ? fuse_flush+0x8f/0x1e0 [fuse]
> [Tue Aug 14 15:30:33 2018]  [<ffffffff811af393>] ? vfs_read+0x93/0x170
> [Tue Aug 14 15:30:33 2018]  [<ffffffff811ad14a>] ? filp_close+0x2a/0x70
> [Tue Aug 14 15:30:33 2018]  [<ffffffff811ad1af>] ? SyS_close+0x1f/0x50
> [Tue Aug 14 15:30:33 2018]  [<ffffffff81525d0d>] ? system_call_fast_compare_end+0x10/0x15
>
> and if I check this process it is in state "D", i.e. uninterruptible
> sleep.
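>
> (I checked the state roughly like this, in case it helps; 14773 is the PID
> from the hung-task message above:
>
>     ps -o pid,stat,wchan:30,cmd -p 14773
>     grep State /proc/14773/status
>
> Both report "D", i.e. the process is stuck inside the kernel.)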
>
> Now I also managed to take a statedump file as recommended, but under the
> "[io-cache.inode]" sections of its content I see "path=" entries which I
> would need to remove as they contain filenames, for privacy reasons. Can I
> remove every "path=" line and still send you the statedump file for analysis?
>

Yes, removing the "path=" lines is fine; the statedumps will still be useful
for debugging the issue.
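
For example, a quick way to produce a sanitized copy before sending could be
something like the following (adjust the dump filename to the one actually
generated on your client):

    # drop every "path=" line from the statedump, keep the rest intact
    sed '/^path=/d' /var/run/gluster/glusterdump.<pid>.dump.<timestamp> \
        > statedump-sanitized.txt

The rest of the dump (counters, lock and inode state) is what we need.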


> Thank you.
>
>
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On August 14, 2018 10:48 AM, Nithya Balachandran <nbalacha at redhat.com>
> wrote:
>
> Thanks for letting us know. Sanoj, can you take a look at this?
>
> Thanks.
> Nithya
>
> On 14 August 2018 at 13:58, mabi <mabi at protonmail.ch> wrote:
>
>> As you mentioned, after creating the /var/run/gluster directory I got a
>> statedump file in there.
>>
>> As a workaround I have now removed the quota for this specific directory.
>> As it is a production server, I currently cannot "play" with it by adding
>> the quota back and reproducing the problem, since that requires me to
>> reboot the server and causes downtime...
>>
>> But I can confirm that by removing the quota from that directory the
>> problem is gone (no more blocked processes such as "ls"), so there must be
>> an issue or bug in the quota part of Gluster.
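>>
>> (For reference, I removed it with the standard quota command, roughly:
>>
>>     gluster volume quota myvol-private remove /directory
>>
>> where /directory is the path shown in the quota list output further below.)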
>>
>>
>>
>> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>> On August 10, 2018 4:19 PM, Nithya Balachandran <nbalacha at redhat.com>
>> wrote:
>>
>>
>>
>> On 9 August 2018 at 19:54, mabi <mabi at protonmail.ch> wrote:
>>
>>> Thanks for the documentation. On my client using the FUSE mount I found
>>> the PID using ps (output below):
>>>
>>> root       456     1  4 14:17 ?        00:05:15 /usr/sbin/glusterfs --volfile-server=gfs1a --volfile-id=myvol-private /mnt/myvol-private
>>>
>>> Then I ran the following command
>>>
>>> sudo kill -USR1 456
>>>
>>> but now I can't find where the files are stored. Are these supposed to
>>> be stored on the client directly? I checked /var/run/gluster and
>>> /var/log/gluster but could not see anything and /var/log/gluster does not
>>> even exist on the client.
>>>
>>
>> They are usually created in /var/run/gluster. You will need to create the
>> directory on the client if it does not exist.
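>>
>> Something along these lines on the client should do it (456 being the
>> glusterfs PID you found with ps):
>>
>>     mkdir -p /var/run/gluster
>>     kill -USR1 456
>>     ls -l /var/run/gluster/    # a glusterdump.<pid>.dump.<timestamp> file should show up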
>>
>>
>>
>> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>>> On August 9, 2018 3:59 PM, Raghavendra Gowdappa <rgowdapp at redhat.com>
>>> wrote:
>>>
>>>
>>>
>>> On Thu, Aug 9, 2018 at 6:47 PM, mabi <mabi at protonmail.ch> wrote:
>>>
>>>> Hi Nithya,
>>>>
>>>> Thanks for the fast answer. Here the additional info:
>>>>
>>>> 1. gluster volume info
>>>>
>>>> Volume Name: myvol-private
>>>> Type: Replicate
>>>> Volume ID: e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 1 x (2 + 1) = 3
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: gfs1a:/data/myvol-private/brick
>>>> Brick2: gfs1b:/data/myvol-private/brick
>>>> Brick3: gfs1c:/srv/glusterfs/myvol-private/brick (arbiter)
>>>> Options Reconfigured:
>>>> features.default-soft-limit: 95%
>>>> transport.address-family: inet
>>>> features.quota-deem-statfs: on
>>>> features.inode-quota: on
>>>> features.quota: on
>>>> nfs.disable: on
>>>> performance.readdir-ahead: on
>>>> client.event-threads: 4
>>>> server.event-threads: 4
>>>> auth.allow: 192.168.100.92
>>>>
>>>>
>>>>
>>>> 2. Sorry, I have no clue how to take a "statedump" of a process on
>>>> Linux. Which command should I use for that? And which process would you
>>>> like, the blocked process (for example "ls")?
>>>>
>>>
>>> Statedumps are Gluster-specific. Please refer to
>>> https://docs.gluster.org/en/v3/Troubleshooting/statedump/ for
>>> instructions.
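>>>
>>> For the brick processes you can also trigger statedumps from the CLI, e.g.:
>>>
>>>     gluster volume statedump myvol-private
>>>
>>> For the FUSE client, sending SIGUSR1 to the glusterfs process (as described
>>> on that page) writes the dump on the client itself.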
>>>
>>>
>>>>
>>>> Regards,
>>>> M.
>>>>
>>>> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>>>> On August 9, 2018 3:10 PM, Nithya Balachandran <nbalacha at redhat.com>
>>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Please provide the following:
>>>>
>>>>    1. gluster volume info
>>>>    2. statedump of the fuse process when it hangs
>>>>
>>>>
>>>> Thanks,
>>>> Nithya
>>>>
>>>> On 9 August 2018 at 18:24, mabi <mabi at protonmail.ch> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I recently upgraded my GlusterFS replica 2+1 (arbiter) setup to version
>>>>> 3.12.12 and now I see a weird behaviour on my client (using a FUSE mount):
>>>>> processes (PHP 5.6 FPM) trying to access a specific directory simply block.
>>>>> I can't kill such a process either, not even with kill -9. I need to reboot
>>>>> the machine in order to get rid of these blocked processes.
>>>>>
>>>>> This directory has one particularity compared to the other directories:
>>>>> it has reached its quota soft-limit, as you can see in the output of
>>>>> gluster volume quota list:
>>>>>
>>>>>                   Path                   Hard-limit  Soft-limit      Used  Available  Soft-limit exceeded? Hard-limit exceeded?
>>>>> -------------------------------------------------------------------------------------------------------------------------------
>>>>> /directory                                  100.0GB  80%(80.0GB)    90.5GB      9.5GB                  Yes                   No
>>>>>
>>>>> That does not mean that it is the quota's fault, but it might be a hint
>>>>> on where to start looking... And by the way, can someone explain to me
>>>>> what the soft-limit does? Or does it not do anything special?
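>>>>>
>>>>> (My rough understanding from the docs is that the soft-limit is just the
>>>>> optional percentage given when setting the limit, e.g. something like
>>>>>
>>>>>     gluster volume quota myvol-private limit-usage /directory 100GB 80
>>>>>
>>>>> and that crossing it only produces alert messages in the logs, while only
>>>>> the hard limit actually blocks writes, but please correct me if that is
>>>>> wrong.)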
>>>>>
>>>>> Here is the Linux stack of a blocked process on that directory, which
>>>>> happened with a simple "ls -la":
>>>>>
>>>>> [Thu Aug  9 14:21:07 2018] INFO: task ls:2272 blocked for more than 120 seconds.
>>>>> [Thu Aug  9 14:21:07 2018]       Not tainted 3.16.0-4-amd64 #1
>>>>> [Thu Aug  9 14:21:07 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> [Thu Aug  9 14:21:07 2018] ls              D ffff88017ef93200     0  2272   2268 0x00000004
>>>>> [Thu Aug  9 14:21:07 2018]  ffff88017653f490 0000000000000286 0000000000013200 ffff880174d7bfd8
>>>>> [Thu Aug  9 14:21:07 2018]  0000000000013200 ffff88017653f490 ffff8800eeb3d5f0 ffff8800fefac800
>>>>> [Thu Aug  9 14:21:07 2018]  ffff880174d7bbe0 ffff8800eeb3d6d0 ffff8800fefac800 ffff8800ffe1e1c0
>>>>> [Thu Aug  9 14:21:07 2018] Call Trace:
>>>>> [Thu Aug  9 14:21:07 2018]  [<ffffffffa00dc50d>] ? __fuse_request_send+0xbd/0x270 [fuse]
>>>>> [Thu Aug  9 14:21:07 2018]  [<ffffffff810abce0>] ? prepare_to_wait_event+0xf0/0xf0
>>>>> [Thu Aug  9 14:21:07 2018]  [<ffffffffa00e0791>] ? fuse_dentry_revalidate+0x181/0x300 [fuse]
>>>>> [Thu Aug  9 14:21:07 2018]  [<ffffffff811b944e>] ? lookup_fast+0x25e/0x2b0
>>>>> [Thu Aug  9 14:21:07 2018]  [<ffffffff811bacc5>] ? path_lookupat+0x155/0x780
>>>>> [Thu Aug  9 14:21:07 2018]  [<ffffffff81195715>] ? kmem_cache_alloc+0x75/0x480
>>>>> [Thu Aug  9 14:21:07 2018]  [<ffffffffa00dfca9>] ? fuse_getxattr+0xe9/0x150 [fuse]
>>>>> [Thu Aug  9 14:21:07 2018]  [<ffffffff811bb316>] ? filename_lookup+0x26/0xc0
>>>>> [Thu Aug  9 14:21:07 2018]  [<ffffffff811bf594>] ? user_path_at_empty+0x54/0x90
>>>>> [Thu Aug  9 14:21:07 2018]  [<ffffffff81193e08>] ? kmem_cache_free+0xd8/0x210
>>>>> [Thu Aug  9 14:21:07 2018]  [<ffffffff811bf59f>] ? user_path_at_empty+0x5f/0x90
>>>>> [Thu Aug  9 14:21:07 2018]  [<ffffffff811b3d46>] ? vfs_fstatat+0x46/0x90
>>>>> [Thu Aug  9 14:21:07 2018]  [<ffffffff811b421d>] ? SYSC_newlstat+0x1d/0x40
>>>>> [Thu Aug  9 14:21:07 2018]  [<ffffffff811d34b8>] ? SyS_lgetxattr+0x58/0x80
>>>>> [Thu Aug  9 14:21:07 2018]  [<ffffffff81525d0d>] ? system_call_fast_compare_end+0x10/0x15
>>>>>
>>>>>
>>>>> My 3 Gluster nodes all run Debian 9 and my client runs Debian 8.
>>>>>
>>>>> Let me know if you need more information.
>>>>>
>>>>> Best regards,
>>>>> Mabi
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>