[Gluster-users] glusterfsd Call Trace Messages

Raghavendra Bhat rabhat at redhat.com
Thu Feb 4 14:51:13 UTC 2016


It depends upon the memory available and the workload. In this case, the
files being copied are huge, so a lot of I/O is needed to copy each file
completely.
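
The circular wait described in the quoted explanation further down can be
sketched as a small wait-for graph. This is only an illustrative model, not
something taken from the GlusterFS or kernel sources: the node names and the
find_cycle helper are hypothetical, and the "remote client" graph assumes
that the server's memory pressure no longer depends on the NFS client's
cache once the client runs on another machine.

```python
# Illustrative model only: every name below is hypothetical.
# Each key is blocked until the component it maps to makes progress.
LOCAL_MOUNT = {
    "nfs_client": "nfs_server",     # client waits for the reply to its write
    "nfs_server": "brick",          # server waits for the brick's reply
    "brick": "kernel_memory",       # write() needs the kernel to find free memory
    "kernel_memory": "nfs_client",  # memory is held by the client's cache
}

# Assumption: with the client on a different node, freeing memory on the
# server does not depend on the NFS client, so the last edge is absent.
REMOTE_MOUNT = {
    "nfs_client": "nfs_server",
    "nfs_server": "brick",
    "brick": "kernel_memory",
}


def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle as a list, or None."""
    path = []
    position = {}  # node -> index in path where it was first seen
    node = start
    while node is not None:
        if node in position:
            # We came back to a node already on the path: that is the cycle.
            return path[position[node]:] + [node]
        position[node] = len(path)
        path.append(node)
        node = waits_for.get(node)
    return None


if __name__ == "__main__":
    cycle = find_cycle(LOCAL_MOUNT, "nfs_client")
    print("local mount: ", " -> ".join(cycle))
    print("remote mount:", find_cycle(REMOTE_MOUNT, "nfs_client"))
```

In this model the local mount closes the loop back to the NFS client, while
the remote mount has no cycle, which matches the test that ran without a
trace from a separate client.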

Can you please give the output of "gluster volume info <volume name>"?

Regards,
Raghavendra

On Wed, Feb 3, 2016 at 4:54 PM, Taste-Of-IT <kontakt at taste-of-it.de> wrote:

> On 2016-02-03 21:24, Raghavendra Bhat wrote:
>
>> I think this is what is happening. Someone please correct me if I am
>> wrong.
>>
>> I think this is happening because the NFS client, the NFS server and
>> the bricks are all on the same machine. When the large write comes
>> in, the NFS client sends the request to the NFS server, and the NFS
>> server sends it to the brick. The brick process tries to write it by
>> making the write system call, and the call enters the kernel. The
>> kernel might not find memory available for performing the operation
>> and thus wants to free some memory. The NFS client does heavy
>> caching and might have saved many things in its memory, so it is the
>> one that has to free some memory. But the NFS client is stuck in the
>> write operation: it is still waiting for a response from the server,
>> so it will not be able to free the memory until it gets a response
>> from the NFS server (which in turn is waiting for a response from the
>> brick) for the write it sent. And the brick cannot get a response
>> from the kernel until the kernel is able to get some memory for the
>> operation and perform the write.
>>
>> Thus it is stuck in this deadlock. That's why you see your setup
>> blocked.
>>
>> Can you please mount your volume via NFS on a node other than the
>> gluster servers, and see if the issue happens again?
>>
>> Regards,
>> Raghavendra
>>
>> On Wed, Feb 3, 2016 at 2:32 PM, Taste-Of-IT <kontakt at taste-of-it.de>
>> wrote:
>>
>> On 2016-02-03 20:09, Raghavendra Bhat wrote:
>>>
>>> Hi,
>>>
>>> Is your NFS client mounted on one of the gluster servers?
>>>
>>> Regards,
>>> Raghavendra
>>>
>>> On Wed, Feb 3, 2016 at 10:08 AM, Taste-Of-IT
>>> <kontakt at taste-of-it.de>
>>> wrote:
>>>
>>> Hello,
>>>
>>> I hope some expert can help. I have a distributed GlusterFS volume
>>> with 2 bricks, version 3.7.6, on Debian. The volume is shared via
>>> NFS. If I copy large files (>30GB) with Midnight Commander, I get
>>> the following messages. I replaced the SATA cable and checked the
>>> memory, but I didn't find an error. The SMART values on all disks
>>> seem OK. After 30-40 minutes I can copy again. Any idea?
>>>
>>> Feb  3 12:46:31 gluster01 kernel: [11186.588367] [sched_delayed]
>>> sched: RT throttling activated
>>> Feb  3 12:56:09 gluster01 kernel: [11764.932749] glusterfsd
>>>   D ffff88040ca6d788     0  1150      1 0x00000000
>>> Feb  3 12:56:09 gluster01 kernel: [11764.932759]
>>> ffff88040ca6d330 0000000000000082 0000000000012f00 ffff88040ad1bfd8
>>> Feb  3 12:56:09 gluster01 kernel: [11764.932767]
>>> 0000000000012f00 ffff88040ca6d330 ffff88040ca6d330 ffff88040ad1be88
>>> Feb  3 12:56:09 gluster01 kernel: [11764.932773]
>>> ffff88040e18d4b8 ffff88040e18d4a0 ffffffff00000000 ffff88040e18d4a8
>>> Feb  3 12:56:09 gluster01 kernel: [11764.932780] Call Trace:
>>> Feb  3 12:56:09 gluster01 kernel: [11764.932796]
>>> [<ffffffff81512cd5>] ? rwsem_down_write_failed+0x1d5/0x320
>>> Feb  3 12:56:09 gluster01 kernel: [11764.932807]
>>> [<ffffffff812b7d13>] ? call_rwsem_down_write_failed+0x13/0x20
>>> Feb  3 12:56:09 gluster01 kernel: [11764.932816]
>>> [<ffffffff812325b0>] ? proc_keys_show+0x3f0/0x3f0
>>> Feb  3 12:56:09 gluster01 kernel: [11764.932823]
>>> [<ffffffff81512649>] ? down_write+0x29/0x40
>>> Feb  3 12:56:09 gluster01 kernel: [11764.932830]
>>> [<ffffffff811592bc>] ? vm_mmap_pgoff+0x6c/0xc0
>>> Feb  3 12:56:09 gluster01 kernel: [11764.932838]
>>> [<ffffffff8116ea4e>] ? SyS_mmap_pgoff+0x10e/0x250
>>> Feb  3 12:56:09 gluster01 kernel: [11764.932844]
>>> [<ffffffff811a969a>] ? SyS_readv+0x6a/0xd0
>>> Feb  3 12:56:09 gluster01 kernel: [11764.932853]
>>> [<ffffffff81513ccd>] ? system_call_fast_compare_end+0x10/0x15
>>> Feb  3 12:58:09 gluster01 kernel: [11884.979935] glusterfsd
>>>   D ffff88040ca6d788     0  1150      1 0x00000000
>>> Feb  3 12:58:09 gluster01 kernel: [11884.979945]
>>> ffff88040ca6d330 0000000000000082 0000000000012f00 ffff88040ad1bfd8
>>> Feb  3 12:58:09 gluster01 kernel: [11884.979952]
>>> 0000000000012f00 ffff88040ca6d330 ffff88040ca6d330 ffff88040ad1be88
>>> Feb  3 12:58:09 gluster01 kernel: [11884.979959]
>>> ffff88040e18d4b8 ffff88040e18d4a0 ffffffff00000000 ffff88040e18d4a8
>>> Feb  3 12:58:09 gluster01 kernel: [11884.979966] Call Trace:
>>> Feb  3 12:58:09 gluster01 kernel: [11884.979982]
>>> [<ffffffff81512cd5>] ? rwsem_down_write_failed+0x1d5/0x320
>>> Feb  3 12:58:09 gluster01 kernel: [11884.979993]
>>> [<ffffffff812b7d13>] ? call_rwsem_down_write_failed+0x13/0x20
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980001]
>>> [<ffffffff812325b0>] ? proc_keys_show+0x3f0/0x3f0
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980008]
>>> [<ffffffff81512649>] ? down_write+0x29/0x40
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980015]
>>> [<ffffffff811592bc>] ? vm_mmap_pgoff+0x6c/0xc0
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980023]
>>> [<ffffffff8116ea4e>] ? SyS_mmap_pgoff+0x10e/0x250
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980030]
>>> [<ffffffff811a969a>] ? SyS_readv+0x6a/0xd0
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980038]
>>> [<ffffffff81513ccd>] ? system_call_fast_compare_end+0x10/0x15
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980351] mc
>>>     D ffff88040e6d8fb8     0  5119   1447 0x00000000
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980358]
>>> ffff88040e6d8b60 0000000000000082 0000000000012f00 ffff88040d5dbfd8
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980365]
>>> 0000000000012f00 ffff88040e6d8b60 ffff88041ec937b0 ffff88041efcc9e8
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980371]
>>> 0000000000000002 ffffffff8113ce00 ffff88040d5dbcb0 ffff88040d5dbd98
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980377] Call Trace:
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980385]
>>> [<ffffffff8113ce00>] ? wait_on_page_read+0x60/0x60
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980392]
>>> [<ffffffff81510759>] ? io_schedule+0x99/0x120
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980399]
>>> [<ffffffff8113ce0a>] ? sleep_on_page+0xa/0x10
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980405]
>>> [<ffffffff81510adc>] ? __wait_on_bit+0x5c/0x90
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980412]
>>> [<ffffffff8113cbff>] ? wait_on_page_bit+0x7f/0x90
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980420]
>>> [<ffffffff810a7bd0>] ? autoremove_wake_function+0x30/0x30
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980426]
>>> [<ffffffff8114a17d>] ? pagevec_lookup_tag+0x1d/0x30
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980433]
>>> [<ffffffff8113cce0>] ? filemap_fdatawait_range+0xd0/0x160
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980442]
>>> [<ffffffff8113e7ca>] ? filemap_write_and_wait_range+0x3a/0x60
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980461]
>>> [<ffffffffa072363f>] ? nfs_file_fsync+0x7f/0x100 [nfs]
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980476]
>>> [<ffffffffa0723a2a>] ? nfs_file_write+0xda/0x1a0 [nfs]
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980484]
>>> [<ffffffff811a7e24>] ? new_sync_write+0x74/0xa0
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980492]
>>> [<ffffffff811a8562>] ? vfs_write+0xb2/0x1f0
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980500]
>>> [<ffffffff811a842d>] ? vfs_read+0xed/0x170
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980505]
>>> [<ffffffff811a90a2>] ? SyS_write+0x42/0xa0
>>> Feb  3 12:58:09 gluster01 kernel: [11884.980513]
>>> [<ffffffff81513ccd>] ? system_call_fast_compare_end+0x10/0x15
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>
>>  Hi Raghavendra,
>>  yes, in this case I have to mount on one of the gluster servers, but
>> it doesn't matter on which one I mount; it is only a question of time
>> until the trace appears.
>>  Taste
>>
>
> Hi,
> sounds logical. Is that normal behavior? I tested it from a client and
> it looks fine, without a trace. I tried 4 files of about 30GB each. The
> only thing I noticed is that the first file was copied with nearly full
> bandwidth, across both servers, but the second reached only 20-30 percent
> of the possible bandwidth. Are there any performance / stability options
> which I can use for the NFS or GlusterFS mount?