[Gluster-users] glusterfsd Call Trace Messages
Taste-Of-IT
kontakt at taste-of-it.de
Wed Feb 3 21:54:18 UTC 2016
On 2016-02-03 21:24, Raghavendra Bhat wrote:
> I think this is what is happening. Someone please correct me if I am
> wrong.
>
> I think this is happening because the NFS client, NFS server and bricks
> are all on the same machine. What happens is, when the large write
> comes, nfs client sends the request to the nfs server and the nfs
> server sends it to the brick. The brick process tries to write it via
> making the write system call and the call enters the kernel. Kernel
> might not find memory available for performing the operation and thus
> wants to free some memory. NFS client does heavy caching. It might
> have saved many things in its memory. So, it has to free some memory.
> But nfs client is stuck with the write operation. It is still waiting
> for a response from the server. So it will not be able to free the
> memory till it gets a response from the nfs server (which in turn is
> waiting for a response from the brick) for the write operation it
> sent. But brick cannot get a response from kernel until kernel is able
> to get some memory for the operation and perform write.
>
> Thus it is stuck in this deadlock. That's why you see your setup
> blocked.
>
> Can you please mount your volume via nfs on a different node other
> than the gluster server, and see if the issue happens again?
>
> Regards,
> Raghavendra
>
> On Wed, Feb 3, 2016 at 2:32 PM, Taste-Of-IT <kontakt at taste-of-it.de>
> wrote:
>
>> On 2016-02-03 20:09, Raghavendra Bhat wrote:
>>
>> Hi,
>>
>> Is your NFS client mounted on one of the gluster servers?
>>
>> Regards,
>> Raghavendra
>>
>> On Wed, Feb 3, 2016 at 10:08 AM, Taste-Of-IT
>> <kontakt at taste-of-it.de>
>> wrote:
>>
>> Hello,
>>
>> I hope an expert can help. I have a distributed GlusterFS setup (2 bricks,
>> 1 volume), version 3.7.6, on Debian. The volume is shared via NFS.
>> When I copy large files (>30GB) with Midnight Commander, I get the
>> following messages. I replaced the SATA cable and checked the memory,
>> but I didn't find an error. The SMART values on all disks seem OK.
>> After 30-40 minutes I can copy again. Any idea?
>>
>> Feb 3 12:46:31 gluster01 kernel: [11186.588367] [sched_delayed]
>> sched: RT throttling activated
>> Feb 3 12:56:09 gluster01 kernel: [11764.932749] glusterfsd
>> D ffff88040ca6d788 0 1150 1 0x00000000
>> Feb 3 12:56:09 gluster01 kernel: [11764.932759]
>> ffff88040ca6d330 0000000000000082 0000000000012f00 ffff88040ad1bfd8
>> Feb 3 12:56:09 gluster01 kernel: [11764.932767]
>> 0000000000012f00 ffff88040ca6d330 ffff88040ca6d330 ffff88040ad1be88
>> Feb 3 12:56:09 gluster01 kernel: [11764.932773]
>> ffff88040e18d4b8 ffff88040e18d4a0 ffffffff00000000 ffff88040e18d4a8
>> Feb 3 12:56:09 gluster01 kernel: [11764.932780] Call Trace:
>> Feb 3 12:56:09 gluster01 kernel: [11764.932796]
>> [<ffffffff81512cd5>] ? rwsem_down_write_failed+0x1d5/0x320
>> Feb 3 12:56:09 gluster01 kernel: [11764.932807]
>> [<ffffffff812b7d13>] ? call_rwsem_down_write_failed+0x13/0x20
>> Feb 3 12:56:09 gluster01 kernel: [11764.932816]
>> [<ffffffff812325b0>] ? proc_keys_show+0x3f0/0x3f0
>> Feb 3 12:56:09 gluster01 kernel: [11764.932823]
>> [<ffffffff81512649>] ? down_write+0x29/0x40
>> Feb 3 12:56:09 gluster01 kernel: [11764.932830]
>> [<ffffffff811592bc>] ? vm_mmap_pgoff+0x6c/0xc0
>> Feb 3 12:56:09 gluster01 kernel: [11764.932838]
>> [<ffffffff8116ea4e>] ? SyS_mmap_pgoff+0x10e/0x250
>> Feb 3 12:56:09 gluster01 kernel: [11764.932844]
>> [<ffffffff811a969a>] ? SyS_readv+0x6a/0xd0
>> Feb 3 12:56:09 gluster01 kernel: [11764.932853]
>> [<ffffffff81513ccd>] ? system_call_fast_compare_end+0x10/0x15
>> Feb 3 12:58:09 gluster01 kernel: [11884.979935] glusterfsd
>> D ffff88040ca6d788 0 1150 1 0x00000000
>> Feb 3 12:58:09 gluster01 kernel: [11884.979945]
>> ffff88040ca6d330 0000000000000082 0000000000012f00 ffff88040ad1bfd8
>> Feb 3 12:58:09 gluster01 kernel: [11884.979952]
>> 0000000000012f00 ffff88040ca6d330 ffff88040ca6d330 ffff88040ad1be88
>> Feb 3 12:58:09 gluster01 kernel: [11884.979959]
>> ffff88040e18d4b8 ffff88040e18d4a0 ffffffff00000000 ffff88040e18d4a8
>> Feb 3 12:58:09 gluster01 kernel: [11884.979966] Call Trace:
>> Feb 3 12:58:09 gluster01 kernel: [11884.979982]
>> [<ffffffff81512cd5>] ? rwsem_down_write_failed+0x1d5/0x320
>> Feb 3 12:58:09 gluster01 kernel: [11884.979993]
>> [<ffffffff812b7d13>] ? call_rwsem_down_write_failed+0x13/0x20
>> Feb 3 12:58:09 gluster01 kernel: [11884.980001]
>> [<ffffffff812325b0>] ? proc_keys_show+0x3f0/0x3f0
>> Feb 3 12:58:09 gluster01 kernel: [11884.980008]
>> [<ffffffff81512649>] ? down_write+0x29/0x40
>> Feb 3 12:58:09 gluster01 kernel: [11884.980015]
>> [<ffffffff811592bc>] ? vm_mmap_pgoff+0x6c/0xc0
>> Feb 3 12:58:09 gluster01 kernel: [11884.980023]
>> [<ffffffff8116ea4e>] ? SyS_mmap_pgoff+0x10e/0x250
>> Feb 3 12:58:09 gluster01 kernel: [11884.980030]
>> [<ffffffff811a969a>] ? SyS_readv+0x6a/0xd0
>> Feb 3 12:58:09 gluster01 kernel: [11884.980038]
>> [<ffffffff81513ccd>] ? system_call_fast_compare_end+0x10/0x15
>> Feb 3 12:58:09 gluster01 kernel: [11884.980351] mc
>> D ffff88040e6d8fb8 0 5119 1447 0x00000000
>> Feb 3 12:58:09 gluster01 kernel: [11884.980358]
>> ffff88040e6d8b60 0000000000000082 0000000000012f00 ffff88040d5dbfd8
>> Feb 3 12:58:09 gluster01 kernel: [11884.980365]
>> 0000000000012f00 ffff88040e6d8b60 ffff88041ec937b0 ffff88041efcc9e8
>> Feb 3 12:58:09 gluster01 kernel: [11884.980371]
>> 0000000000000002 ffffffff8113ce00 ffff88040d5dbcb0 ffff88040d5dbd98
>> Feb 3 12:58:09 gluster01 kernel: [11884.980377] Call Trace:
>> Feb 3 12:58:09 gluster01 kernel: [11884.980385]
>> [<ffffffff8113ce00>] ? wait_on_page_read+0x60/0x60
>> Feb 3 12:58:09 gluster01 kernel: [11884.980392]
>> [<ffffffff81510759>] ? io_schedule+0x99/0x120
>> Feb 3 12:58:09 gluster01 kernel: [11884.980399]
>> [<ffffffff8113ce0a>] ? sleep_on_page+0xa/0x10
>> Feb 3 12:58:09 gluster01 kernel: [11884.980405]
>> [<ffffffff81510adc>] ? __wait_on_bit+0x5c/0x90
>> Feb 3 12:58:09 gluster01 kernel: [11884.980412]
>> [<ffffffff8113cbff>] ? wait_on_page_bit+0x7f/0x90
>> Feb 3 12:58:09 gluster01 kernel: [11884.980420]
>> [<ffffffff810a7bd0>] ? autoremove_wake_function+0x30/0x30
>> Feb 3 12:58:09 gluster01 kernel: [11884.980426]
>> [<ffffffff8114a17d>] ? pagevec_lookup_tag+0x1d/0x30
>> Feb 3 12:58:09 gluster01 kernel: [11884.980433]
>> [<ffffffff8113cce0>] ? filemap_fdatawait_range+0xd0/0x160
>> Feb 3 12:58:09 gluster01 kernel: [11884.980442]
>> [<ffffffff8113e7ca>] ? filemap_write_and_wait_range+0x3a/0x60
>> Feb 3 12:58:09 gluster01 kernel: [11884.980461]
>> [<ffffffffa072363f>] ? nfs_file_fsync+0x7f/0x100 [nfs]
>> Feb 3 12:58:09 gluster01 kernel: [11884.980476]
>> [<ffffffffa0723a2a>] ? nfs_file_write+0xda/0x1a0 [nfs]
>> Feb 3 12:58:09 gluster01 kernel: [11884.980484]
>> [<ffffffff811a7e24>] ? new_sync_write+0x74/0xa0
>> Feb 3 12:58:09 gluster01 kernel: [11884.980492]
>> [<ffffffff811a8562>] ? vfs_write+0xb2/0x1f0
>> Feb 3 12:58:09 gluster01 kernel: [11884.980500]
>> [<ffffffff811a842d>] ? vfs_read+0xed/0x170
>> Feb 3 12:58:09 gluster01 kernel: [11884.980505]
>> [<ffffffff811a90a2>] ? SyS_write+0x42/0xa0
>> Feb 3 12:58:09 gluster01 kernel: [11884.980513]
>> [<ffffffff81513ccd>] ? system_call_fast_compare_end+0x10/0x15
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users [1] [1]
>>
>> Links:
>> ------
>> [1] http://www.gluster.org/mailman/listinfo/gluster-users [1]
> Hi Raghavendra,
> Yes, in this case I have to mount on one of the gluster servers, but it
> doesn't matter which one I mount on; it is only a question of time
> until the trace appears.
> Taste
Hi,
that sounds logical. Is that normal behavior? I tested it from a client and
it looks fine, with no trace. I tried 4 files of about 30GB each. The only
thing I noticed is that the first file was copied at nearly full bandwidth,
over both servers, but the second reached only 20-30 percent of the possible
bandwidth. Are there any performance or stability options I can use for
the NFS or GlusterFS mount?
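
For reference, the circular wait described in the quoted explanation can be sketched as a small wait-for graph. This is only an illustration of that reasoning, not anything GlusterFS or the kernel actually runs: the node names (nfs_client, nfs_server, brick, kernel_memory) are made-up labels, and the edges are assumptions taken from the explanation above. A cycle in the graph corresponds to the deadlock; dropping the edge from the server's kernel back to the NFS client models mounting from a different node.

```python
# Hypothetical wait-for graph for the single-machine setup described above.
# Each key waits on the parties listed for it; all names are illustrative labels.
WAITS_ON = {
    "nfs_client": ["nfs_server"],     # waits for the reply to its write
    "nfs_server": ["brick"],          # forwarded the write to the brick
    "brick": ["kernel_memory"],       # write() blocks until memory is free
    "kernel_memory": ["nfs_client"],  # reclaim needs the client to release its cache
}


def find_cycle(graph, start):
    """Follow wait-for edges depth-first; return the first cycle found, else None."""
    path = []
    on_path = set()

    def visit(node):
        if node in on_path:
            # We came back to a node we are still waiting on: circular wait.
            return path[path.index(node):] + [node]
        path.append(node)
        on_path.add(node)
        for nxt in graph.get(node, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        return None

    return visit(start)


# Same graph, but with the NFS client on another machine (assumption): the
# server's kernel no longer depends on the client's cache to reclaim memory.
REMOTE_CLIENT = dict(WAITS_ON, kernel_memory=[])

if __name__ == "__main__":
    print(find_cycle(WAITS_ON, "nfs_client"))
    print(find_cycle(REMOTE_CLIENT, "nfs_client"))
```

With the client on the same machine the search returns the full loop; with the client moved elsewhere it finds none, which matches the observation that the copy from a separate client ran without a trace.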