[Gluster-devel] dovecot server hanging with fuse/glusterfs errors

Matthias Albert gluster at linux4experts.de
Mon Jan 28 11:29:24 UTC 2008


Hi,

are you using this setup in a production environment? You're using "very 
old" packages of both glusterfs and fuse; there have been many changes 
and bugfixes in the last few releases.

We're using the latest TLA version (TLA643 and fuse 2.7.2-glfs8) for 
bacula backups and samba reexport without any problems. We're also 
playing with mysql on glusterfs, and so far it works pretty well. But as 
I wrote, we're only "playing" with mysql and the samba reexport :-) so 
if there are problems (e.g. crashes), it doesn't matter.

If you would like to install the latest TLA version of glusterfs and 
also the latest fuse version, feel free to use my Debian packages (as 
soon as 1.3.8 is released, we will upload them as the official Debian 
packages into unstable):

---snip---
deb http://linux4experts.de/debian binary/
deb-src http://linux4experts.de/debian source/
---snap---
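
To install from there, something like this should work (an untested 
sketch; the package names match the listing below, adjust as needed):

---snip---
# after adding the lines above to /etc/apt/sources.list:
apt-get update
apt-get install glusterfs-client fuse-utils libfuse2
---snap---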

Best regards,

    Matthias


Jordi Moles wrote:
> hi,
>
> these are the packages of fuse and glusterfs I've got installed:
>
> ii  fuse-utils        2.7.0-glfs4-1  Filesystem in USErspace (utilities)
> ii  glusterfs-client  1.3.2          GlusterFS fuse client
> ii  libfuse2          2.7.0-glfs4-1  Filesystem in USErspace library
> ii  libglusterfs      1.3.2          GlusterFS libraries and translator modules
>
> as for the Xen thing...
>
> Yes, I've already tried to set up a couple of servers on non-virtual 
> machines, and they hang anyway with the very same error. I couldn't 
> tell you whether they last longer, but they eventually hang.
>
> I'm afraid it has something to do with indexes and so on... but I 
> don't know where to start, because I've already enabled all the 
> dovecot features that allow it to work on shared file systems (see 
> the sketch below).
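>
> For reference, the settings I mean are roughly the following. This is 
> only a sketch; the exact option names depend on the dovecot version:
>
> ---snip---
> # dovecot.conf: options for mailboxes on a shared/cluster filesystem,
> # since mmap'ed index files are not safe across multiple hosts
> mmap_disable = yes
> lock_method = fcntl
> ---snap---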
>
> thank you.
>
> Matthias Albert wrote:
>> Hi Jordi,
>>
>> which glusterfs and fuse versions are you running? Have you also 
>> tried your setup without Xen?
>>
>> Best regards,
>>
>>   Matthias
>>
>>
>> Jordi Moles wrote:
>>> hi.
>>>
>>> I've got a clustered mail system with glusterfs: some postfix, 
>>> squirrelmail and dovecot machines all sharing the same storage 
>>> through glusterfs. Each server has only postfix or squirrelmail or 
>>> dovecot installed on it.
>>> The thing is... the dovecot servers hang very often, and the last 
>>> thing they always log is this:
>>>
>>> *************
>>>
>>> login: Unable to handle kernel paging request at 0000000000100108 RIP:
>>> [<ffffffff88020838>] :fuse:request_end+0x45/0x109
>>> PGD 1f729067 PUD 1faae067 PMD 0
>>> Oops: 0002 [1] SMP
>>> CPU 0
>>> Modules linked in: ipv6 fuse dm_snapshot dm_mirror dm_mod
>>> Pid: 678, comm: glusterfs Not tainted 2.6.18-xen #1
>>> RIP: e030:[<ffffffff88020838>]  [<ffffffff88020838>] 
>>> :fuse:request_end+0x45/0x109
>>> RSP: e02b:ffff88001f04dd68  EFLAGS: 00010246
>>> RAX: 0000000000200200 RBX: ffff88001db9fa58 RCX: ffff88001db9fa68
>>> RDX: 0000000000100100 RSI: ffff88001db9fa58 RDI: ffff88001f676400
>>> RBP: ffff88001f676400 R08: 00000000204abb00 R09: ffff88001db9fb58
>>> R10: 0000000000000008 R11: ffff88001f04dcf0 R12: 0000000000000000
>>> R13: ffff88001db9fa90 R14: ffff88001f04ddf8 R15: 0000000000000001
>>> FS:  00002b29187c53b0(0000) GS:ffffffff804cd000(0000) 
>>> knlGS:0000000000000000
>>> CS:  e033 DS: 0000 ES: 0000
>>> Process glusterfs (pid: 678, threadinfo ffff88001f04c000, task 
>>> ffff88001fc5a860)
>>> Stack:  ffff88001db9fa58 ffff88001f676400 00000000fffffffe 
>>> ffffffff88021056
>>> ffff88001f04def8 000000301f04de88 ffffffff8020dd40 ffff88001f04ddb8
>>> ffffffff80225ca3 ffff88001de23500 ffffffff803ef023 ffff88001f04de98
>>> Call Trace:
>>> [<ffffffff88021056>] :fuse:fuse_dev_readv+0x385/0x435
>>> [<ffffffff8020dd40>] monotonic_clock+0x35/0x7d
>>> [<ffffffff80225ca3>] deactivate_task+0x1d/0x28
>>> [<ffffffff803ef023>] thread_return+0x0/0x120
>>> [<ffffffff802801d3>] do_readv_writev+0x271/0x294
>>> [<ffffffff802274c7>] default_wake_function+0x0/0xe
>>> [<ffffffff803f0976>] __down_read+0x12/0xec
>>> [<ffffffff88021120>] :fuse:fuse_dev_read+0x1a/0x1f
>>> [<ffffffff802804bc>] vfs_read+0xcb/0x171
>>> [<ffffffff802274c7>] default_wake_function+0x0/0xe
>>> [<ffffffff8028089b>] sys_read+0x45/0x6e
>>> [<ffffffff8020a436>] system_call+0x86/0x8b
>>> [<ffffffff8020a3b0>] system_call+0x0/0x8b
>>>
>>>
>>> Code: 48 89 42 08 48 89 10 48 c7 41 08 00 02 20 00 f6 46 30 08 48
>>> RIP  [<ffffffff88020838>] :fuse:request_end+0x45/0x109
>>> RSP <ffff88001f04dd68>
>>> CR2: 0000000000100108
>>> dovecot01gluster01 kernel: Oops: 0002 [1] SMP
>>> dovecot01gluster01 kernel: CR2: 0000000000100108
>>> <3>BUG: soft lockup detected on CPU#0!
>>>
>>> Call Trace:
>>> <IRQ> [<ffffffff80257f78>] softlockup_tick+0xd8/0xea
>>> [<ffffffff8020f110>] timer_interrupt+0x3a9/0x405
>>> [<ffffffff80258264>] handle_IRQ_event+0x4e/0x96
>>> [<ffffffff80258350>] __do_IRQ+0xa4/0x105
>>> [<ffffffff8020b0e8>] call_softirq+0x1c/0x28
>>> [<ffffffff8020cecb>] do_IRQ+0x65/0x73
>>> [<ffffffff8034a8c1>] evtchn_do_upcall+0xac/0x12d
>>> [<ffffffff8020ac1e>] do_hypervisor_callback+0x1e/0x2c
>>> <EOI> [<ffffffff802d6f56>] dummy_inode_permission+0x0/0x3
>>> [<ffffffff8028c9b8>] do_lookup+0x63/0x173
>>> [<ffffffff803f1232>] .text.lock.spinlock+0x0/0x8a
>>> [<ffffffff8802144c>] :fuse:request_send+0x1b/0x2a8
>>> [<ffffffff8028f0d6>] __link_path_walk+0xdf2/0xf3c
>>> [<ffffffff80261275>] __do_page_cache_readahead+0x8a/0x28f
>>> [<ffffffff8802249f>] :fuse:fuse_dentry_revalidate+0x94/0x120
>>> [<ffffffff80299838>] mntput_no_expire+0x19/0x8b
>>> [<ffffffff8028f2f3>] link_path_walk+0xd3/0xe5
>>> [<ffffffff8029571e>] __d_lookup+0xb0/0xff
>>> [<ffffffff8028ca92>] do_lookup+0x13d/0x173
>>> [<ffffffff8028e687>] __link_path_walk+0x3a3/0xf3c
>>> [<ffffffff8028f27c>] link_path_walk+0x5c/0xe5
>>> [<ffffffff80219b90>] do_page_fault+0xee9/0x1215
>>> [<ffffffff8027e38d>] fd_install+0x25/0x5f
>>> [<ffffffff8025daac>] filemap_nopage+0x188/0x324
>>> [<ffffffff8028f6df>] do_path_lookup+0x270/0x2ec
>>> [<ffffffff8028e0c6>] getname+0x15b/0x1c1
>>> [<ffffffff8028ff52>] __user_walk_fd+0x37/0x4c
>>> [<ffffffff80288883>] vfs_stat_fd+0x1b/0x4a
>>> [<ffffffff80219b90>] do_page_fault+0xee9/0x1215
>>> [<ffffffff8027e38d>] fd_install+0x25/0x5f
>>> [<ffffffff80239287>] do_sigaction+0x7a/0x1f3
>>> [<ffffffff80288a4e>] sys_newstat+0x19/0x31
>>> [<ffffffff80239495>] sys_rt_sigaction+0x59/0x98
>>> [<ffffffff8020ab73>] error_exit+0x0/0x71
>>> [<ffffffff8020a436>] system_call+0x86/0x8b
>>> [<ffffffff8020a3b0>] system_call+0x0/0x8b
>>>
>>>
>>>
>>> *************
>>>
>>> These are brand-new, basic Debian etch setups; they have nothing 
>>> other than dovecot and glusterfs-client installed on them. The thing 
>>> is, all the machines I've got run the very same version of 
>>> glusterfs-client, but only the dovecot ones hang.
>>>
>>> Do you have any idea?
>>>
>>> Thank you.
>>>
>>>
>>>
>>
>>
>>
>
>





