[Gluster-devel] glusterfs v1.3.8 client segfaulting in io-cache

Wed May 7 06:21:29 UTC 2008

Amar, quick question. I've switched to readahead but really wish I
could use io-cache. How likely do you think it is that changing the io-cache
page-size from 128KB to 1MB (the same as the stripe block-size, based on your
advice) would fix the crash?

Dan Parsons
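
For reference, a sketch of the change being asked about, assuming it means
raising page-size in the ioc volume (quoted further down) to match the 1MB
stripe block-size. Only the page-size value differs from the posted config.
Whether this avoids the segfault is not confirmed anywhere in this thread.

    volume ioc
    type performance/io-cache
    option priority *.psiblast:3,*.seq:2,*:1
    option force-revalidate-timeout 5
    option cache-size 1200MB
    # assumption: raised from 128KB to match stripe0's "block-size *:1MB"
    option page-size 1MB
    subvolumes stripe0
    end-volume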

On May 6, 2008, at 12:43 PM, Amar S. Tumballi wrote:

> Try giving the same block-size (and page-size) in stripe and io-cache,
> just to check whether it helps. For now you can fall back to read-ahead.
>
> Regards,
> Amar
>
> On Tue, May 6, 2008 at 12:38 PM, Dan Parsons <dparsons at nyip.net> wrote:
>
>> Ah, so it's not something I'm doing wrong? Do you think changing
>> cache-size back to 32MB will prevent the problem from happening?
>>
>> Perhaps I should switch to readahead until there is a fix?
>>
>>
>> Dan Parsons
>>
>>
>>
>> On May 6, 2008, at 12:37 PM, Amar S. Tumballi wrote:
>>
>>> Thanks for the bug report. We will get back to you about it in another
>>> 2-3 days, most likely with a fix :)
>>>
>>> Regards,
>>> Amar
>>>
>>> On Tue, May 6, 2008 at 10:14 AM, Dan Parsons <dparsons at nyip.net> wrote:
>>> Oh, one more useful bit of information: I see lines like the one below a
>>> lot in the glusterfs log files. What do they mean?
>>>
>>> 2008-05-05 21:20:11 W [fuse-bridge.c:402:fuse_entry_cbk] glusterfs-fuse: 18054459: (34)
>>> /bio/data/fast-hmmsearch-all/tmpDCex3b_fast-hmmsearch-all_job/result.tigrfam.TIGR02736.hmmhits
>>> => 610503040 Rehashing because st_nlink less than dentry maps
>>>
>>> Dan Parsons
>>>
>>>
>>>
>>> On May 6, 2008, at 10:13 AM, Dan Parsons wrote:
>>>
>>> I'm experiencing a glusterfs client crash, signal 11, under the io-cache
>>> xlator. This is on our bioinformatics cluster; the crash happened on 2 out
>>> of 33 machines. I've verified the hardware stability of the machines.
>>>
>>> Running v1.3.8 built May 5th, 2008 from latest downloadable version.
>>>
>>> Here is the crash message:
>>>
>>> [0xffffe420]
>>>
>>> /usr/local/lib/glusterfs/1.3.8/xlator/performance/io-cache.so(ioc_page_wakeup+0x67)[0xb76c5f67]
>>> /usr/local/lib/glusterfs/1.3.8/xlator/performance/io-cache.so(ioc_inode_wakeup+0xb2)[0xb76c6902]
>>> /usr/local/lib/glusterfs/1.3.8/xlator/performance/io-cache.so(ioc_cache_validate_cbk+0xae)[0xb76c1e5e]
>>> /usr/local/lib/glusterfs/1.3.8/xlator/cluster/stripe.so(stripe_stack_unwind_buf_cbk+0x98)[0xb76cd038]
>>> /usr/local/lib/glusterfs/1.3.8/xlator/protocol/client.so(client_fstat_cbk+0xcc)[0xb76dd13c]
>>> /usr/local/lib/glusterfs/1.3.8/xlator/protocol/client.so(notify+0xa97)[0xb76db117]
>>> /usr/local/lib/libglusterfs.so.0(transport_notify+0x38)[0xb7efe978]
>>> /usr/local/lib/libglusterfs.so.0(sys_epoll_iteration+0xd6)[0xb7eff906]
>>> /usr/local/lib/libglusterfs.so.0(poll_iteration+0x98)[0xb7efeb28]
>>> [glusterfs](main+0x85e)[0x804a14e]
>>> /lib/libc.so.6(__libc_start_main+0xdc)[0x7b1dec]
>>> [glusterfs][0x8049391]
>>>
>>> And here is my config file. The only thing I can think of is that maybe my
>>> cache-size is too big. I want a lot of cache: we have big files, and the
>>> boxes have the RAM. Anyway, below is the config. If you see any problems
>>> with it, please let me know. There are no errors on the glusterfsd servers,
>>> except for an EOF from the machines where the glusterfs client segfaulted.
>>>
>>> volume fuse
>>> type mount/fuse
>>> option direct-io-mode 1
>>> option entry-timeout 1
>>> option attr-timeout 1
>>> option mount-point /glusterfs
>>> subvolumes ioc
>>> end-volume
>>>
>>> volume ioc
>>> type performance/io-cache
>>> option priority *.psiblast:3,*.seq:2,*:1
>>> option force-revalidate-timeout 5
>>> option cache-size 1200MB
>>> option page-size 128KB
>>> subvolumes stripe0
>>> end-volume
>>>
>>> volume stripe0
>>> type cluster/stripe
>>> option alu.disk-usage.exit-threshold 100MB
>>> option alu.disk-usage.entry-threshold 2GB
>>> option alu.write-usage.exit-threshold 4%
>>> option alu.write-usage.entry-threshold 20%
>>> option alu.read-usage.exit-threshold 4%
>>> option alu.read-usage.entry-threshold 20%
>>> option alu.order read-usage:write-usage:disk-usage
>>> option scheduler alu
>>> option block-size *:1MB
>>> subvolumes distfs01 distfs02 distfs03 distfs04
>>> end-volume
>>>
>>> volume distfs04
>>> type protocol/client
>>> option remote-subvolume brick
>>> option remote-host 10.8.101.54
>>> option transport-type tcp/client
>>> end-volume
>>>
>>> volume distfs03
>>> type protocol/client
>>> option remote-subvolume brick
>>> option remote-host 10.8.101.53
>>> option transport-type tcp/client
>>> end-volume
>>>
>>> volume distfs02
>>> type protocol/client
>>> option remote-subvolume brick
>>> option remote-host 10.8.101.52
>>> option transport-type tcp/client
>>> end-volume
>>>
>>> volume distfs01
>>> type protocol/client
>>> option remote-subvolume brick
>>> option remote-host 10.8.101.51
>>> option transport-type tcp/client
>>> end-volume
>>>
>>>
>>> Dan Parsons
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Gluster-devel mailing list
>>> Gluster-devel at nongnu.org
>>> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>>>
>>>
>>>
>>> --
>>> Amar Tumballi
>>> Gluster/GlusterFS Hacker
>>> [bulde on #gluster/irc.gnu.org]
>>> http://www.zresearch.com - Commoditizing Super Storage!
>>>