[Gluster-devel] Problems with ioc again

Mon Dec 22 16:50:21 UTC 2008

Just to confirm, you're saying echo 3 to that /proc entry and then
re-run my various workloads and see if the problem happens again?

Or echo 3 to that /proc entry on a node that is already using 12GB ioc  
memory and see if it drops down to 2GB?

Which one?

Dan Parsons
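
For the second reading (checking a node that is already bloated), a minimal Python sketch of the before/after comparison might look like the following. It assumes a Linux /proc filesystem, root privileges for the write to drop_caches, and that the glusterfs client PID is known (26231 in the ps output quoted below). The helper names rss_kib, read_rss_kib, drop_caches and compare are hypothetical and not part of GlusterFS. Writing 3 to drop_caches asks the kernel to drop the page cache plus dentries and inodes; whether that shrinks the glusterfs process itself is exactly what the comparison would show.

```python
import re
import time
from pathlib import Path


def rss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kib(pid):
    """Read the resident set size of a running process (assumes Linux /proc)."""
    return rss_kib(Path(f"/proc/{pid}/status").read_text())


def drop_caches():
    """Equivalent of 'echo 3 > /proc/sys/vm/drop_caches'; needs root."""
    Path("/proc/sys/vm/drop_caches").write_text("3\n")


def compare(pid, settle_seconds=5):
    """Return (before, after) RSS in KiB around a drop_caches write.

    settle_seconds is an arbitrary pause (an assumption, not a tuned value)
    to give the process time to release memory before the second reading.
    """
    before = read_rss_kib(pid)
    drop_caches()
    time.sleep(settle_seconds)
    after = read_rss_kib(pid)
    return before, after
```

Run as root, compare(26231) returns the resident set size in KiB before and after the drop, which would show whether the process falls back toward the configured cache-size or stays where it is.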

On Dec 22, 2008, at 8:47 AM, Anand Avati wrote:

> Dan,
> can you do 'echo 3 > /proc/sys/vm/drop_caches' and see if the usage
> comes back to normal?
>
> avati
>
> 2008/12/22 Dan Parsons <dparsons at nyip.net>:
>> OK, I just had this problem again in a big way.
>>
>> root     26231  9.3 90.5 12676632 11141304 ?   Ssl  Dec17 659:31 [glusterfs]
>>
>> That's 90.5% of 12GB RAM. cache-size is set to 2048mb. Miraculously, this
>> node is still running; about 28 of my 33 nodes died over the weekend because
>> of this issue. We wanted to run some big jobs over the holiday break, but
>> this crash is getting in the way.
>>
>> Is there *anything* that can be done?
>>
>> Dan Parsons
>>
>>
>> On Dec 17, 2008, at 3:28 PM, Anand Avati wrote:
>>
>>> Dan,
>>> I have a vague memory about giving a custom patch for io-cache. Was that
>>> you? Can you mail me the diff so I can answer your question?
>>>
>>> Avati
>>>
>>> On Dec 17, 2008 2:34 PM, "Dan Parsons" <dparsons at nyip.net> wrote:
>>>
>>> I'd love to use 1.4rc4, but are there any issues in it that would affect me?
>>> I have 4 glusterfs servers, each with 2gbit ethernet (bonded), providing
>>> sustained 8gbit/s to 33 client nodes. Below is my entire config file. If you
>>> see anything in there using a system that is either buggy or non-optimal in
>>> 1.4rc4, or would be difficult to upgrade, please let me know. If not, I can
>>> possibly upgrade.
>>>
>>> Below is my current config file. The one I was using when gluster was using
>>> all memory is identical except that 'cache-size' was changed to 4096MB and
>>> 'page-size' was changed to 512KB.
>>>
>>> -----------
>>> ### Add client feature and attach to remote subvolume of server1
>>> volume distfs01
>>> type protocol/client
>>> option transport-type tcp/client     # for TCP/IP transport
>>> option remote-host 10.8.101.51      # IP address of the remote brick
>>> option remote-subvolume brick        # name of the remote volume
>>> end-volume
>>>
>>> ### Add client feature and attach to remote subvolume of server2
>>> volume distfs02
>>> type protocol/client
>>> option transport-type tcp/client     # for TCP/IP transport
>>> option remote-host 10.8.101.52      # IP address of the remote brick
>>> option remote-subvolume brick        # name of the remote volume
>>> end-volume
>>>
>>> volume distfs03
>>> type protocol/client
>>> option transport-type tcp/client
>>> option remote-host 10.8.101.53
>>> option remote-subvolume brick
>>> end-volume
>>>
>>> volume distfs04
>>> type protocol/client
>>> option transport-type tcp/client
>>> option remote-host 10.8.101.54
>>> option remote-subvolume brick
>>> end-volume
>>>
>>> volume stripe0
>>> type cluster/stripe
>>> option block-size *.gff:1KB,*.nt:1KB,*.best:1KB,*.txt3:1KB,*.nbest.info:1KB*:1MB
>>> option scheduler alu
>>> option alu.order read-usage:write-usage:disk-usage
>>> option alu.read-usage.entry-threshold 20%
>>> option alu.read-usage.exit-threshold 4%
>>> option alu.write-usage.entry-threshold 20%
>>> option alu.write-usage.exit-threshold 4%
>>> option alu.disk-usage.entry-threshold 2GB
>>> option alu.disk-usage.exit-threshold 100MB
>>> subvolumes distfs01 distfs02 distfs03 distfs04
>>> end-volume
>>>
>>> volume ioc
>>> type performance/io-cache
>>> subvolumes stripe0          # In this example it is 'client...
>>> volume fixed
>>> type features/fixed-id
>>> option fixed-uid 0
>>> option fixed-gid 900
>>> subvolumes ioc
>>> end-volume
>>>
>>> Dan Parsons
>>>
>>> On Dec 17, 2008, at 2:09 PM, Anand Avati wrote:
>>>> Dan,
>>>> Is it feasible for you to try 1.4.0pre4...
>>
>>
>