[Gluster-users] Performance

Tue Apr 26 23:43:50 UTC 2011

In your experience, does it really help to have the journal on a
different disk? Just trying to see if it's worth the effort. Gluster
also recommends creating the file system with larger inodes (mkfs -I 256).
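
For concreteness, this is roughly the setup I am asking about. It is only a sketch under assumptions: the device names /dev/sdX1 and /dev/sdY1 are placeholders (not our real devices), ext3 and the 4096-byte block size are my guesses, and the script just prints the commands rather than running them, since they would destroy whatever is on those partitions.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of executing them, because
# formatting is destructive. Device names are hypothetical placeholders.
JOURNAL_DEV=${JOURNAL_DEV:-/dev/sdX1}   # raw partition to hold the journal
DATA_DEV=${DATA_DEV:-/dev/sdY1}         # partition for the Gluster backing store
BLOCK_SIZE=4096                         # assumed; journal and data fs must match

journal_cmds() {
    # Step 1: turn the raw partition into a dedicated external journal device.
    echo "mke2fs -O journal_dev -b $BLOCK_SIZE $JOURNAL_DEV"
    # Step 2: create the data file system with 256-byte inodes (-I 256)
    # and attach the external journal created in step 1.
    echo "mkfs.ext3 -b $BLOCK_SIZE -I 256 -J device=$JOURNAL_DEV $DATA_DEV"
}

journal_cmds
```

As far as I understand, the journal partition is dedicated to the journal in this layout, so it would not also carry a regular file system.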

As always, thanks for the suggestion.

On Tue, Apr 26, 2011 at 4:31 PM, Joe Landman
<landman at scalableinformatics.com> wrote:
> On 04/26/2011 05:48 PM, Mohit Anchlia wrote:
>>
>> I am not sure how valid this performance URL is:
>>
>>
>> http://www.gluster.com/community/documentation/index.php/Guide_to_Optimizing_GlusterFS
>>
>> Does it make sense to separate out the journal and run mkfs with -I 256?
>>
>> Also, if I already have a file system on a different partition, can I
>> still use it to store the journal from another partition without
>> corrupting that file system?
>
> Journals are heavy on small writes.  You really want a raw device for them.
> You do not want file system caching underneath them.
>
> Raw partition for an external journal is best.  Also, understand that ext*
> suffers badly under intense parallel loads.  Keep that in mind as you make
> your file system choice.
>
>>
>> On Thu, Apr 21, 2011 at 7:23 PM, Joe Landman
>> <landman at scalableinformatics.com>  wrote:
>>>
>>> On 04/21/2011 08:49 PM, Mohit Anchlia wrote:
>>>>
>>>> After a lot of digging today I finally figured out that it's not really
>>>> using a PERC controller but some Fusion MPT. Then it wasn't clear which
>>>
>>> PERC is a rebadged LSI based on the 1068E chip.
>>>
>>>> tool it supports. Finally I installed lsiutil and was able to change
>>>> the cache size.
>>>>
>>>> [root at dsdb1 ~]# lspci|grep LSI
>>>> 02:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
>>>
>>> This looks like PERC.  These are roughly equivalent to the LSI 3081 series.
>>> These are not fast units.  There is a variant of this that does RAID6; it's
>>> usually available as a software update or plugin module (button?) to this.
>>> I might be thinking of the 1078 chip though.
>>>
>>>  Regardless, these are fairly old designs.
>>>
>>>
>>>> [root at dsdb1 ~]# dd if=/dev/zero of=/data/big.file bs=128k count=40k oflag=direct
>>>> 1024+0 records in
>>>> 1024+0 records out
>>>> 134217728 bytes (134 MB) copied, 0.742517 seconds, 181 MB/s
>>>>
>>>> I compared this with the mdadm software RAID that I created yesterday on
>>>> one of the servers, and there I get around 300MB/s. I will first test with
>>>> what we have before destroying it and testing with mdadm.
>>>
>>> So the software RAID is giving you 300 MB/s and the hardware 'RAID' is
>>> giving you ~181 MB/s?  Seems a pretty simple choice :)
>>>
>>> BTW: The 300MB/s could also be a limitation of the PCIe channel interconnect
>>> (or worse, if they hung the chip off a PCI-X bridge).  The motherboard
>>> vendors are generally loath to put more than a few PCIe lanes toward handling
>>> SATA, networking, etc.  So typically you wind up with very low-powered
>>> 'RAID' and 'SATA/SAS' on the motherboard, connected by PCIe x2 or x4 at
>>> most.  A number of motherboards have NICs that are served by a single PCIe
>>> x1 link.
>>>
>>>> Thanks for your help that led me to this path. Another question I had
>>>> was: when creating an mdadm RAID, does it make sense to use multipathing?
>>>
>>> Well, for a shared backend over a fabric, I'd say possibly.  For an
>>> internally connected set, I'd say no.  Given what you are doing with
>>> Gluster, I'd say that the additional expense/pain of setting up a multipath
>>> scenario probably isn't worth it.
>>>
>>> Gluster lets you get many of these benefits at a higher level in the stack,
>>> which to a degree, and in some use cases, obviates the need for
>>> multipathing at a lower level.  I'd still suggest real RAID at the lower
>>> level (RAID6, and sometimes RAID10, make the most sense) for the backing
>>> store.
>>>
>>>
>>> --
>>> Joseph Landman, Ph.D
>>> Founder and CEO
>>> Scalable Informatics, Inc.
>>> email: landman at scalableinformatics.com
>>> web  : http://scalableinformatics.com
>>>       http://scalableinformatics.com/sicluster
>>> phone: +1 734 786 8423 x121
>>> fax  : +1 866 888 3112
>>> cell : +1 734 612 4615
>>>
>