[Gluster-users] Performance

Wed May 4 05:51:17 UTC 2011

On Wednesday 04 May 2011 12:14 AM, Mohit Anchlia wrote:
> Does anyone know if new controller cards can be replaced without
> re-installing OS?

If your root disk is on that controller card, you might hit issues with 
device paths when you move the root disk onto a new card. You are better 
off not doing that.
Otherwise, replacing cards in a server is a routine task. And if the 
server is hot-plug capable, you don't even have to reboot the OS.
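
One common place where device-path trouble shows up is /etc/fstab, when 
it names disks by kernel device (for example /dev/sda1) rather than by 
UUID or label. The sketch below is only an illustration: the function 
name list_device_path_mounts, the device-name prefixes it looks for, and 
the sample entries are all assumptions, not taken from any system in 
this thread.

```shell
#!/bin/sh
# List fstab entries whose device field is a kernel name such as /dev/sda1.
# Such names can change when a disk moves to a different controller;
# UUID= and LABEL= entries do not depend on the controller.
list_device_path_mounts() {
    awk '$1 !~ /^#/ && $1 ~ /^\/dev\/(sd|hd|cciss)/ { print $1, $2 }' "$1"
}

# Sample data for illustration only; on a real host pass /etc/fstab.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# device        mountpoint  type  options   dump pass
/dev/sda1       /           ext3  defaults  1 1
UUID=0a1b2c3d   /boot       ext3  defaults  1 2
LABEL=data      /data       ext3  defaults  0 2
/dev/sdb1       /export     ext3  defaults  0 2
EOF

list_device_path_mounts "$sample"
rm -f "$sample"
```

Any line it prints is a mount that may need attention before the disk 
is moved. The boot loader configuration can carry the same kind of 
device name and is worth checking as well.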

Pavan

>
> On Wed, Apr 27, 2011 at 11:59 AM, Burnash, James<jburnash at knight.com>  wrote:
>> We use HP controllers here - P800, P812. They're pretty good, but I believe they're fairly pricey (my sources say $600-$800 for the P812, depending on the options for battery and cache).
>>
>> I use these controllers on my Gluster backend storage servers. Then again, we're an HP shop.
>>
>> James Burnash, Unix Engineering
>>
>> -----Original Message-----
>> From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Mohit Anchlia
>> Sent: Wednesday, April 27, 2011 2:47 PM
>> To: landman at scalableinformatics.com
>> Cc: gluster-users at gluster.org
>> Subject: Re: [Gluster-users] Performance
>>
>> Which controller cards would you recommend for SAS
>> drives? Dell and Areca are the ones I see suggested most often online.
>>
>> On Tue, Apr 26, 2011 at 4:43 PM, Mohit Anchlia<mohitanchlia at gmail.com>  wrote:
>>> In your experience, does it really help to have the journal on a
>>> different disk? Just trying to see if it's worth the effort. Also,
>>> Gluster recommends creating the file system with larger inodes: mkfs -I 256
>>>
>>> As always thanks for the suggestion.
>>>
>>> On Tue, Apr 26, 2011 at 4:31 PM, Joe Landman
>>> <landman at scalableinformatics.com>  wrote:
>>>> On 04/26/2011 05:48 PM, Mohit Anchlia wrote:
>>>>>
>>>>> I am not sure how valid this performance url is
>>>>>
>>>>>
>>>>> http://www.gluster.com/community/documentation/index.php/Guide_to_Optimizing_GlusterFS
>>>>>
>>>>> Does it make sense to separate out the journal and create mkfs -I 256?
>>>>>
>>>>> Also, if I already have a file system on a different partition can I
>>>>> still use it to store journal from other partition without corrupting
>>>>> the file system?
>>>>
>>>> Journals are small write heavy.  You really want a raw device for them.  You
>>>> do not want file system caching underneath them.
>>>>
>>>> Raw partition for an external journal is best.  Also, understand that ext*
>>>> suffers badly under intense parallel loads.  Keep that in mind as you make
>>>> your file system choice.
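
A minimal sketch of the raw-partition external journal described above, 
combined with the -I 256 inode size from the question. It only prints 
the mke2fs commands and runs nothing, because they destroy data on the 
target devices. The function name plan_external_journal, the device 
names /dev/sdX1 and /dev/sdY1, the ext3 type and the 4096-byte block 
size are placeholders and assumptions, not values from this thread.

```shell
#!/bin/sh
# Print (do not run) the commands for a file system with an external journal.
# The journal device and the file system must use the same block size.
plan_external_journal() {
    journal_dev=$1   # raw partition dedicated to the journal
    data_dev=$2      # partition that will hold the file system
    echo "mke2fs -O journal_dev -b 4096 $journal_dev"
    echo "mke2fs -t ext3 -b 4096 -I 256 -J device=$journal_dev $data_dev"
}

# Hypothetical device names; substitute real partitions before use.
plan_external_journal /dev/sdX1 /dev/sdY1
```

Note that -I sets the inode size in bytes, not the block size. The 
first command turns the whole partition into a journal device, so it 
cannot share a partition that already holds a file system.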
>>>>
>>>>>
>>>>> On Thu, Apr 21, 2011 at 7:23 PM, Joe Landman
>>>>> <landman at scalableinformatics.com>    wrote:
>>>>>>
>>>>>> On 04/21/2011 08:49 PM, Mohit Anchlia wrote:
>>>>>>>
>>>>>>> After a lot of digging today I finally figured out that it's not really
>>>>>>> using a PERC controller but some Fusion MPT. Then it wasn't clear which
>>>>>>
>>>>>> PERC is a rebadged LSI based on the 1068E chip.
>>>>>>
>>>>>>> tool it supports. Finally I installed lsiutil and was able to change
>>>>>>> the cache size.
>>>>>>>
>>>>>>> [root at dsdb1 ~]# lspci|grep LSI
>>>>>>> 02:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E
>>>>>>> PCI-Express Fusion-MPT SAS (rev 08)
>>>>>>
>>>>>>   This looks like PERC.  These are roughly equivalent to the LSI 3081
>>>>>> series.  These are not fast units.  There is a variant of this that
>>>>>> does RAID6; it's usually available as a software update or plugin
>>>>>> module (button?) to this.  I might be thinking of the 1078 chip though.
>>>>>>
>>>>>>   Regardless, these are fairly old designs.
>>>>>>
>>>>>>
>>>>>>> [root at dsdb1 ~]# dd if=/dev/zero of=/data/big.file bs=128k count=40k
>>>>>>> oflag=direct
>>>>>>> 1024+0 records in
>>>>>>> 1024+0 records out
>>>>>>> 134217728 bytes (134 MB) copied, 0.742517 seconds, 181 MB/s
>>>>>>>
>>>>>>> I compared this with SW RAID mdadm that I created yesterday on one of
>>>>>>> the servers and I get around 300MB/s. I will test out first with what
>>>>>>> we have before destroying and testing with mdadm.
>>>>>>
>>>>>> So the software RAID is giving you 300 MB/s and the hardware 'RAID' is
>>>>>> giving you ~181 MB/s?  Seems a pretty simple choice :)
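
The 181 MB/s figure can be rechecked from the byte count and elapsed 
time that dd printed. The sketch below does that arithmetic; the helper 
name throughput_mb_s is made up for illustration. It also shows that 
the reported 1024 records of 128 KiB are far fewer than the count=40k 
in the command as quoted, so the run shown wrote 128 MiB, and a longer 
run might give a different number.

```shell
#!/bin/sh
# Recompute dd's reported rate and block count from its own output.
bytes=134217728        # "bytes copied" from the dd output
seconds=0.742517       # elapsed time from the dd output
block=$((128 * 1024))  # bs=128k

throughput_mb_s() {
    # dd reports decimal megabytes (10^6 bytes) per second.
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.0f\n", b / s / 1000000 }'
}

echo "rate:   $(throughput_mb_s "$bytes" "$seconds") MB/s"
echo "blocks: $((bytes / block))"
```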
>>>>>>
>>>>>> BTW: The 300MB/s could also be a limitation of the PCIe channel
>>>>>> interconnect (or worse, if they hung the chip off a PCI-X bridge).  The
>>>>>> motherboard vendors are generally loath to put more than a few PCIe
>>>>>> lanes into handling SATA, networking, etc.  So typically you wind up
>>>>>> with very low powered 'RAID' and 'SATA/SAS' on the motherboard,
>>>>>> connected by PCIe x2 or x4 at most.  A number of motherboards have NICs
>>>>>> that are served by a single PCIe x1 link.
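
To put rough numbers on the lane counts mentioned above: PCIe 1.x 
carries about 250 MB/s per lane in each direction, and PCIe 2.0 about 
500 MB/s. The sketch below multiplies that out. The function name 
pcie_mb_s is hypothetical, and which PCIe generation and link width 
this particular controller uses is not stated in the thread.

```shell
#!/bin/sh
# Approximate one-direction PCIe bandwidth in MB/s.
# Per-lane figures: about 250 MB/s for PCIe 1.x, about 500 MB/s for PCIe 2.0.
pcie_mb_s() {
    gen=$1
    lanes=$2
    case "$gen" in
        1) per_lane=250 ;;
        2) per_lane=500 ;;
        *) echo "unsupported generation: $gen" >&2; return 1 ;;
    esac
    echo $((per_lane * lanes))
}

for lanes in 1 2 4 8; do
    echo "PCIe 1.x x$lanes: $(pcie_mb_s 1 "$lanes") MB/s"
done
```

By this estimate a measured 300 MB/s is already above what a PCIe 1.x 
x1 link can carry and well below an x4 link, before protocol overhead. 
On a live system, lspci -vv shows the negotiated width in the LnkSta 
line for the device.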
>>>>>>
>>>>>>> Thanks for your help that led me to this path. Another question I had
>>>>>>> was when creating mdadm RAID does it make sense to use multipathing?
>>>>>>
>>>>>> Well, for a shared backend over a fabric, I'd say possibly.  For an
>>>>>> internally connected set, I'd say no.  Given what you are doing with
>>>>>> Gluster, I'd say that the additional expense/pain of setting up a
>>>>>> multipath scenario probably isn't worth it.
>>>>>>
>>>>>> Gluster lets you get many of these benefits at a higher level in the
>>>>>> stack, which to a degree, and in some use cases, obviates the need for
>>>>>> multipathing at a lower level.  I'd still suggest real RAID at the lower
>>>>>> level (RAID6, and sometimes RAID10, make the most sense) for the backing
>>>>>> store.
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Joseph Landman, Ph.D
>>>>>> Founder and CEO
>>>>>> Scalable Informatics, Inc.
>>>>>> email: landman at scalableinformatics.com
>>>>>> web  : http://scalableinformatics.com
>>>>>>        http://scalableinformatics.com/sicluster
>>>>>> phone: +1 734 786 8423 x121
>>>>>> fax  : +1 866 888 3112
>>>>>> cell : +1 734 612 4615
>>>>>>
>>>>
>>>>
>>>>
>>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>
>>