[Gluster-users] Gluster-users Digest, Vol 9, Issue 66

Evan _Gluster at devnada.com
Fri Jan 23 19:06:04 UTC 2009


Ideally, what I'm trying to accomplish is to have multiple Samba servers
connected by a 1.544 Mbps pipe stay in sync with each other. So it's
important for me to have near-real disk access speeds when reading and
writing to the local gluster node in the AFR group. As for getting the data
propagated out to the other nodes, I know the 1.544 Mbps pipe can't keep up,
so I'll take the fastest I can get as long as it doesn't affect the local
performance (which is what I am seeing happen now).
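
To put rough numbers on that pipe, here is a minimal back-of-the-envelope
sketch. It assumes the full 1.544 Mbps is available as payload (no TCP/IP or
gluster protocol overhead, so real transfers would be slower), and the
function names are made up for illustration only:

```python
# Rough estimate of how long replication takes over a 1.544 Mbps link.
# Assumption: the whole link rate is usable payload (no protocol overhead).
LINK_BITS_PER_SEC = 1_544_000  # 1.544 Mbps, as stated above


def link_bytes_per_sec(bits_per_sec: float = LINK_BITS_PER_SEC) -> float:
    """Convert the link rate from bits per second to bytes per second."""
    return bits_per_sec / 8


def seconds_to_replicate(num_bytes: int,
                         bits_per_sec: float = LINK_BITS_PER_SEC) -> float:
    """Best-case time to push num_bytes across the link."""
    return num_bytes / link_bytes_per_sec(bits_per_sec)


if __name__ == "__main__":
    # Same file sizes as the dd tests in this thread.
    for label, size in (("1 MB", 1_048_576), ("10 MB", 10_485_760)):
        secs = seconds_to_replicate(size)
        print(f"{label}: about {secs:.1f} s to reach the other node")
```

Under those assumptions the link moves about 193 kB/s, so a 10 MB file needs
on the order of 54 seconds to reach the other node, which is why I only care
that the local write returns quickly.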



On Fri, Jan 23, 2009 at 10:56 AM, Keith Freedman <freedman at freeformit.com> wrote:

> At 10:18 AM 1/23/2009, Evan wrote:
>
>> I added the following to the bottom of my spec file:
>>
>> volume writebehind
>>  type performance/write-behind
>>  option aggregate-size 10MB # default is 0bytes
>>  option flush-behind off    # default is 'off'
>>  subvolumes afr
>> end-volume
>>
>> which gives me the following results when making a 10MB file
>> # time dd if=/dev/zero of=/tmp/disktest count=10240 bs=1024
>> 10240+0 records in
>> 10240+0 records out
>> 10485760 bytes (10 MB) copied, 0.173179 s, 60.5 MB/s
>>
>> real    0m0.183s
>> user    0m0.000s
>> sys     0m0.204s
>>
>> # time dd if=/dev/zero of=/mnt/gluster/disktest count=10240 bs=1024
>> 10240+0 records in
>> 10240+0 records out
>> 10485760 bytes (10 MB) copied, 5.50861 s, 1.9 MB/s
>>
>> real    0m5.720s
>> user    0m0.000s
>> sys     0m0.060s
>>
>> Although this is better than I had before, is there any way to have gluster
>> write the data to the localBrick and then sync/afr in the background so I
>> could expect to see something closer to the 60 MB/s I see when writing to
>> the local disk directly?
>>
>
> What you really want is delayed replication.  I've asked for this on this
> mailing list recently, and was told that it's something they've considered
> (more as a DR feature than an HA feature), but it's not currently on the
> list of priorities.
>
> The issue, as I see it, is that if it's an HA feature, then you really need
> to ensure that the data is replicated before you let your application think
> the data is written.  If the replication was delayed and the server went
> down, the data would be lost forever.  This is bad for HA.
> If it's a DR feature, then you're probably OK, because disaster recovery
> scenarios can usually withstand some data loss, and you're more
> interested in a point-in-time snapshot of the data.
>
> FUSE is a problem, and TCP/IP is a problem with much overhead and large
> blocksizes.
>
> Ideally, Gluster's write-behind would be smart enough to aggregate smaller
> blocks of data into a large write.  I think this would solve a lot of the
> problem you're having in your tests.
>
>  Thanks
>>
>> Raghavendra G <raghavendra at zresearch.com> wrote:
>> above afr with afr as a subvolume
>>
>> On Fri, Jan 23, 2009 at 12:59 AM, Evan <_Gluster at devnada.com> wrote:
>> Where should I put the write-behind translator?
>> Just above afr with afr as a subvolume? Or should I put it just above my
>> localBrick volume and below afr?
>>
>>
>> Here is the output using /dev/zero:
>> # time dd if=/dev/zero of=/mnt/gluster/disktest count=1024 bs=1024
>>
>> 1024+0 records in
>> 1024+0 records out
>> 1048576 bytes (1.0 MB) copied, 1.90119 s, 552 kB/s
>>
>> real    0m2.098s
>> user    0m0.000s
>> sys     0m0.016s
>>
>> # time dd if=/dev/zero of=/tmp/disktest count=1024 bs=1024
>>
>> 1024+0 records in
>> 1024+0 records out
>> 1048576 bytes (1.0 MB) copied, 0.0195388 s, 53.7 MB/s
>>
>> real    0m0.026s
>> user    0m0.000s
>> sys     0m0.028s
>>
>>
>> Thanks
>>
>>
>> On Thu, Jan 22, 2009 at 12:52 PM, Anand Avati <avati at zresearch.com> wrote:
>> Do you have write-behind loaded on the client side? For IO testing,
>> use /dev/zero instead of /dev/urandom.
>>
>> avati
>>
>> On Fri, Jan 23, 2009 at 2:14 AM, Evan <_Gluster at devnada.com> wrote:
>> > I have a 2-node, single-process AFR setup with 1.544 Mbps of bandwidth
>> > between the 2 nodes. When I write a 1MB file to the gluster share, it
>> > seems to AFR to the other node in real time, killing my disk IO speeds on
>> > the gluster mount point. Is there any way to fix this? Ideally I would
>> > like to see near-real disk IO speeds from/to the local gluster mount
>> > point and let the afr play catch-up in the background as the bandwidth
>> > becomes available.
>> >
>> > Gluster Spec File (same on both nodes): http://pastebin.com/m58dc49d4
>> > IO speed tests:
>> > # time dd if=/dev/urandom of=/mnt/gluster/disktest count=1024 bs=1024
>> > 1024+0 records in
>> > 1024+0 records out
>> > 1048576 bytes (1.0 MB) copied, 8.34701 s, 126 kB/s
>> >
>> > real    0m8.547s
>> > user    0m0.000s
>> > sys     0m0.372s
>> >
>> > # time dd if=/dev/urandom of=/tmp/disktest count=1024 bs=1024
>> > 1024+0 records in
>> > 1024+0 records out
>> > 1048576 bytes (1.0 MB) copied, 0.253865 s, 4.1 MB/s
>> >
>> > real    0m0.259s
>> > user    0m0.000s
>> > sys     0m0.284s
>> >
>> >
>> > Thanks
>> >
>> > _______________________________________________
>> > Gluster-users mailing list
>> > Gluster-users at gluster.org
>> > http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users
>> >
>> >
>>
>> --
>> Raghavendra G
>
>