[Gluster-users] Gluster-users Digest, Vol 9, Issue 66
Keith Freedman
freedman at FreeFormIT.com
Fri Jan 23 18:56:07 UTC 2009
At 10:18 AM 1/23/2009, Evan wrote:
>I added the following to the bottom of my spec file:
>
>volume writebehind
> type performance/write-behind
> option aggregate-size 10MB # default is 0bytes
> option flush-behind off # default is 'off'
> subvolumes afr
>end-volume
>
>which gives me the following results when making a 10MB file
># time dd if=/dev/zero of=/tmp/disktest count=10240 bs=1024
>10240+0 records in
>10240+0 records out
>10485760 bytes (10 MB) copied, 0.173179 s, 60.5 MB/s
>
>real 0m0.183s
>user 0m0.000s
>sys 0m0.204s
>
># time dd if=/dev/zero of=/mnt/gluster/disktest count=10240 bs=1024
>10240+0 records in
>10240+0 records out
>10485760 bytes (10 MB) copied, 5.50861 s, 1.9 MB/s
>
>real 0m5.720s
>user 0m0.000s
>sys 0m0.060s
>
>Although this is better than I had before, is there any way to have
>Gluster write the data to the localBrick and then sync/AFR in the
>background, so I could expect to see something closer to the 60 MB/s
>I see when writing to the local disk directly?
What you really want is delayed replication. I've asked for this
on this mailing list recently, and was told it's something
they've considered (more as a DR feature than an HA feature), but
it's not currently on the list of priorities.
The issue, as I see it, is that if it's an HA feature, you really
need to ensure the data is replicated before you let your
application think the write has completed. If replication were
delayed and the server went down, that data would be lost forever,
which is bad for HA.
If it's a DR feature, you're probably OK, because disaster
recovery scenarios can usually withstand some data loss,
and you're more interested in a point-in-time snapshot of the data.
FUSE adds overhead, and so does TCP/IP, especially when the data
goes over the wire as many small blocks rather than a few large ones.
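One quick way to see how much the block size matters (just a suggestion, reusing the same dd paths from your tests above) is to repeat the 10MB write with a much larger block size:

# time dd if=/dev/zero of=/mnt/gluster/disktest count=10 bs=1M

If that runs noticeably faster than the bs=1024 run, per-request overhead over FUSE and the network is a large part of what you're seeing.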
Ideally, GlusterFS's write-behind would be smart enough to aggregate
those smaller blocks of data into one large write. I think that would
solve much of the problem you're seeing in your tests.
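As a rough, untested sketch (using only the option names that already appear in the spec you posted), you could also try turning flush-behind on, so the flush at close() returns to the application before the buffered data has reached both AFR subvolumes:

volume writebehind
  type performance/write-behind
  option aggregate-size 10MB   # batch small application writes into larger network writes
  option flush-behind on       # flush in the background instead of blocking close()
  subvolumes afr
end-volume

Keep in mind that this only hides latency from the application; the data still has to cross the 1.544 Mbps link before it really exists on the second node, so it carries the same data-loss trade-off as the delayed replication discussed above.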
>Thanks
>
>Raghavendra G <raghavendra at zresearch.com> wrote:
>above afr with afr as a subvolume
>
>On Fri, Jan 23, 2009 at 12:59 AM, Evan
><Gluster at devnada.com> wrote:
>Where should I put the write-behind translator?
>Just above afr with afr as a subvolume? Or should I put it just
>above my localBrick volume and below afr?
>
>
>Here is the output using /dev/zero:
># time dd if=/dev/zero of=/mnt/gluster/disktest count=1024 bs=1024
>
>1024+0 records in
>1024+0 records out
>1048576 bytes (1.0 MB) copied, 1.90119 s, 552 kB/s
>
>real 0m2.098s
>user 0m0.000s
>sys 0m0.016s
>
># time dd if=/dev/zero of=/tmp/disktest count=1024 bs=1024
>
>1024+0 records in
>1024+0 records out
>1048576 bytes (1.0 MB) copied, 0.0195388 s, 53.7 MB/s
>
>real 0m0.026s
>user 0m0.000s
>sys 0m0.028s
>
>
>Thanks
>
>
>On Thu, Jan 22, 2009 at 12:52 PM, Anand Avati
><avati at zresearch.com> wrote:
>Do you have write-behind loaded on the client side? For IO testing,
>use /dev/zero instead of /dev/urandom.
>
>avati
>
>On Fri, Jan 23, 2009 at 2:14 AM, Evan
><Gluster at devnada.com> wrote:
> > I have a 2-node, single-process AFR setup with 1.544 Mbps of bandwidth
> > between the 2 nodes. When I write a 1MB file to the Gluster share, it
> > seems to AFR to the other node in real time, killing my disk IO speeds
> > on the Gluster mount point. Is there any way to fix this? Ideally I
> > would like to see near-native disk IO speeds from/to the local Gluster
> > mount point and let the AFR catch up in the background as bandwidth
> > becomes available.
> >
> > Gluster Spec File (same on both nodes): http://pastebin.com/m58dc49d4
> > IO speed tests:
> > # time dd if=/dev/urandom of=/mnt/gluster/disktest count=1024 bs=1024
> > 1024+0 records in
> > 1024+0 records out
> > 1048576 bytes (1.0 MB) copied, 8.34701 s, 126 kB/s
> >
> > real 0m8.547s
> > user 0m0.000s
> > sys 0m0.372s
> >
> > # time dd if=/dev/urandom of=/tmp/disktest count=1024 bs=1024
> > 1024+0 records in
> > 1024+0 records out
> > 1048576 bytes (1.0 MB) copied, 0.253865 s, 4.1 MB/s
> >
> > real 0m0.259s
> > user 0m0.000s
> > sys 0m0.284s
> >
> >
> > Thanks
> >
>
>--
>Raghavendra G