[Gluster-users] Hopefully answering some mirroring questions asked here and offline
Joe Landman
landman at scalableinformatics.com
Mon May 2 22:08:05 UTC 2011
Hi folks
We've fielded a number of mirroring questions offline as well as
watched/participated in discussions here. I thought it was important to
make sure some of these are answered and searchable on the lists.
One major question that kept arising was as follows:
q: If I have a large image file (say a VM vmdk/other format) on a
mirrored volume, will one small change of a few bytes result in a resync
of the entire file?
a: No.
To test this, we created a 20GB file on a mirror volume.
root@metal:/local2/home/landman# ls -alF /mirror1gfs/big.file
-rw-r--r-- 1 root root 21474836490 2011-05-02 12:44 /mirror1gfs/big.file
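For anyone who wants to repeat the test, here is a minimal sketch of one way to create a file of a given size. How the original file was created is not stated above, so the script name (mkbig.pl), the zero filler, and the 1 MiB chunk size are all assumptions made for illustration.

```perl
#!/usr/bin/env perl
# mkbig.pl (hypothetical name): write a file of the requested size.
use strict;
use warnings;

# Write $bytes bytes of zero filler to $path in 1 MiB chunks.
# Returns the number of bytes requested.
sub make_big_file {
    my ($path, $bytes) = @_;
    my $chunk = "\0" x (1024 * 1024);
    open(my $fh, ">", $path) or die "cannot create $path: $!";
    binmode($fh);
    my $left = $bytes;
    while ($left > 0) {
        my $n = $left < length($chunk) ? $left : length($chunk);
        print {$fh} substr($chunk, 0, $n) or die "write failed: $!";
        $left -= $n;
    }
    close($fh) or die "close failed: $!";
    return $bytes;
}

# Usage: ./mkbig.pl /mirror1gfs/big.file 21474836480
if (@ARGV == 2) {
    my ($path, $bytes) = @ARGV;
    make_big_file($path, $bytes);
    printf "wrote %d bytes to %s\n", $bytes, $path;
}
```

Writing real data rather than creating a sparse file means every block of the file actually exists on both sides of the mirror before the small-update tests start.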
Then using the following quick and dirty Perl, we appended about 10-20
bytes to the file.
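#!/usr/bin/env perl
use strict;
use warnings;
# Append a short marker line ("end <pid>") to the file named on the command line.
my $file = shift or die "usage: $0 file\n";
open(my $fh, ">>", $file) or die "cannot open $file: $!";
print $fh "end ".$$."\n";
close($fh) or die "close failed: $!";
root@metal:/local2/home/landman# ./app.pl /mirror1gfs/big.file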
Then I had to write a quick and dirty tail replacement, as I
discovered that tail doesn't seek (yeah, it started reading every
'line' of that file ...)
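#!/usr/bin/env perl
use strict;
use warnings;
# Print the last 200 bytes of a file by seeking, not by reading from the start.
my $file = shift or die "usage: $0 file\n";
my $buf;
open(my $fh, "<", $file) or die "cannot open $file: $!";
seek($fh, -200, 2) or die "cannot seek: $!";   # whence 2 = relative to end of file
defined(read($fh, $buf, 200)) or die "read failed: $!";
printf "buffer: '%s'\n", $buf;
close($fh);
root@metal:/local2/home/landman# ./tail.pl /mirror1gfs/big.file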
buffer: 'end 19362'
While running app.pl, I did not see any massive resyncs. I had
dstat running in another window to watch for them.
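If you would rather have a number than eyeball dstat, here is a rough sketch that samples the Linux interface byte counters in /proc/net/dev before and after running a command. It is not what was used for the test above. The script name (netdelta.pl) and the interface name (eth0) are assumptions, and it only tells you something if the mirror traffic actually crosses that interface.

```perl
#!/usr/bin/env perl
# netdelta.pl (hypothetical name): rx/tx byte delta on one interface
# around a command.
use strict;
use warnings;

# Parse the text of /proc/net/dev into { iface => [rx_bytes, tx_bytes] }.
# After the "iface:" prefix, field 0 is received bytes and field 8 is
# transmitted bytes.
sub parse_net_dev {
    my ($text) = @_;
    my %counters;
    for my $line (split /\n/, $text) {
        next unless $line =~ /^\s*([^:\s]+):\s*(.*)$/;   # skips the two header lines
        my ($iface, @f) = ($1, split(' ', $2));
        next unless @f >= 9;
        $counters{$iface} = [ $f[0], $f[8] ];
    }
    return \%counters;
}

sub read_net_dev {
    open(my $fh, "<", "/proc/net/dev") or die "cannot read /proc/net/dev: $!";
    local $/;                      # slurp the whole file
    my $text = <$fh>;
    close($fh);
    return parse_net_dev($text);
}

# Usage: ./netdelta.pl eth0 ./app.pl /mirror1gfs/big.file
if (@ARGV >= 2) {
    my ($iface, @cmd) = @ARGV;
    my $before = read_net_dev()->{$iface} or die "no such interface: $iface\n";
    system(@cmd) == 0 or die "command failed: @cmd\n";
    my $after = read_net_dev()->{$iface};
    printf "rx delta: %d bytes, tx delta: %d bytes\n",
        $after->[0] - $before->[0], $after->[1] - $before->[1];
}
```

The counters include all other traffic on the interface, and writes may still be in flight when the command returns, so treat the delta as an order-of-magnitude check: a few KB is fine, something approaching the file size is a resync.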
You might say that this is irrelevant, as we only appended, and that
case could be special-cased.
So I wrote a random updater that writes at random spots throughout
the large file (sort of like updates to a VM vmdk or similar image file).
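#!/usr/bin/env perl
use strict;
use warnings;
# Overwrite a few bytes at a random offset inside an existing file.
my $file = shift or die "usage: $0 file\n";
my @stat = stat($file) or die "cannot stat $file: $!";
my $loc = int(rand($stat[7]));   # $stat[7] is the file size in bytes
# "+<" opens read/write without truncating, so the write lands at the seek
# position; append modes (">>", "+>>") always write at the end of the file.
open(my $fh, "+<", $file) or die "cannot open $file: $!";
seek($fh, $loc, 0) or die "cannot seek to $loc: $!";
print $fh "I was here!!!";
printf "loc: %i\n", $loc;
close($fh) or die "close failed: $!";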
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 17598205436
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 16468787891
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 9271612568
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 1356667302
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 12365324308
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 15654714313
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 10127739152
root@metal:/local2/home/landman# ./randupd.pl /mirror1gfs/big.file
loc: 10259920623
And again, no massive resyncs.
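A quick way to double-check that an update really landed where the updater said it did is to read the bytes back at the reported offset. This is a minimal sketch, not part of the original test: the script name (checkupd.pl) is made up, and the marker string is simply the one the updater writes.

```perl
#!/usr/bin/env perl
# checkupd.pl (hypothetical name): confirm the marker sits at a given offset.
use strict;
use warnings;

# Return up to $len bytes of $path starting at byte offset $loc.
sub read_at {
    my ($path, $loc, $len) = @_;
    open(my $fh, "<", $path) or die "cannot open $path: $!";
    binmode($fh);
    seek($fh, $loc, 0) or die "cannot seek to $loc: $!";
    my $buf = "";
    defined(read($fh, $buf, $len)) or die "read failed: $!";
    close($fh);
    return $buf;
}

# Usage: ./checkupd.pl /mirror1gfs/big.file 17598205436
if (@ARGV == 2) {
    my ($path, $loc) = @ARGV;
    my $marker = "I was here!!!";
    my $found  = read_at($path, $loc, length($marker));
    my $status = $found eq $marker ? "found" : "NOT found";
    print "marker $status at $loc\n";
}
```

Running the same check against the copy of the file held by each side of the mirror would also show whether both replicas picked up the change.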
So I think it's fairly safe to say that the concern over massive resyncs
for small updates is not something we see in the field.
Regards,
Joe
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web : http://scalableinformatics.com
http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615