[Gluster-users] Atomic file updates

Wed Feb 12 22:19:52 UTC 2014

> I'm not currently a Gluster user but I'm hoping it's the answer to a
> problem I'm working on.
> 
> I manage a private web site that is basically a reporting tool for
> equipment located at several hundred sites. Each site regularly uploads
> zipped XML files to a cloud based server and this also provides a web
> interface to the data using apache/PHP. The problem I need to solve is
> that with a single server disk I/O has become a bottleneck.
> 
> The plan is to use a load balancer and multiple web servers with a
> 4-node Gluster volume behind to store the data. Data would be replicated
> over 2 nodes.
> 
> The uploaded files are stored and then unzipped ready for reading by the
> web interface code. Each file is unzipped into a temporary file and then
> renamed, e.g.
> 
> file1.xml.zip --unzip--> uniquename.tmp --rename--> file1.xml
> 
> Use of the rename function makes these updates atomic.
> 
> How can I achieve atomic updates in this way using a Gluster volume? My
> understanding is that renaming a file on a Gluster volume causes a link
> file to be created and that clearly wouldn't be appropriate where there
> are frequent updates.

Creating a file with one name and then renaming it to another *might*
cause creation of linkfiles, but I think concerns about linkfiles are
often overblown.  The one extra call to create a linkfile isn't much
compared to those for creating the file, writing into it, and then
renaming it even if the rename is local to one brick.  What really
matters is the performance of the entire sequence, with or without the
linkfile.
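
For concreteness, here is a minimal sketch of that whole sequence
(create, write, rename) in Python rather than PHP.  The function name
and its arguments are made up for illustration, and I'm assuming the
temp file lives in the same directory as the final file so the rename
never crosses a filesystem boundary.  On a local POSIX filesystem the
rename step is what makes the update atomic.

```python
import os
import tempfile
import zipfile


def unzip_atomically(zip_path, member, final_path):
    """Extract one zip member to final_path via a temp file plus rename.

    Hypothetical helper for illustration only; names are not from Gluster.
    """
    directory = os.path.dirname(os.path.abspath(final_path))
    # Create the temp file next to the destination so the rename stays
    # within one directory (and one filesystem).
    fd, tmp_path = tempfile.mkstemp(suffix=".tmp", dir=directory)
    try:
        with os.fdopen(fd, "wb") as out, zipfile.ZipFile(zip_path) as zf:
            out.write(zf.read(member))
        # Readers see either the old file or the complete new one,
        # never a partially written file.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Don't leave a stray temp file behind if anything failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Timing this end to end on a test volume, once with and once without
linkfiles being created, is the comparison that actually matters.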

That said, there's also a trick you can use to avoid creation of a
linkfile.  Other tools, such as rsync and our own object interface,
use the same write-then-rename idiom.  To serve them, there's an
option called extra-hash-regex that can be used to place files on the
"right" brick according to their final name even though they're created
with another.  Unfortunately, setting that option through the gluster
command-line interface doesn't seem to work (it creates a malformed
volfile), so you have to pass it at mount time instead.  For example:

   glusterfs --volfile-server=a_server --volfile-id=a_volume \
   --xlator-option a_volume-dht.extra_hash_regex='(.*+)tmp' \
   /a/mountpoint

The important part is that second line.  It causes any file with a
"tmp" suffix to be hashed and placed as though its name were only what
the first parenthesized group of the regex matched (i.e. without the
"tmp").  Therefore, creating "xxxtmp" and then renaming it to "xxx" is
the same as just creating "xxx" in the first place as far as linkfiles
etc. are concerned.  Note that the excluded part can be anything that
a regex can match, including a unique random number.  If I recall,
rsync uses temp files something like this:

   fubar = .fubar.NNNNNN (where NNNNNN is a random number)

I know this probably seems a little voodoo-ish, but with a little bit
of experimentation to find the right regex you should be able to avoid
those dreaded linkfiles altogether.
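
One cheap way to do that experimentation is to check, off-line, which
part of each temp name a candidate regex captures.  The sketch below
does that in Python.  Both patterns and the helper name are my own
assumptions for illustration, and Python's regex dialect is not
identical to the C library's that Gluster uses, so treat the result as
a sanity check rather than proof.

```python
import re

# Hypothetical candidate patterns, written in Python's regex syntax.
SUFFIX_TMP = re.compile(r"^(.+)tmp$")          # "xxxtmp" style names
RSYNC_STYLE = re.compile(r"^\.(.+)\.[^.]+$")   # ".fubar.NNNNNN" style names


def hashed_name(pattern, filename):
    """Return the part of filename that would be used for placement.

    If the pattern matches, that is the first parenthesized group;
    otherwise the whole name is used unchanged.
    """
    match = pattern.match(filename)
    return match.group(1) if match else filename


if __name__ == "__main__":
    for pattern, temp, final in (
        (SUFFIX_TMP, "xxxtmp", "xxx"),
        (RSYNC_STYLE, ".fubar.123456", "fubar"),
    ):
        same = hashed_name(pattern, temp) == hashed_name(pattern, final)
        print(f"{temp!r} -> {final!r}: same placement name? {same}")
```

If the temp name and the final name reduce to the same string, the
rename should not need a linkfile; if they don't, adjust the regex or
the temp-file naming until they do.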