[Gluster-users] What is the recommended backup strategy for GlusterFS?

David Robinson david.robinson at corvidtec.com
Mon Oct 26 17:49:30 UTC 2015


Aravinda,

I was testing glusterfind and wondering if you could provide some 
feedback.

My system is RHEL 7.1 and I am using Gluster 3.7.5.  My setup for 
testing is a single brick with the parameters shown below...
I was testing glusterfind by copying over my source code (~140,000 
files) and then running 'glusterfind pre'.  The result of the test is 
that "glusterfind pre" took over an hour to process these 140,000 
files and sat at 100% CPU utilization for the duration of the run. 
Is this the expected processing rate for "glusterfind pre"?
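
For reference, the test sequence was roughly as follows (the session 
name "backup" is inferred from the glusterfind working directory 
mentioned below; source and output paths are illustrative):

glusterfind create backup gfs
# copy ~140,000 source files onto the mounted volume
cp -a /path/to/source/. /mnt/gfs/code/
# generate the change list (the step that ran for over an hour)
time glusterfind pre backup gfs /root/changes.txt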

The reason I am asking is that my production Gluster system sees 
approximately 2 million file changes per day.  At this pace, 
glusterfind cannot process the changes fast enough to keep up.

I also went back and tested file deletion by removing this directory. 
Looking at the 
/usr/var/lib/misc/glusterfsd/glusterfind/backup/gfs/tmp_output_0 file, 
it appears glusterfind is processing only about 1,000 files per hour 
for deletions.


[root at ff01bkp gfs]# gluster volume info
Volume Name: gfs
Type: Distribute
Volume ID: 7bbdfcf8-1801-4a2a-9233-0a3261cbcba7
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: ffib01bkp:/data/brick01/gfs
Options Reconfigured:
diagnostics.client-log-level: WARNING
diagnostics.brick-log-level: WARNING
server.allow-insecure: on
performance.readdir-ahead: on
storage.build-pgfid: on
changelog.changelog: on
changelog.capture-del-path: on
changelog.rollover-time: 90
changelog.fsync-interval: 30
client.event-threads: 8
server.event-threads: 8
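
(For reference, the changelog-related options above can be set with 
"gluster volume set", e.g.:)

gluster volume set gfs changelog.changelog on
gluster volume set gfs changelog.capture-del-path on
gluster volume set gfs changelog.rollover-time 90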

------ Original Message ------
From: "Aravinda" <avishwan at redhat.com>
To: "Mathieu Chateau" <mathieu.chateau at lotp.fr>; "M S Vishwanath Bhat" 
<msvbhat at gmail.com>
Cc: "gluster-users" <gluster-users at gluster.org>
Sent: 9/7/2015 2:02:09 AM
Subject: Re: [Gluster-users] What is the recommended backup strategy for 
GlusterFS?

>We have one more tool. glusterfind!
>
>This tool comes with the Gluster installation if you are using Gluster 
>3.7.  glusterfind enables changelogging (journaling) on a Gluster 
>Volume and uses that information to detect the changes that happened 
>in the Volume.
>
>1. Create a glusterfind session: glusterfind create <SESSION_NAME> 
><VOLUME_NAME>
>2. Do a full backup.
>3. Run the glusterfind pre command to generate an output file with the 
>list of changes that happened in the Gluster Volume after glusterfind 
>create. For usage information, run glusterfind pre --help.
>4. Consume that output file and back up only the files listed in it.
>5. After consuming the output file, run the glusterfind post command 
>(glusterfind post --help). A sketch of one full cycle is below.
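>
>A minimal sketch of one full backup cycle (session/volume names and 
>the output path are placeholders):
>
>glusterfind create mysession myvol
># ...take one full backup of the volume...
>glusterfind pre mysession myvol /tmp/changes.txt
># ...back up only the files listed in /tmp/changes.txt...
>glusterfind post mysession myvol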
>
>Doc link: 
>http://gluster.readthedocs.org/en/latest/GlusterFS%20Tools/glusterfind/index.html
>
>This tool is newly released with Gluster release 3.7. Please report 
>issues or request features here: 
>https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
>
>Regards,
>Aravinda
>On 09/06/2015 12:37 AM, Mathieu Chateau wrote:
>>Hello,
>>
>>For my needs, it's about having a simple "photo" of the files as they 
>>were, say, 5 days ago.
>>But I do not want to store file data twice, as most files didn't 
>>change.
>>Using snapshots is convenient, of course, but it's risky as you lose 
>>both data and snapshots in case of failure (snapshots only contain 
>>delta blocks).
>>Rsync with hard links is more resilient (the inode stays until the 
>>last reference is removed).
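>>
>>(A minimal sketch of that approach with plain rsync; dates and paths 
>>are placeholders. Unchanged files are hard-linked against the 
>>previous day's tree instead of being copied again:)
>>
>>rsync -a --delete --link-dest=/backup/2015-09-04 /mnt/gfs/ /backup/2015-09-05/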
>>
>>But I'd be interested to hear about production setups relying on it.
>>
>>Regards,
>>Mathieu CHATEAU
>>http://www.lotp.fr
>>
>>2015-09-05 21:03 GMT+02:00 M S Vishwanath Bhat <msvbhat at gmail.com>:
>>>MS
>>>On 5 Sep 2015 12:57 am, "Mathieu Chateau" <mathieu.chateau at lotp.fr> 
>>>wrote:
>>> >
>>> > Hello,
>>> >
>>> > So far I use rsnapshot. This script does rsync with rotation, 
>>>and, most importantly, identical files are stored only once through 
>>>hard links (inodes). I save space, but rsync still needs to scan all 
>>>folders to find new files.
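>>> >
>>> > A minimal /etc/rsnapshot.conf sketch (fields must be separated by 
>>> > tabs; paths and retention are placeholders):
>>> >
>>> > snapshot_root	/backup/rsnapshot/
>>> > retain	daily	7
>>> > # back up the locally mounted Gluster volume
>>> > backup	/mnt/gfs/	localhost/
>>> >
>>> > Then run "rsnapshot daily" from cron.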
>>> >
>>> > I am also interested in solution 1), but the backups need to be 
>>>stored on distinct drives/servers. We can't afford to lose both data 
>>>and snapshots in case of human error or disaster.
>>> >
>>> >
>>> >
>>> > Regards,
>>> > Mathieu CHATEAU
>>> > http://www.lotp.fr
>>> >
>>> > 2015-09-03 13:05 GMT+02:00 Merlin Morgenstern 
>>><merlin.morgenstern at gmail.com>:
>>> >>
>>> >> I have about 1M files in a GlusterFS volume with replica 2 
>>>across 3 nodes running gluster 3.7.3.
>>> >>
>>> >> What would be a recommended automated backup strategy for this 
>>>setup?
>>> >>
>>> >> I already considered the following:
>>>
>>>Have you considered GlusterFS geo-replication? It's actually meant 
>>>for disaster recovery, but it might suit your backup use case as 
>>>well.
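>>>
>>>A rough sketch of setting it up (host/volume names are placeholders; 
>>>prerequisites such as a reachable slave volume and passwordless SSH 
>>>are omitted):
>>>
>>>gluster volume geo-replication myvol slavehost::slavevol create push-pem
>>>gluster volume geo-replication myvol slavehost::slavevol start
>>>gluster volume geo-replication myvol slavehost::slavevol status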
>>>
>>>My two cents
>>>
>>>//MS
>>>
>>> >>
>>> >> 1) GlusterFS snapshots in combination with dd. This has 
>>>unfortunately not been possible so far, as I could not find any info 
>>>on how to make an image file out of the snapshots or how to automate 
>>>the snapshot procedure.
>>> >>
>>> >> 2) rsync the mounted file share to a second directory and tar up 
>>>the entire directory after the rsync completes.
>>> >>
>>> >> 3) A combination of 1 and 2: take a snapshot that gets mounted 
>>>automatically and then rsync from there. Problems: how to automate 
>>>snapshots and how to know the mount path (see the sketch after this 
>>>list).
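>>> >>
>>> >> A rough sketch of the snapshot part (assuming the Gluster 3.7 
>>> >> snapshot CLI; the /snaps mount convention is taken from the 
>>> >> snapshot docs and worth verifying):
>>> >>
>>> >> gluster snapshot create nightly myvol
>>> >> gluster snapshot activate <generated-snap-name>
>>> >> # snapshot volumes can be mounted read-only via the virtual
>>> >> # /snaps path, then rsynced from:
>>> >> mount -t glusterfs node1:/snaps/<generated-snap-name>/myvol /mnt/snap
>>> >> rsync -a /mnt/snap/ /backup/myvol/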
>>> >>
>>> >> Currently I am only able to do the second option, but the first 
>>>option seems to be the most attractive.
>>> >>
>>> >> Thank you for any help on this.
>>> >>
>>>
>

