<div dir="auto">If the entire gluster volume failed, I&#39;d wipe it, set up a fresh master volume &amp; then copy the VM DR images onto the new volume. To restart each VM after it&#39;s been restored, I&#39;d set up a script to connect to the hypervisor&#39;s API.<div dir="auto"><br></div><div dir="auto">Of course, at the scale you&#39;re speaking of, it could take a fair amount of time before the last VM is restored.</div><div dir="auto">As long as you&#39;ve followed a naming standard, you could easily script in a restore queue based on service priority.</div><div dir="auto"><br></div><div dir="auto">If you need something quicker than that, then you&#39;ve got little choice but to go down the HA-with-a-big-fat-pipe route.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On 23 Mar 2017 18:46, &quot;Gandalf Corvotempesta&quot; &lt;<a href="mailto:gandalf.corvotempesta@gmail.com">gandalf.corvotempesta@gmail.com</a>&gt; wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The problem is not how to back up, but how to restore.<br>

How do you restore a whole cluster made of thousands of VMs ?<br>

<br>

If you move all VMs to shared storage like gluster, you should<br>

consider how to recover everything from a gluster failure.<br>

If you had a bunch of VMs on each server with local disks, you would have to<br>

recover only the VMs affected by a single server failure,<br>

but moving everything to shared storage means being prepared for a<br>

disaster where you *must* restore everything, or hundreds of TB.<br>

<br>

2017-03-23 23:07 GMT+01:00 Gambit15 &lt;<a href="mailto:dougti%2Bgluster@gmail.com">dougti+gluster@gmail.com</a>&gt;:<br>

&gt; Don&#39;t snapshot the entire gluster volume, keep a rolling routine for<br>

&gt; snapshotting the individual VMs &amp; rsync those.<br>

&gt; As already mentioned, you need to &quot;itemize&quot; the backups - trying to manage<br>

&gt; backups for the whole volume as a single unit is just crazy!<br>

&gt;<br>

&gt; Also, for long term backups, maintaining just the core data of each VM is<br>

&gt; far more manageable.<br>

&gt;<br>

&gt; I settled on oVirt for our platform, and do the following...<br>

&gt;<br>

&gt; A cronjob regularly snapshots &amp; clones each VM, whose image is then rsynced<br>

&gt; to our backup storage;<br>

&gt; The backup server snapshots the VM&#39;s image backup volume to maintain<br>

&gt; history/versioning;<br>

&gt; These full images are only maintained for 30 days, for DR purposes;<br>

&gt; A separate routine rsyncs the VM&#39;s core data to its own data backup volume,<br>

&gt; which is snapshotted &amp; maintained for 10 years;<br>

&gt;<br>

&gt; This could be made more efficient by using guestfish to extract the core<br>

&gt; data from the backup image, instead of basically rsyncing the data across the<br>

&gt; network twice.<br>

&gt;<br>

&gt; The active storage layer uses Gluster on top of XFS &amp; LVM. The backup<br>

&gt; storage layer uses a mirrored storage unit running ZFS on FreeNAS.<br>

&gt; This of course doesn&#39;t allow for HA in the case of the entire cloud failing.<br>

&gt; For that we&#39;d use geo-rep &amp; a big fat pipe.<br>

&gt;<br>

&gt; D<br>

&gt;<br>

&gt; On 23 March 2017 at 16:29, Gandalf Corvotempesta<br>

&gt; &lt;<a href="mailto:gandalf.corvotempesta@gmail.com">gandalf.corvotempesta@gmail.<wbr>com</a>&gt; wrote:<br>

&gt;&gt;<br>

&gt;&gt; Yes, but the biggest issue is how to recover.<br>

&gt;&gt; You&#39;ll need to recover the whole storage, not a single snapshot, and this<br>

&gt;&gt; can take days.<br>

&gt;&gt;<br>

&gt;&gt; On 23 Mar 2017 9:24 PM, &quot;Alvin Starr&quot; &lt;<a href="mailto:alvin@netvel.net">alvin@netvel.net</a>&gt; wrote:<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; For volume backups you need something like snapshots.<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; If you take a snapshot A of a live volume L that snapshot stays at that<br>

&gt;&gt;&gt; moment in time and you can rsync that to another system or use something<br>

&gt;&gt;&gt; like <a href="http://deltacp.pl" rel="noreferrer" target="_blank">deltacp.pl</a> to copy it.<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; The usual process is to delete the snapshot once it&#39;s copied and then<br>

&gt;&gt;&gt; repeat the process again when the next backup is required.<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; That process does require rsync/deltacp to read the complete volume on<br>

&gt;&gt;&gt; both systems which can take a long time.<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; I was kicking around the idea to try and handle snapshot deltas better.<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; The idea is that you could take your initial snapshot A then sync that<br>

&gt;&gt;&gt; snapshot to your backup system.<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; At a later point you could take another snapshot B.<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; Because snapshots contain the copies of the original data at the time of<br>

&gt;&gt;&gt; the snapshot and unmodified data points to the Live volume it is possible to<br>

&gt;&gt;&gt; tell what blocks of data have changed since the snapshot was taken.<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; Now that you have a second snapshot you can in essence perform a diff on<br>

&gt;&gt;&gt; the A and B snapshots to get only the blocks that changed up to the time<br>

&gt;&gt;&gt; that B was taken.<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; These blocks could be copied to the backup image and you should have a<br>

&gt;&gt;&gt; clone of the B snapshot.<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; You would not have to read the whole volume image but just the changed<br>

&gt;&gt;&gt; blocks, dramatically improving the speed of the backup.<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; At this point you can delete the A snapshot and promote the B snapshot to<br>

&gt;&gt;&gt; be the A snapshot for the next backup round.<br>
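<br>
The changed-block copy described above could be sketched roughly as follows. This is a minimal illustration, not an existing tool: the function name, the tiny block size and the set of changed block indices are all assumptions, and in practice that set would have to come from the snapshot layer's own metadata, which is not shown here.<br>

```python
import io

BLOCK_SIZE = 4  # hypothetical, tiny block size so the demo is easy to follow


def apply_changed_blocks(snapshot_b, backup, changed, block_size=BLOCK_SIZE):
    """Copy only the listed blocks from snapshot B into the backup image.

    `changed` is an iterable of block indices that differ between snapshot A
    (already on the backup side) and snapshot B. How that list is obtained is
    an assumption left outside this sketch.
    """
    copied = 0
    for index in sorted(changed):
        offset = index * block_size
        snapshot_b.seek(offset)
        data = snapshot_b.read(block_size)
        backup.seek(offset)
        backup.write(data)
        copied += len(data)
    return copied


# Demo with in-memory images: blocks 1 and 3 changed between A and B.
snap_a = b"AAAABBBBCCCCDDDD"
snap_b = b"AAAAXXXXCCCCYYYY"
backup = io.BytesIO(snap_a)  # the backup currently holds a clone of A
copied = apply_changed_blocks(io.BytesIO(snap_b), backup, {1, 3})
```

After the call the backup image is a clone of snapshot B, and only the two changed blocks were read and written rather than the whole image.<br>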

&gt;&gt;&gt;<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; On 03/23/2017 03:53 PM, Gandalf Corvotempesta wrote:<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; Are the backups consistent?<br>

&gt;&gt;&gt; What happens if the header on shard0 is synced referring to some data on<br>

&gt;&gt;&gt; shard450, and by the time rsync reads shard450 this data has been changed by subsequent<br>

&gt;&gt;&gt; writes?<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; The header would be backed up out of sync with respect to the rest of the image.<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; On 23 Mar 2017 8:48 PM, &quot;Joe Julian&quot; &lt;<a href="mailto:joe@julianfamily.org">joe@julianfamily.org</a>&gt; wrote:<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt; The rsync protocol only passes blocks that have actually changed. Raw<br>

&gt;&gt;&gt;&gt; changes fewer bits. You&#39;re right, though, that it still has to check the<br>

&gt;&gt;&gt;&gt; entire file for those changes.<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt; On 03/23/17 12:47, Gandalf Corvotempesta wrote:<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt; Raw or qcow doesn&#39;t change anything about the backup.<br>

&gt;&gt;&gt;&gt; Georep always has to sync the whole file.<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt; Additionally, raw images have far fewer features than qcow.<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt; On 23 Mar 2017 8:40 PM, &quot;Joe Julian&quot; &lt;<a href="mailto:joe@julianfamily.org">joe@julianfamily.org</a>&gt; wrote:<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; I always use raw images. And yes, sharding would also be good.<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; On 03/23/17 12:36, Gandalf Corvotempesta wrote:<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; Georep exposes another problem:<br>

&gt;&gt;&gt;&gt;&gt; When using gluster as storage for VMs, the VM file is saved as qcow.<br>

&gt;&gt;&gt;&gt;&gt; Changes are inside the qcow, thus rsync has to sync the whole file every<br>

&gt;&gt;&gt;&gt;&gt; time.<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; A small workaround would be sharding, as rsync then has to sync only the<br>

&gt;&gt;&gt;&gt;&gt; changed shards, but I don&#39;t think this is a good solution.<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; On 23 Mar 2017 8:33 PM, &quot;Joe Julian&quot; &lt;<a href="mailto:joe@julianfamily.org">joe@julianfamily.org</a>&gt; wrote:<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; In many cases, a full backup set is just not feasible. Georep to the<br>

&gt;&gt;&gt;&gt;&gt;&gt; same or different DC may be an option if the bandwidth can keep up with the<br>

&gt;&gt;&gt;&gt;&gt;&gt; change set. If not, maybe breaking the data up into smaller more manageable<br>

&gt;&gt;&gt;&gt;&gt;&gt; volumes where you only keep a smaller set of critical data and just back<br>

&gt;&gt;&gt;&gt;&gt;&gt; that up. Perhaps an object store (swift?) might handle fault tolerance<br>

&gt;&gt;&gt;&gt;&gt;&gt; distribution better for some workloads.<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; There&#39;s no one right answer.<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; On 03/23/17 12:23, Gandalf Corvotempesta wrote:<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; Backing up from inside each VM doesn&#39;t solve the problem.<br>

&gt;&gt;&gt;&gt;&gt;&gt; If you have to back up 500 VMs you need more than 1 day, and what if<br>

&gt;&gt;&gt;&gt;&gt;&gt; you have to restore the whole gluster storage?<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; How many days do you need to restore 1PB?<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; Probably the only solution would be georep in the same<br>

&gt;&gt;&gt;&gt;&gt;&gt; datacenter/rack with a similar cluster,<br>

&gt;&gt;&gt;&gt;&gt;&gt; ready to become the master storage.<br>

&gt;&gt;&gt;&gt;&gt;&gt; In this case you don&#39;t need to restore anything, as the data is already<br>

&gt;&gt;&gt;&gt;&gt;&gt; there,<br>

&gt;&gt;&gt;&gt;&gt;&gt; only a little bit back in time, but this doubles the TCO.<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; On 23 Mar 2017 6:39 PM, &quot;Serkan Çoban&quot; &lt;<a href="mailto:cobanserkan@gmail.com">cobanserkan@gmail.com</a>&gt; wrote:<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; Assuming a backup window of 12 hours, you need to send data at 25GB/s<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; to the backup solution.<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; Using 10G Ethernet on the hosts, you need at least 25 hosts to handle<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; 25GB/s.<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; You can create an EC gluster cluster that can handle these rates, or<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; you can just back up the valuable data from inside the VMs using open source<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; backup tools like borg, attic, restic, etc.<br>
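<br>
As a rough check of the figures above, the arithmetic can be written out as a short sketch. It assumes decimal units (1PB = 10^15 bytes) and, as a further assumption not stated in the thread, about 1GB/s of usable throughput per 10G Ethernet host (the theoretical line rate is 1.25GB/s).<br>

```python
import math

PB = 10**15  # decimal petabyte in bytes (assumption)
GB = 10**9   # decimal gigabyte in bytes (assumption)


def required_rate_gb_per_s(total_bytes, window_hours):
    """Sustained rate in GB/s needed to move total_bytes within the window."""
    return total_bytes / (window_hours * 3600) / GB


def hosts_needed(rate_gb_per_s, per_host_gb_per_s):
    """Minimum number of hosts, each sustaining per_host_gb_per_s."""
    return math.ceil(rate_gb_per_s / per_host_gb_per_s)


rate = required_rate_gb_per_s(1 * PB, 12)
print(round(rate, 1))          # 23.1, i.e. roughly the 25GB/s quoted above
print(hosts_needed(25, 1.0))   # 25 hosts at an assumed ~1GB/s usable each
print(hosts_needed(25, 1.25))  # 20 hosts at full 10G line rate
```

The same functions apply to the restore direction: with the whole cluster down, the restore window is bounded by the same sustained rate.<br>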

&gt;&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; On Thu, Mar 23, 2017 at 7:48 PM, Gandalf Corvotempesta<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &lt;<a href="mailto:gandalf.corvotempesta@gmail.com">gandalf.corvotempesta@gmail.<wbr>com</a>&gt; wrote:<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; Let&#39;s assume a 1PB storage full of VMs images with each brick over<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; ZFS,<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; replica 3, sharding enabled<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; How do you backup/restore that amount of data?<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; Backing up daily is impossible: you&#39;ll never finish one backup before<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; the following one starts (in other words, you need more than 24<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; hours)<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; Restoring is even worse. You need more than 24 hours with the whole<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; cluster<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; down<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; You can&#39;t rely on ZFS snapshots because of sharding (a snapshot taken<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; from one node is useless without those of all the other nodes holding<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; the same shards), and you still have the same restore speed<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; How do you backup this?<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; Even georep isn&#39;t enough, if you have to restore the whole storage<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; in case<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; of disaster<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; ______________________________<wbr>_________________<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; Gluster-users mailing list<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; <a href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; &gt; <a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/<wbr>mailman/listinfo/gluster-users</a><br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>


&gt;&gt;&gt;<br>


&gt;&gt;&gt;<br>

&gt;&gt;&gt; --<br>

&gt;&gt;&gt; Alvin Starr                   ||   voice: <a href="tel:%28905%29513-7688" value="+19055137688">(905)513-7688</a><br>

&gt;&gt;&gt; Netvel Inc.                   ||   Cell:  <a href="tel:%28416%29806-0133" value="+14168060133">(416)806-0133</a><br>

&gt;&gt;&gt; <a href="mailto:alvin@netvel.net">alvin@netvel.net</a>              ||<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt;<br>


&gt;&gt;<br>

&gt;&gt;<br>


&gt;<br>

&gt;<br>

</blockquote></div></div>
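<br>
The restore queue based on service priority, mentioned in the reply at the top of this message, could be sketched roughly as follows. The naming standard used here (a "pN-" prefix, where a lower N means a higher priority) is purely hypothetical, and `restore_vm` stands in for whatever call copies the image back and restarts the VM through the hypervisor&#39;s API.<br>

```python
import re

# Hypothetical naming standard: "pN-service-nn", lower N restores first.
NAME_PATTERN = re.compile(r"^p(\d+)-(.+)$")


def restore_order(vm_names):
    """Return VM names sorted by priority; names without a prefix go last."""
    def key(name):
        match = NAME_PATTERN.match(name)
        priority = int(match.group(1)) if match else float("inf")
        return (priority, name)
    return sorted(vm_names, key=key)


def restore_all(vm_names, restore_vm):
    """Call restore_vm (a caller-supplied placeholder) on each VM in order."""
    for name in restore_order(vm_names):
        restore_vm(name)


queue = restore_order(
    ["p3-wiki-01", "p1-db-01", "scratch", "p2-web-02", "p1-dns-01"]
)
print(queue)
# ['p1-db-01', 'p1-dns-01', 'p2-web-02', 'p3-wiki-01', 'scratch']
```

Keeping the ordering separate from the restore call makes it easy to dry-run the queue before pointing it at a real hypervisor.<br>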