<div dir="auto">If the entire gluster volume failed, I'd wipe it, set up a fresh master volume & then copy the VM DR images onto the new volume. To restart each VM after it's been restored, I'd set up a script to connect to the hypervisor's API.<div dir="auto"><br></div><div dir="auto">Of course, at the level you're speaking of, it could take a fair amount of time before the last VM is restored.</div><div dir="auto">As long as you've followed a naming standard, you could easily script a restore queue based on service priority.</div><div dir="auto"><br></div><div dir="auto">If you need something quicker than that, then you've got little choice but to go down the HA-with-a-big-fat-pipe route.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On 23 Mar 2017 18:46, "Gandalf Corvotempesta" <<a href="mailto:gandalf.corvotempesta@gmail.com">gandalf.corvotempesta@gmail.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The problem is not how to back up, but how to restore.<br>
How do you restore a whole cluster made of thousands of VMs?<br>
<br>
If you move all VMs to shared storage like gluster, you should<br>
consider how to recover everything after a gluster failure.<br>
If you had a bunch of VMs on each server with local disks, you would<br>
only have to recover the VMs affected by a single server failure,<br>
but moving everything to shared storage means being prepared for a<br>
disaster where you *must* restore everything, or hundreds of TB.<br>
<br>
2017-03-23 23:07 GMT+01:00 Gambit15 <<a href="mailto:dougti%2Bgluster@gmail.com">dougti+gluster@gmail.com</a>>:<br>
> Don't snapshot the entire gluster volume, keep a rolling routine for<br>
> snapshotting the individual VMs & rsync those.<br>
> As already mentioned, you need to "itemize" the backups - trying to manage<br>
> backups for the whole volume as a single unit is just crazy!<br>
><br>
> Also, for long term backups, maintaining just the core data of each VM is<br>
> far more manageable.<br>
><br>
> I settled on oVirt for our platform, and do the following...<br>
><br>
> A cronjob regularly snapshots & clones each VM, whose image is then rsynced<br>
> to our backup storage;<br>
> The backup server snapshots the VM's image backup volume to maintain<br>
> history/versioning;<br>
> These full images are only maintained for 30 days, for DR purposes;<br>
> A separate routine rsyncs the VM's core data to its own data backup volume,<br>
> which is snapshotted & maintained for 10 years;<br>
><br>
> This could be made more efficient by using guestfish to extract the core<br>
> data from the backup image, instead of basically rsyncing the data across the<br>
> network twice.<br>
><br>
> The active storage layer uses Gluster on top of XFS & LVM. The backup<br>
> storage layer uses a mirrored storage unit running ZFS on FreeNAS.<br>
> This of course doesn't allow for HA in the case of the entire cloud failing.<br>
> For that we'd use geo-rep & a big fat pipe.<br>
><br>
> D<br>
><br>
> On 23 March 2017 at 16:29, Gandalf Corvotempesta<br>
> <<a href="mailto:gandalf.corvotempesta@gmail.com">gandalf.corvotempesta@gmail.<wbr>com</a>> wrote:<br>
>><br>
>> Yes, but the biggest issue is how to recover.<br>
>> You'll need to recover the whole storage, not a single snapshot, and this<br>
>> can take days.<br>
>><br>
>> Il 23 mar 2017 9:24 PM, "Alvin Starr" <<a href="mailto:alvin@netvel.net">alvin@netvel.net</a>> ha scritto:<br>
>>><br>
>>> For volume backups you need something like snapshots.<br>
>>><br>
>>> If you take a snapshot A of a live volume L that snapshot stays at that<br>
>>> moment in time and you can rsync that to another system or use something<br>
>>> like <a href="http://deltacp.pl" rel="noreferrer" target="_blank">deltacp.pl</a> to copy it.<br>
>>><br>
>>> The usual process is to delete the snapshot once it's copied and then<br>
>>> repeat the process when the next backup is required.<br>
>>><br>
>>> That process does require rsync/deltacp to read the complete volume on<br>
>>> both systems which can take a long time.<br>
>>><br>
>>> I was kicking around the idea to try and handle snapshot deltas better.<br>
>>><br>
>>> The idea is that you could take your initial snapshot A then sync that<br>
>>> snapshot to your backup system.<br>
>>><br>
>>> At a later point you could take another snapshot B.<br>
>>><br>
>>> Because snapshots contain the copies of the original data at the time of<br>
>>> the snapshot and unmodified data points to the Live volume it is possible to<br>
>>> tell what blocks of data have changed since the snapshot was taken.<br>
>>><br>
>>> Now that you have a second snapshot you can in essence perform a diff on<br>
>>> the A and B snapshots to get only the blocks that changed up to the time<br>
>>> that B was taken.<br>
>>><br>
>>> These blocks could be copied to the backup image and you should have a<br>
>>> clone of the B snapshot.<br>
>>><br>
>>> You would not have to read the whole volume image, just the changed<br>
>>> blocks, dramatically improving the speed of the backup.<br>
>>><br>
>>> At this point you can delete the A snapshot and promote the B snapshot to<br>
>>> be the A snapshot for the next backup round.<br>
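As a rough illustration of the A/B delta idea above: the sketch below (Python; every name in it is made up for this example) finds the blocks that differ between two snapshot images and writes only those into the backup copy. It assumes the snapshots are readable as plain files and compares them block by block, which still reads both snapshots locally. A real implementation would presumably take the changed-block list from the snapshot layer's own metadata instead, so only that lookup step would change.<br>
<pre>
```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size; a real snapshot layer has its own chunk size


def changed_blocks(snap_a, snap_b, block_size=BLOCK_SIZE):
    """Yield (offset, data) for every block of snap_b that differs from snap_a."""
    with open(snap_a, "rb") as a, open(snap_b, "rb") as b:
        offset = 0
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if not block_b:
                break
            if block_a != block_b:
                yield offset, block_b
            offset += len(block_b)


def apply_blocks(backup_image, blocks, final_size):
    """Write the changed blocks into the backup image, then match snapshot B's size."""
    with open(backup_image, "r+b") as out:
        for offset, data in blocks:
            out.seek(offset)
            out.write(data)
        out.truncate(final_size)


def demo():
    """Toy run: the backup starts as a copy of A and should end up equal to B."""
    with tempfile.TemporaryDirectory() as tmp:
        snap_a = os.path.join(tmp, "snap_a.img")
        snap_b = os.path.join(tmp, "snap_b.img")
        backup = os.path.join(tmp, "backup.img")
        original = bytes(BLOCK_SIZE * 4)
        modified = bytearray(original)
        modified[BLOCK_SIZE:BLOCK_SIZE + 3] = b"new"  # touch only the second block
        for path, data in ((snap_a, original), (backup, original), (snap_b, modified)):
            with open(path, "wb") as fh:
                fh.write(data)
        delta = list(changed_blocks(snap_a, snap_b))
        apply_blocks(backup, delta, os.path.getsize(snap_b))
        with open(backup, "rb") as fh:
            restored = fh.read()
        return [offset for offset, _ in delta], restored == bytes(modified)
```
</pre>
Only the changed blocks would need to cross the network; in the toy run inside demo(), that is a single 4 KiB block out of four.<br>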
>>><br>
>>><br>
>>> On 03/23/2017 03:53 PM, Gandalf Corvotempesta wrote:<br>
>>><br>
>>> Are the backups consistent?<br>
>>> What happens if the header on shard0 is synced while referring to some data on<br>
>>> shard450, and by the time rsync reads shard450 that data has been changed by<br>
>>> subsequent writes?<br>
>>><br>
>>> The header would be backed up out of sync with respect to the rest of the image.<br>
>>><br>
>>> Il 23 mar 2017 8:48 PM, "Joe Julian" <<a href="mailto:joe@julianfamily.org">joe@julianfamily.org</a>> ha scritto:<br>
>>>><br>
>>>> The rsync protocol only passes blocks that have actually changed. A raw<br>
>>>> image changes fewer bits. You're right, though, that it still has to check the<br>
>>>> entire file for those changes.<br>
>>>><br>
>>>><br>
>>>> On 03/23/17 12:47, Gandalf Corvotempesta wrote:<br>
>>>><br>
>>>> Raw or qcow doesn't change anything about the backup.<br>
>>>> Georep always has to sync the whole file.<br>
>>>><br>
>>>> Additionally, raw images have far fewer features than qcow.<br>
>>>><br>
>>>> Il 23 mar 2017 8:40 PM, "Joe Julian" <<a href="mailto:joe@julianfamily.org">joe@julianfamily.org</a>> ha scritto:<br>
>>>>><br>
>>>>> I always use raw images. And yes, sharding would also be good.<br>
>>>>><br>
>>>>><br>
>>>>> On 03/23/17 12:36, Gandalf Corvotempesta wrote:<br>
>>>>><br>
>>>>> Georep exposes another problem:<br>
>>>>> When using gluster as storage for VMs, the VM file is saved as qcow.<br>
>>>>> Changes happen inside the qcow, so rsync has to sync the whole file every<br>
>>>>> time.<br>
>>>>><br>
>>>>> A partial workaround would be sharding, as rsync then has to sync only the<br>
>>>>> changed shards, but I don't think this is a good solution.<br>
>>>>><br>
>>>>> Il 23 mar 2017 8:33 PM, "Joe Julian" <<a href="mailto:joe@julianfamily.org">joe@julianfamily.org</a>> ha scritto:<br>
>>>>>><br>
>>>>>> In many cases, a full backup set is just not feasible. Georep to the<br>
>>>>>> same or different DC may be an option if the bandwidth can keep up with the<br>
>>>>>> change set. If not, maybe breaking the data up into smaller more manageable<br>
>>>>>> volumes where you only keep a smaller set of critical data and just back<br>
>>>>>> that up. Perhaps an object store (swift?) might handle fault tolerance<br>
>>>>>> distribution better for some workloads.<br>
>>>>>><br>
>>>>>> There's no one right answer.<br>
>>>>>><br>
>>>>>><br>
>>>>>> On 03/23/17 12:23, Gandalf Corvotempesta wrote:<br>
>>>>>><br>
>>>>>> Backing up from inside each VM doesn't solve the problem.<br>
>>>>>> If you have to back up 500 VMs you need more than 1 day, and what if<br>
>>>>>> you have to restore the whole gluster storage?<br>
>>>>>><br>
>>>>>> How many days do you need to restore 1PB?<br>
>>>>>><br>
>>>>>> Probably the only solution is a georep in the same<br>
>>>>>> datacenter/rack to a similar cluster,<br>
>>>>>> ready to become the master storage.<br>
>>>>>> In this case you don't need to restore anything, as the data is already<br>
>>>>>> there,<br>
>>>>>> only a little bit back in time, but this doubles the TCO.<br>
>>>>>><br>
>>>>>> Il 23 mar 2017 6:39 PM, "Serkan Çoban" <<a href="mailto:cobanserkan@gmail.com">cobanserkan@gmail.com</a>> ha<br>
>>>>>> scritto:<br>
>>>>>>><br>
>>>>>>> Assuming a backup window of 12 hours, you need to send data at 25GB/s<br>
>>>>>>> to the backup solution.<br>
>>>>>>> Using 10G Ethernet on the hosts, you need at least 25 hosts to handle<br>
>>>>>>> 25GB/s.<br>
>>>>>>> You can create an EC gluster cluster that can handle these rates, or<br>
>>>>>>> you can just back up the valuable data from inside the VMs using open source<br>
>>>>>>> backup<br>
>>>>>>> tools like borg, attic, restic, etc.<br>
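The arithmetic behind those figures can be checked with a few lines of Python. This is only a sketch: it assumes a decimal petabyte (10^15 bytes), and the 1 GB/s of usable throughput per 10G host is an assumption added here to show where a figure of 25 hosts could come from, not something stated in the thread.<br>
<pre>
```python
import math


def required_rate_gb_per_s(total_bytes, window_hours):
    """Sustained throughput (decimal GB/s) needed to move total_bytes in window_hours."""
    return total_bytes / (window_hours * 3600) / 1e9


def hosts_needed(rate_gb_per_s, per_host_gb_per_s):
    """Smallest whole number of hosts whose combined throughput covers the rate."""
    return math.ceil(rate_gb_per_s / per_host_gb_per_s)


PB = 10**15                              # decimal petabyte (assumption)
rate = required_rate_gb_per_s(PB, 12)    # 1 PB in a 12-hour window
line_rate = 10 / 8                       # 10 Gbit/s is 1.25 GB/s at line rate

print(round(rate, 1))                    # 23.1, rounded up to 25 GB/s in the thread
print(hosts_needed(rate, line_rate))     # 19 hosts at full line rate
print(hosts_needed(25, 1.0))             # 25 hosts at an assumed 1 GB/s usable per host
```
</pre>
The same function answers the restore question by swapping the window: at a fixed aggregate rate, the restore time grows linearly with the amount of data.<br>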
>>>>>>><br>
>>>>>>> On Thu, Mar 23, 2017 at 7:48 PM, Gandalf Corvotempesta<br>
>>>>>>> <<a href="mailto:gandalf.corvotempesta@gmail.com">gandalf.corvotempesta@gmail.<wbr>com</a>> wrote:<br>
>>>>>>> > Let's assume 1PB of storage full of VM images, with each brick on<br>
>>>>>>> > ZFS,<br>
>>>>>>> > replica 3, sharding enabled.<br>
>>>>>>> ><br>
>>>>>>> > How do you backup/restore that amount of data?<br>
>>>>>>> ><br>
>>>>>>> > Backing up daily is impossible: you'll never finish one backup before<br>
>>>>>>> > the<br>
>>>>>>> > following one starts (in other words, you need more than 24<br>
>>>>>>> > hours).<br>
>>>>>>> ><br>
>>>>>>> > Restoring is even worse. You need more than 24 hours with the whole<br>
>>>>>>> > cluster<br>
>>>>>>> > down.<br>
>>>>>>> ><br>
>>>>>>> > You can't rely on ZFS snapshots due to sharding (a snapshot taken<br>
>>>>>>> > from one<br>
>>>>>>> > node is useless without the data from all the other nodes related to the same shard)<br>
>>>>>>> > and you<br>
>>>>>>> > still have the same restore speed.<br>
>>>>>>> ><br>
>>>>>>> > How do you back this up?<br>
>>>>>>> ><br>
>>>>>>> > Even georep isn't enough if you have to restore the whole storage<br>
>>>>>>> > in case<br>
>>>>>>> > of disaster.<br>
>>>>>>> ><br>
>>>>>>> > ______________________________<wbr>_________________<br>
>>>>>>> > Gluster-users mailing list<br>
>>>>>>> > <a href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>
>>>>>>> > <a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/<wbr>mailman/listinfo/gluster-users</a><br>
>>>>>><br>
>>><br>
>>><br>
>>> --<br>
>>> Alvin Starr || voice: <a href="tel:%28905%29513-7688" value="+19055137688">(905)513-7688</a><br>
>>> Netvel Inc. || Cell: <a href="tel:%28416%29806-0133" value="+14168060133">(416)806-0133</a><br>
>>> <a href="mailto:alvin@netvel.net">alvin@netvel.net</a> ||<br>
>>><br>
>><br>
><br>
><br>
</blockquote></div></div>
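<div dir="auto">The restore queue based on service priority, mentioned at the top of this reply, could look something like the sketch below. It assumes a purely hypothetical naming standard in which VM names start with "p" plus a priority number (p1 restored first); the names and the pattern are illustrative only, and the actual copy and restart steps are left out because they depend on the hypervisor's API.</div>
<pre>
```python
import re

# Hypothetical naming standard: "pN-service-index", e.g. "p1-db-01".
NAME_PATTERN = re.compile(r"^p(\d+)-")


def restore_priority(vm_name):
    """Return the numeric priority encoded in the name; unmatched names go last."""
    match = NAME_PATTERN.match(vm_name)
    return int(match.group(1)) if match else float("inf")


def build_restore_queue(vm_names):
    """Order VMs by priority, then alphabetically within the same priority."""
    return sorted(vm_names, key=lambda name: (restore_priority(name), name))


queue = build_restore_queue(
    ["p3-wiki-01", "p1-db-02", "legacy-box", "p1-db-01", "p2-web-01"]
)
print(queue)
# ['p1-db-01', 'p1-db-02', 'p2-web-01', 'p3-wiki-01', 'legacy-box']
```
</pre>
<div dir="auto">Sending names that don't follow the standard to the back of the queue, rather than failing, keeps a single misnamed VM from blocking the whole restore.</div>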