<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p><tt>That's true, and it can last much longer than days.</tt></p>
<p><tt>I have a client with data-sets that take months to copy,
and they are not the biggest data user in the world.</tt></p>
<p><tt><br>
The biggest problem with backups is that some day you may need
to restore them.</tt></p>
<p><tt></tt><br>
</p>
<br>
<div class="moz-cite-prefix">On 03/23/2017 04:29 PM, Gandalf
Corvotempesta wrote:<br>
</div>
<blockquote
cite="mid:CAJH6TXgeFM_s_LfupqstAC0oh9kHkWckyUtjhmnsN3ekmdnGEA@mail.gmail.com"
type="cite">
<div dir="auto">Yes but the biggest issue is how to recover
<div dir="auto">You'll need to recover the whole storage not a
single snapshot and this can last for days</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">Il 23 mar 2017 9:24 PM, "Alvin Starr"
<<a moz-do-not-send="true" href="mailto:alvin@netvel.net">alvin@netvel.net</a>>
ha scritto:<br type="attribution">
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<p><tt>For volume backups you need something like
snapshots.</tt></p>
<p><tt>If you take a snapshot A of a live volume L, that
snapshot stays fixed at that moment in time, and you can
rsync it to another system or use something like <a
moz-do-not-send="true" href="http://deltacp.pl"
target="_blank">deltacp.pl</a> to copy it.</tt></p>
<p><tt>The usual process is to delete the snapshot once
it is copied and then repeat the process when the
next backup is required.</tt></p>
<p><tt>That process does require rsync/deltacp to read the
complete volume on both systems, which can take a long
time.<br>
</tt></p>
<p><tt>I have been kicking around an idea for handling
snapshot deltas better.</tt></p>
<p><tt>The idea is that you could take your initial
snapshot A and then sync that snapshot to your backup
system.</tt></p>
<p><tt>At a later point you could take another snapshot B.</tt></p>
<p><tt>Because a snapshot contains copies of the
original data as it was at the time of the snapshot,
while unmodified data still points to the live volume,
it is possible to tell which blocks of data have changed
since the snapshot was taken.</tt></p>
<p><tt>Now that you have a second snapshot, you can in
essence perform a diff of the A and B snapshots to get
only the blocks that changed up to the time that B was
taken.</tt></p>
<p><tt>Those blocks could be copied to the backup image,
and you would then have a clone of the B snapshot.</tt></p>
<p><tt>You would not have to read the whole volume image,
only the changed blocks, dramatically improving the
speed of the backup.<br>
</tt></p>
<p><tt>At this point you can delete the A snapshot and
promote the B snapshot to be the A snapshot for the
next backup round.<br>
</tt></p>
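<p><tt>A minimal sketch of that diff-and-apply step (the 4 MiB
extent size, the device paths, and the changed_extents() helper
are made up for illustration; a real version would consult the
snapshot's copy-on-write metadata rather than scanning both
snapshots end to end):</tt></p>
<pre>
#!/usr/bin/env python3
# Sketch only: update a backup image that already matches snapshot A
# by copying over just the extents that differ in the newer snapshot B.

EXTENT = 4 * 1024 * 1024  # illustrative 4 MiB extent size


def changed_extents(snap_a, snap_b):
    """Brute-force fallback: compare the two snapshots extent by
    extent.  A real implementation would read the snapshot metadata
    instead of scanning both volumes."""
    extents = []
    with open(snap_a, "rb") as a, open(snap_b, "rb") as b:
        offset = 0
        while True:
            ea, eb = a.read(EXTENT), b.read(EXTENT)
            if not eb:
                break
            if ea != eb:
                extents.append((offset, len(eb)))
            offset += len(eb)
    return extents


def apply_delta(snap_b, backup_img, extents):
    """Copy only the changed extents from snapshot B into the
    backup image, which already matches snapshot A."""
    with open(snap_b, "rb") as src, open(backup_img, "r+b") as dst:
        for offset, length in extents:
            src.seek(offset)
            dst.seek(offset)
            dst.write(src.read(length))


if __name__ == "__main__":
    deltas = changed_extents("/dev/vg0/snapA", "/dev/vg0/snapB")
    apply_delta("/dev/vg0/snapB", "/backup/volume.img", deltas)
    # afterwards, delete snapshot A and promote B as the new baseline
</pre>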
<br>
<div class="m_8694824072006468141moz-cite-prefix">On
03/23/2017 03:53 PM, Gandalf Corvotempesta wrote:<br>
</div>
<blockquote type="cite">
<div dir="auto">Are backup consistent?
<div dir="auto">What happens if the header on shard0
is synced referring to some data on shard450 and
when rsync parse shard450 this data is changed by
subsequent writes?</div>
<div dir="auto"><br>
</div>
<div dir="auto">Header would be backupped of sync
respect the rest of the image</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">Il 23 mar 2017 8:48 PM, "Joe
Julian" <<a moz-do-not-send="true"
href="mailto:joe@julianfamily.org" target="_blank">joe@julianfamily.org</a>>
ha scritto:<br type="attribution">
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<p>The rsync protocol only transfers the blocks that
have actually changed, and raw changes fewer bits
than qcow. You're right, though, that it still has to
check the entire file for those changes.<br>
</p>
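<p>(For example, something like "rsync -a --inplace
--no-whole-file" on a raw image keeps the delta-transfer
algorithm in play and rewrites only the changed regions of
the destination file, but rsync still reads the whole file
on both ends to find them.)</p>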
<br>
<div
class="m_8694824072006468141m_2071367206087675765moz-cite-prefix">On
03/23/17 12:47, Gandalf Corvotempesta wrote:<br>
</div>
<blockquote type="cite">
<div dir="auto">Raw or qcow doesn't change
anything about the backup.
<div dir="auto">Georep always have to sync
the whole file</div>
<div dir="auto"><br>
</div>
<div dir="auto">Additionally, raw images has
much less features than qcow</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">Il 23 mar 2017 8:40
PM, "Joe Julian" <<a
moz-do-not-send="true"
href="mailto:joe@julianfamily.org"
target="_blank">joe@julianfamily.org</a>>
ha scritto:<br type="attribution">
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<p>I always use raw images. And yes,
sharding would also be good.<br>
</p>
<br>
<div
class="m_8694824072006468141m_2071367206087675765m_-8052554343169692798moz-cite-prefix">On
03/23/17 12:36, Gandalf
Corvotempesta wrote:<br>
</div>
<blockquote type="cite">
<div dir="auto">Georep expose to
another problem:
<div dir="auto">When using gluster
as storage for VM, the VM file
is saved as qcow. Changes are
inside the qcow, thus rsync has
to sync the whole file every
time</div>
<div dir="auto"><br>
</div>
<div dir="auto">A little
workaround would be sharding, as
rsync has to sync only the
changed shards, but I don't
think this is a good solution</div>
</div>
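<div dir="auto">(As a rough illustration: with the default 64 MB
shard size, a 100 GB image becomes about 1600 shard files, and
only the shards that were actually touched need to be re-synced.)</div>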
<div class="gmail_extra"><br>
<div class="gmail_quote">Il 23 mar
2017 8:33 PM, "Joe Julian" <<a
moz-do-not-send="true"
href="mailto:joe@julianfamily.org"
target="_blank">joe@julianfamily.org</a>>
wrote:<br
type="attribution">
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div bgcolor="#FFFFFF"
text="#000000">
<p>In many cases, a full
backup set is just not
feasible. Georep to the
same or a different DC may
be an option if the
bandwidth can keep up with
the change set. If not,
maybe break the data up
into smaller, more
manageable volumes where
you keep only a smaller
set of critical data and
just back that up. Perhaps
an object store (swift?)
might handle fault
tolerance and distribution
better for some workloads.</p>
<p>There's no one right
answer.</p>
<br>
<div
class="m_8694824072006468141m_2071367206087675765m_-8052554343169692798m_-3599642909736746536moz-cite-prefix">On
03/23/17 12:23, Gandalf
Corvotempesta wrote:<br>
</div>
<blockquote type="cite">
<div dir="auto">Backing up
from inside each VM
doesn't solve the
problem
<div dir="auto">If you
have to backup 500VMs
you just need more
than 1 day and what if
you have to restore
the whole gluster
storage?</div>
<div dir="auto"><br>
</div>
<div dir="auto">How many
days do you need to
restore 1PB?</div>
<div dir="auto"><br>
</div>
<div dir="auto">Probably
the only solution
should be a georep in
the same
datacenter/rack with a
similiar cluster, </div>
<div dir="auto">ready to
became the master
storage.</div>
<div dir="auto">In this
case you don't need to
restore anything as
data are already
there, </div>
<div dir="auto">only a
little bit back in
time but this double
the TCO</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">Il
23 mar 2017 6:39 PM,
"Serkan Çoban" <<a
moz-do-not-send="true" href="mailto:cobanserkan@gmail.com"
target="_blank">cobanserkan@gmail.com</a>>
wrote:<br
type="attribution">
<blockquote
class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">Assuming
a backup window of
12 hours, you need
to send data at
25GB/s<br>
to backup solution.<br>
Using 10G Ethernet
on hosts you need at
least 25 host to
handle 25GB/s.<br>
You can create an EC
gluster cluster that
can handle this
rates, or<br>
you just backup
valuable data from
inside VMs using
open source backup<br>
tools like
borg,attic,restic ,
etc...<br>
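(Rough arithmetic: 1 PB over a
12-hour window is about 23 GB/s;
a 10GbE link carries at most
~1.25 GB/s, closer to ~1 GB/s in
practice, hence roughly 25 hosts.)<br>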
<br>
On Thu, Mar 23, 2017
at 7:48 PM, Gandalf
Corvotempesta<br>
<<a
moz-do-not-send="true"
href="mailto:gandalf.corvotempesta@gmail.com" target="_blank">gandalf.corvotempesta@gmail.c<wbr>om</a>>
wrote:<br>
> Let's assume 1PB of
storage full of VM images,
with each brick on ZFS,<br>
> replica 3,
sharding enabled.<br>
><br>
> How do you
backup/restore that
amount of data?<br>
><br>
> Backing up
daily is impossible:
you'll never finish
one backup before the<br>
> following one
starts (in other words,
you need more than 24
hours).<br>
><br>
> Restoring is
even worse. You need
more than 24 hours
with the whole
cluster<br>
> down.<br>
><br>
> You can't rely
on ZFS snapshots due
to sharding (a
snapshot taken on one<br>
> node is useless
without all the other
nodes holding the
related shards), and you<br>
> still have the
same restore speed.<br>
><br>
> How do you
back this up?<br>
><br>
> Even georep
isn't enough if you
have to restore the
whole storage in
case<br>
> of disaster.<br>
</blockquote>
</div>
</div>
<br>
</blockquote>
</div>
</blockquote></div></div>
</blockquote>
</div></blockquote></div></div>
</blockquote>
</div></blockquote></div></div>
</blockquote>
<pre class="m_8694824072006468141moz-signature" cols="72">--
Alvin Starr || voice: <a moz-do-not-send="true" href="tel:%28905%29%20513-7688" value="+19055137688" target="_blank">(905)513-7688</a>
Netvel Inc. || Cell: <a moz-do-not-send="true" href="tel:%28416%29%20806-0133" value="+14168060133" target="_blank">(416)806-0133</a>
<a moz-do-not-send="true" class="m_8694824072006468141moz-txt-link-abbreviated" href="mailto:alvin@netvel.net" target="_blank">alvin@netvel.net</a> ||
</pre></div>
</blockquote></div></div>
</blockquote>
<pre class="moz-signature" cols="72">--
Alvin Starr || voice: (905)513-7688
Netvel Inc. || Cell: (416)806-0133
<a class="moz-txt-link-abbreviated" href="mailto:alvin@netvel.net">alvin@netvel.net</a> ||
</pre></body></html>