<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <p><tt>That's true and it can last much longer than days.</tt></p>
    <p><tt>I have a client with some data sets that take months to copy,
        and they are not the biggest data user in the world.</tt></p>
    <p><tt><br>
        The biggest problem with backups is that some day you may need
        to restore them.</tt></p>
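    <p><tt>For a rough sense of scale for the 1PB example quoted further
        down, here is a back-of-the-envelope sketch in Python. It assumes
        decimal units (1PB = 10^15 bytes), links running at full line
        rate with no protocol overhead, and nothing else being the
        bottleneck. The function names are made up for illustration.</tt></p>
<pre>
```python
PB = 1e15  # assumption: decimal petabyte


def required_rate_gb_per_s(data_bytes: float, window_hours: float) -> float:
    """Sustained rate in GB/s needed to move data_bytes within window_hours."""
    return data_bytes / (window_hours * 3600) / 1e9


def transfer_hours(data_bytes: float, links: int, link_gbit_per_s: float) -> float:
    """Hours needed to move data_bytes over `links` parallel links at line rate."""
    bytes_per_s = links * link_gbit_per_s * 1e9 / 8  # 8 bits per byte
    return data_bytes / bytes_per_s / 3600


if __name__ == "__main__":
    print(f"rate for a 12 h window: {required_rate_gb_per_s(PB, 12):.1f} GB/s")
    print(f"1 x 10G link:   {transfer_hours(PB, 1, 10) / 24:.1f} days")
    print(f"25 x 10G links: {transfer_hours(PB, 25, 10):.1f} hours")
```
</pre>
    <p><tt>Under those assumptions a single 10G link needs a bit more than
        nine days to move 1PB, and 25 such links need roughly nine hours.
        Real restores will be slower once disks, metadata and protocol
        overhead get involved.</tt></p>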
    <p><tt></tt><br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 03/23/2017 04:29 PM, Gandalf
      Corvotempesta wrote:<br>
    </div>
    <blockquote
cite="mid:CAJH6TXgeFM_s_LfupqstAC0oh9kHkWckyUtjhmnsN3ekmdnGEA@mail.gmail.com"
      type="cite">
      <div dir="auto">Yes, but the biggest issue is how to recover.
        <div dir="auto">You'll need to recover the whole storage, not a
          single snapshot, and this can take days.</div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On 23 Mar 2017 9:24 PM, "Alvin Starr"
          &lt;<a moz-do-not-send="true" href="mailto:alvin@netvel.net">alvin@netvel.net</a>&gt;
          wrote:<br type="attribution">
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div bgcolor="#FFFFFF" text="#000000">
              <p><tt>For volume backups you need something like
                  snapshots.</tt></p>
              <p><tt>If you take a snapshot A of a live volume L, that
                  snapshot stays fixed at that moment in time, and you can
                  rsync it to another system or use something like <a
                    moz-do-not-send="true" href="http://deltacp.pl"
                    target="_blank">deltacp.pl</a> to copy it.</tt></p>
              <p><tt>The usual process is to delete the snapshot once
                  it's copied and then repeat the process when the
                  next backup is required.</tt></p>
              <p><tt>That process does require rsync/deltacp to read the
                  complete volume on both systems, which can take a long
                  time.<br>
                </tt></p>
              <p><tt>I was kicking around the idea to try and handle
                  snapshot deltas better.</tt></p>
              <p><tt>The idea is that you could take your initial
                  snapshot A, then sync that snapshot to your backup
                  system.</tt></p>
              <p><tt>At a later point you could take another snapshot B.</tt></p>
              <p><tt>Because a snapshot holds copies of the
                  original data as of the time of the snapshot, while
                  unmodified data still points to the live volume, it is
                  possible to tell which blocks of data have changed
                  since the snapshot was taken.</tt></p>
              <p><tt>Now that you have a second snapshot, you can in
                  essence perform a diff on the A and B snapshots to get
                  only the blocks that changed up to the time that B was
                  taken.</tt></p>
              <p><tt>These blocks could be copied to the backup image
                  and you should have a clone of the B snapshot.</tt></p>
              <p><tt>You would not have to read the whole volume image,
                  just the changed blocks, dramatically improving the
                  speed of the backup.<br>
                </tt></p>
              <p><tt>At this point you can delete the A snapshot and
                  promote the B snapshot to be the A snapshot for the
                  next backup round.<br>
                </tt></p>
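              <p><tt>A minimal Python sketch of the copy step described
                  above. It assumes the list of changed block numbers has
                  already been extracted from the snapshot metadata,
                  which is the hard part and depends on the snapshot
                  implementation. The function name, the in-memory
                  images and the tiny block size are purely
                  illustrative.</tt></p>
<pre>
```python
def copy_changed_blocks(backup: bytearray, snapshot_b: bytes,
                        changed_blocks, block_size: int) -> int:
    """Copy only the listed blocks from snapshot B into the backup image.

    Returns the number of blocks copied.
    """
    copied = 0
    for block_no in sorted(set(changed_blocks)):
        start = block_no * block_size
        end = start + block_size
        # Overwrite just this block; untouched blocks are never read.
        backup[start:end] = snapshot_b[start:end]
        copied += 1
    return copied


if __name__ == "__main__":
    block_size = 4
    snapshot_a = b"AAAABBBBCCCCDDDD"  # what the backup already holds
    snapshot_b = b"AAAAXXXXCCCCYYYY"  # blocks 1 and 3 changed since A
    backup = bytearray(snapshot_a)    # hypothetical backup image
    copied = copy_changed_blocks(backup, snapshot_b, [1, 3], block_size)
    print(copied, bytes(backup) == snapshot_b)
```
</pre>
              <p><tt>In this toy run only two of the four blocks are
                  copied, and the backup image ends up identical to
                  snapshot B.</tt></p>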
              <br>
              <div class="m_8694824072006468141moz-cite-prefix">On
                03/23/2017 03:53 PM, Gandalf Corvotempesta wrote:<br>
              </div>
              <blockquote type="cite">
                <div dir="auto">Are the backups consistent?
                  <div dir="auto">What happens if the header on shard0
                    is synced while referring to some data on shard450, and
                    by the time rsync processes shard450 that data has been
                    changed by subsequent writes?</div>
                  <div dir="auto"><br>
                  </div>
                  <div dir="auto">The header would be backed up out of sync
                    with respect to the rest of the image.</div>
                </div>
                <div class="gmail_extra"><br>
                  <div class="gmail_quote">On 23 Mar 2017 8:48 PM, "Joe
                    Julian" &lt;<a moz-do-not-send="true"
                      href="mailto:joe@julianfamily.org" target="_blank">joe@julianfamily.org</a>&gt;
                    wrote:<br type="attribution">
                    <blockquote class="gmail_quote" style="margin:0 0 0
                      .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      <div bgcolor="#FFFFFF" text="#000000">
                        <p>The rsync protocol only passes blocks that
                          have actually changed. With raw images, fewer bits change.
                          You're right, though, that it still has to
                          check the entire file for those changes.<br>
                        </p>
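                        <p>To illustrate that point, here is a deliberately
                          simplified Python sketch. It is not rsync's actual
                          algorithm (rsync uses rolling checksums so it can
                          match blocks at any offset); this version only
                          compares fixed-size blocks at the same offsets, and
                          the function names are made up. It does show why
                          the whole file is still read even when little is
                          sent.</p>
<pre>
```python
import hashlib


def block_digests(data: bytes, block_size: int) -> list:
    """Hash every fixed-size block; note that the whole input is read."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def blocks_to_send(source: bytes, target: bytes, block_size: int) -> list:
    """Indexes of source blocks whose digest differs from the target's."""
    src = block_digests(source, block_size)
    dst = block_digests(target, block_size)
    # A block is sent if the target has no block there or the digests differ.
    return [i for i, digest in enumerate(src)
            if i >= len(dst) or digest != dst[i]]


if __name__ == "__main__":
    print(blocks_to_send(b"AAAABBBBCCCC", b"AAAAXXXXCCCC", 4))
```
</pre>
                        <p>Both sides hash every block, but only the blocks
                          that differ would need to cross the wire.</p>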
                        <br>
                        <div
                          class="m_8694824072006468141m_2071367206087675765moz-cite-prefix">On
                          03/23/17 12:47, Gandalf Corvotempesta wrote:<br>
                        </div>
                        <blockquote type="cite">
                          <div dir="auto">Raw or qcow doesn't change
                            anything about the backup.
                            <div dir="auto">Georep always has to sync
                              the whole file.</div>
                            <div dir="auto"><br>
                            </div>
                            <div dir="auto">Additionally, raw images have
                              far fewer features than qcow.</div>
                          </div>
                          <div class="gmail_extra"><br>
                            <div class="gmail_quote">On 23 Mar 2017 8:40
                              PM, "Joe Julian" &lt;<a
                                moz-do-not-send="true"
                                href="mailto:joe@julianfamily.org"
                                target="_blank">joe@julianfamily.org</a>&gt;
                              wrote:<br type="attribution">
                              <blockquote class="gmail_quote"
                                style="margin:0 0 0 .8ex;border-left:1px
                                #ccc solid;padding-left:1ex">
                                <div bgcolor="#FFFFFF" text="#000000">
                                  <p>I always use raw images. And yes,
                                    sharding would also be good.<br>
                                  </p>
                                  <br>
                                  <div
class="m_8694824072006468141m_2071367206087675765m_-8052554343169692798moz-cite-prefix">On
                                    03/23/17 12:36, Gandalf
                                    Corvotempesta wrote:<br>
                                  </div>
                                  <blockquote type="cite">
                                    <div dir="auto">Georep exposes
                                      another problem:
                                      <div dir="auto">When using gluster
                                        as storage for VMs, the VM image
                                        is saved as qcow. Changes are
                                        inside the qcow file, so rsync has
                                        to sync the whole file every
                                        time.</div>
                                      <div dir="auto"><br>
                                      </div>
                                      <div dir="auto">A small
                                        workaround would be sharding, as
                                        rsync then has to sync only the
                                        changed shards, but I don't
                                        think this is a good solution.</div>
                                    </div>
                                    <div class="gmail_extra"><br>
                                      <div class="gmail_quote">On 23 Mar
                                        2017 8:33 PM, "Joe Julian" &lt;<a
                                          moz-do-not-send="true"
                                          href="mailto:joe@julianfamily.org"
                                          target="_blank">joe@julianfamily.org</a>&gt;
                                        wrote:<br
                                          type="attribution">
                                        <blockquote class="gmail_quote"
                                          style="margin:0 0 0
                                          .8ex;border-left:1px #ccc
                                          solid;padding-left:1ex">
                                          <div bgcolor="#FFFFFF"
                                            text="#000000">
                                            <p>In many cases, a full
                                              backup set is just not
                                              feasible. Georep to the
                                              same or a different DC may
                                              be an option if the
                                              bandwidth can keep up with
                                              the change set. If not,
                                              maybe break the data up
                                              into smaller, more
                                              manageable volumes, where
                                              you keep only a smaller
                                              set of critical data and
                                              just back that up. Perhaps
                                              an object store (swift?)
                                              might handle fault
                                              tolerance distribution
                                              better for some workloads.</p>
                                            <p>There's no one right
                                              answer.</p>
                                            <br>
                                            <div
class="m_8694824072006468141m_2071367206087675765m_-8052554343169692798m_-3599642909736746536moz-cite-prefix">On
                                              03/23/17 12:23, Gandalf
                                              Corvotempesta wrote:<br>
                                            </div>
                                            <blockquote type="cite">
                                              <div dir="auto">Backing up
                                                from inside each VM
                                                doesn't solve the
                                                problem.
                                                <div dir="auto">If you
                                                  have to back up 500 VMs,
                                                  you need more
                                                  than 1 day, and what if
                                                  you have to restore
                                                  the whole gluster
                                                  storage?</div>
                                                <div dir="auto"><br>
                                                </div>
                                                <div dir="auto">How many
                                                  days do you need to
                                                  restore 1PB?</div>
                                                <div dir="auto"><br>
                                                </div>
                                                <div dir="auto">Probably
                                                  the only solution
                                                  would be a georep in
                                                  the same
                                                  datacenter/rack with a
                                                  similar cluster, </div>
                                                <div dir="auto">ready to
                                                  become the master
                                                  storage.</div>
                                                <div dir="auto">In this
                                                  case you don't need to
                                                  restore anything, as
                                                  the data is already
                                                  there, </div>
                                                <div dir="auto">only a
                                                  little bit back in
                                                  time, but this doubles
                                                  the TCO.</div>
                                              </div>
                                              <div class="gmail_extra"><br>
                                                <div class="gmail_quote">On
                                                  23 Mar 2017 6:39 PM,
                                                  "Serkan Çoban" &lt;<a
moz-do-not-send="true" href="mailto:cobanserkan@gmail.com"
                                                    target="_blank">cobanserkan@gmail.com</a>&gt;
                                                  wrote:<br
                                                    type="attribution">
                                                  <blockquote
                                                    class="gmail_quote"
                                                    style="margin:0 0 0
                                                    .8ex;border-left:1px
                                                    #ccc
                                                    solid;padding-left:1ex">Assuming
                                                    a backup window of
                                                    12 hours, you need
                                                    to send data at
                                                    25GB/s<br>
                                                    to the backup solution.<br>
                                                    Using 10G Ethernet
                                                    on the hosts, you need at
                                                    least 25 hosts to
                                                    handle 25GB/s.<br>
                                                    You can create an EC
                                                    gluster cluster that
                                                    can handle these
                                                    rates, or<br>
                                                    you can just back up
                                                    valuable data from
                                                    inside the VMs using
                                                    open source backup<br>
                                                    tools like
                                                    borg, attic, restic,
                                                    etc.<br>
                                                    <br>
                                                    On Thu, Mar 23, 2017
                                                    at 7:48 PM, Gandalf
                                                    Corvotempesta<br>
                                                    &lt;<a
                                                      moz-do-not-send="true"
href="mailto:gandalf.corvotempesta@gmail.com" target="_blank">gandalf.corvotempesta@gmail.c<wbr>om</a>&gt;
                                                    wrote:<br>
                                                    &gt; Let's assume
                                                    1PB of storage full of
                                                    VM images, with each
                                                    brick on ZFS,<br>
                                                    &gt; replica 3,
                                                    sharding enabled.<br>
                                                    &gt;<br>
                                                    &gt; How do you
                                                    back up/restore that
                                                    amount of data?<br>
                                                    &gt;<br>
                                                    &gt; Backing up
                                                    daily is impossible:
                                                    you'll never finish
                                                    one backup before the<br>
                                                    &gt; following one
                                                    starts (in
                                                    other words, you
                                                    need more than 24
                                                    hours).<br>
                                                    &gt;<br>
                                                    &gt; Restoring is
                                                    even worse. You need
                                                    more than 24 hours
                                                    with the whole
                                                    cluster<br>
                                                    &gt; down.<br>
                                                    &gt;<br>
                                                    &gt; You can't rely
                                                    on ZFS snapshots due
                                                    to sharding (a
                                                    snapshot taken from
                                                    one<br>
                                                    &gt; node is useless
                                                    without the other
                                                    nodes' snapshots of the
                                                    related shards), and you<br>
                                                    &gt; still have the
                                                    same restore speed.<br>
                                                    &gt;<br>
                                                    &gt; How do you
                                                    back this up?<br>
                                                    &gt;<br>
                                                    &gt; Even georep
                                                    isn't enough if you
                                                    have to restore the
                                                    whole storage in
                                                    case<br>
                                                    &gt; of disaster.<br>
                                                    &gt;<br>
                                                    &gt;
                                                    ______________________________<wbr>_________________<br>
                                                    &gt; Gluster-users
                                                    mailing list<br>
                                                    &gt; <a
                                                      moz-do-not-send="true"
href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
                                                    &gt; <a
                                                      moz-do-not-send="true"
href="http://lists.gluster.org/mailman/listinfo/gluster-users"
                                                      rel="noreferrer"
                                                      target="_blank">http://lists.gluster.org/mailm<wbr>an/listinfo/gluster-users</a><br>
                                                  </blockquote>
                                                </div>
                                              </div>
                                              <br>
    </blockquote>
    

  </div>


</blockquote></div></div>



</blockquote>
</div></blockquote></div></div>



</blockquote>
</div></blockquote></div></div>



</blockquote>
<pre class="m_8694824072006468141moz-signature" cols="72">-- 
Alvin Starr                   ||   voice: <a moz-do-not-send="true" href="tel:%28905%29%20513-7688" value="+19055137688" target="_blank">(905)513-7688</a>
Netvel Inc.                   ||   Cell:  <a moz-do-not-send="true" href="tel:%28416%29%20806-0133" value="+14168060133" target="_blank">(416)806-0133</a>
<a moz-do-not-send="true" class="m_8694824072006468141moz-txt-link-abbreviated" href="mailto:alvin@netvel.net" target="_blank">alvin@netvel.net</a>              ||
</pre></div>
</blockquote></div></div>



</blockquote>
<pre class="moz-signature" cols="72">-- 
Alvin Starr                   ||   voice: (905)513-7688
Netvel Inc.                   ||   Cell:  (416)806-0133
<a class="moz-txt-link-abbreviated" href="mailto:alvin@netvel.net">alvin@netvel.net</a>              ||
</pre></body></html>