<div dir="auto">Are the backups consistent?<div dir="auto">What happens if the header on shard0 is synced while it refers to some data on shard450, and by the time rsync processes shard450 that data has been changed by subsequent writes?</div><div dir="auto"><br></div><div dir="auto">The header would be backed up out of sync with the rest of the image.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On 23 Mar 2017 8:48 PM, &quot;Joe Julian&quot; &lt;<a href="mailto:joe@julianfamily.org">joe@julianfamily.org</a>&gt; wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF" text="#000000">
    <p>The rsync protocol only passes blocks that have actually changed.
      A raw image changes fewer bits. You&#39;re right, though, that rsync
      still has to check the entire file for those changes.<br>
    </p>
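    <p>A minimal sketch of the point above, assuming a much simplified
      comparison: both copies are split into fixed-size blocks and one
      digest per block is compared. Real rsync uses rolling checksums so it
      can match blocks at arbitrary offsets; the names
      <code>block_digests</code> and <code>changed_blocks</code> and the
      tiny block size are hypothetical and chosen only for illustration.</p>

```python
import hashlib

BLOCK_SIZE = 4  # assumption: tiny block size so the example is easy to follow


def block_digests(data, block_size=BLOCK_SIZE):
    """Return one digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return the indexes of blocks in new that differ from old.

    Every block of both copies is read and hashed, even though only the
    differing blocks would have to be transferred.
    """
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    changed = []
    for index, digest in enumerate(new_digests):
        # A block is "changed" if it is new or its digest differs.
        if index >= len(old_digests) or digest != old_digests[index]:
            changed.append(index)
    return changed


old_image = b"AAAABBBBCCCCDDDD"
new_image = b"AAAABBBBXXXXDDDD"
print(changed_blocks(old_image, new_image))  # [2]
```

    <p>Only block 2 would be sent, but all four blocks on both sides had
      to be read and hashed to find that out.</p>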
    <br>
    <div class="m_2071367206087675765moz-cite-prefix">On 03/23/17 12:47, Gandalf
      Corvotempesta wrote:<br>
    </div>
    <blockquote type="cite">
      <div dir="auto">Raw or qcow doesn&#39;t change anything about the
        backup.
        <div dir="auto">Georep always has to sync the whole file.</div>
        <div dir="auto"><br>
        </div>
        <div dir="auto">Additionally, raw images have far fewer features
          than qcow.</div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On 23 Mar 2017 8:40 PM, &quot;Joe Julian&quot;
          &lt;<a href="mailto:joe@julianfamily.org" target="_blank">joe@julianfamily.org</a>&gt;
          wrote:<br type="attribution">
          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div bgcolor="#FFFFFF" text="#000000">
              <p>I always use raw images. And yes, sharding would also
                be good.<br>
              </p>
              <br>
              <div class="m_2071367206087675765m_-8052554343169692798moz-cite-prefix">On
                03/23/17 12:36, Gandalf Corvotempesta wrote:<br>
              </div>
              <blockquote type="cite">
                <div dir="auto">Georep exposes another problem:
                  <div dir="auto">When Gluster is used as storage for VMs,
                    each VM image is saved as a qcow file. Changes happen
                    inside the qcow file, so rsync has to sync the whole
                    file every time.</div>
                  <div dir="auto"><br>
                  </div>
                  <div dir="auto">A small workaround would be sharding,
                    since rsync then has to sync only the changed shards,
                    but I don&#39;t think this is a good solution.</div>
                </div>
                <div class="gmail_extra"><br>
                  <div class="gmail_quote">On 23 Mar 2017 8:33 PM, &quot;Joe
                    Julian&quot; &lt;<a href="mailto:joe@julianfamily.org" target="_blank">joe@julianfamily.org</a>&gt;
                    wrote:<br type="attribution">
                    <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      <div bgcolor="#FFFFFF" text="#000000">
                        <p>In many cases, a full backup set is just not
                          feasible. Georep to the same or a different DC
                          may be an option if the bandwidth can keep up
                          with the change set. If not, maybe break
                          the data up into smaller, more manageable
                          volumes, keep only a smaller set of
                          critical data, and just back that up. Perhaps
                          an object store (swift?) might handle fault
                          tolerance distribution better for some
                          workloads.</p>
                        <p>There&#39;s no one right answer.</p>
                        <br>
                        <div class="m_2071367206087675765m_-8052554343169692798m_-3599642909736746536moz-cite-prefix">On
                          03/23/17 12:23, Gandalf Corvotempesta wrote:<br>
                        </div>
                        <blockquote type="cite">
                          <div dir="auto">Backing up from inside each VM
                            doesn&#39;t solve the problem.
                            <div dir="auto">If you have to back up 500 VMs,
                              you need more than one day. And what if
                              you have to restore the whole Gluster
                              storage?</div>
                            <div dir="auto"><br>
                            </div>
                            <div dir="auto">How many days do you need to
                              restore 1PB?</div>
                            <div dir="auto"><br>
                            </div>
                            <div dir="auto">Probably the only solution
                              would be georep to a similar cluster in the
                              same datacenter/rack, </div>
                            <div dir="auto">ready to become the master
                              storage.</div>
                            <div dir="auto">In this case you don&#39;t need
                              to restore anything, as the data is already
                              there, </div>
                            <div dir="auto">only a little bit back in
                              time, but this doubles the TCO.</div>
                          </div>
                          <div class="gmail_extra"><br>
                            <div class="gmail_quote">On 23 Mar 2017 6:39
                              PM, &quot;Serkan Çoban&quot; &lt;<a href="mailto:cobanserkan@gmail.com" target="_blank">cobanserkan@gmail.com</a>&gt;
                              wrote:<br type="attribution">
                              <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Assuming a
                                backup window of 12 hours, you need to
                                send data at 25GB/s<br>
                                to the backup solution.<br>
                                Using 10G Ethernet on the hosts, you need at
                                least 25 hosts to handle 25GB/s.<br>
                                You can create an EC gluster cluster
                                that can handle these rates, or<br>
                                you can just back up the valuable data from
                                inside the VMs using open source backup<br>
                                tools like borg, attic, restic, etc.<br>
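                                <p>The arithmetic behind these figures can be
                                  sketched as follows. This is only a rough
                                  check, assuming a decimal petabyte and,
                                  as a further assumption not stated in the
                                  thread, about 1 GB/s of usable throughput per
                                  10G Ethernet host (the theoretical line rate
                                  is 1.25 GB/s). The function names are
                                  hypothetical.</p>

```python
import math

PB = 10**15  # assumption: decimal petabyte, in bytes
GB = 10**9   # decimal gigabyte, in bytes


def required_throughput_gb_s(total_bytes, window_hours):
    """Sustained rate in GB/s needed to move total_bytes within the window."""
    return total_bytes / (window_hours * 3600) / GB


def hosts_needed(throughput_gb_s, per_host_gb_s):
    """Number of hosts needed if each one sustains per_host_gb_s."""
    return math.ceil(throughput_gb_s / per_host_gb_s)


rate = required_throughput_gb_s(PB, 12)
print(round(rate, 1))             # 23.1
print(hosts_needed(rate, 1.25))   # 19, at theoretical 10GbE line rate
print(hosts_needed(rate, 1.0))    # 24, assuming about 1 GB/s usable per host
print(hosts_needed(25, 1.0))      # 25, using the rounded 25GB/s figure
```

                                <p>Under these assumptions the 25GB/s and
                                  25-host figures are rounded-up estimates of
                                  the same order as the computed values.</p>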
                                <br>
                                On Thu, Mar 23, 2017 at 7:48 PM, Gandalf
                                Corvotempesta<br>
                                &lt;<a href="mailto:gandalf.corvotempesta@gmail.com" target="_blank">gandalf.corvotempesta@gmail.c<wbr>om</a>&gt;
                                wrote:<br>
                                &gt; Let&#39;s assume 1PB of storage full of
                                VM images, with each brick on ZFS,<br>
                                &gt; replica 3, sharding enabled<br>
                                &gt;<br>
                                &gt; How do you backup/restore that
                                amount of data?<br>
                                &gt;<br>
                                &gt; Backing up daily is impossible:
                                you&#39;ll never finish one backup before the<br>
                                &gt; following one starts (in other
                                words, you need more than 24 hours)<br>
                                &gt;<br>
                                &gt; Restoring is even worse. You need
                                more than 24 hours with the whole
                                cluster<br>
                                &gt; down<br>
                                &gt;<br>
                                &gt; You can&#39;t rely on ZFS snapshots due
                                to sharding (a snapshot taken from one<br>
                                &gt; node is useless without the snapshots
                                of all the other nodes holding related shards) and you<br>
                                &gt; still have the same restore speed<br>
                                &gt;<br>
                                &gt; How do you backup this?<br>
                                &gt;<br>
                                &gt; Even georep isn&#39;t enough, if you
                                have to restore the whole storage in
                                case<br>
                                &gt; of disaster<br>
                                &gt;<br>
                                &gt; ______________________________<wbr>_________________<br>
                                &gt; Gluster-users mailing list<br>
                                &gt; <a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
                                &gt; <a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/mailm<wbr>an/listinfo/gluster-users</a><br>
                              </blockquote>
                            </div>
                          </div>
                          <br>
    </blockquote>
    

  </div>


</blockquote></div></div>



</blockquote>
</div></blockquote></div></div>



</blockquote>
</div></blockquote></div></div>