<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Apr 10, 2018 at 3:08 PM, Niels de Vos <span dir="ltr"><<a href="mailto:ndevos@redhat.com" target="_blank">ndevos@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Recently I have been implementing "volume clone" support in Heketi. This<br>
uses the snapshot+clone functionality from Gluster. In order to create<br>
snapshots and clone them, it is required to use LVM thin-pools on the<br>
bricks. This is where my current problem originates....<br>
<br>
When there are cloned volumes, the bricks of these volumes use the same<br>
thin-pool as the original bricks. This makes sense, and allows cloning<br>
to be really fast! There is no need to copy data from one brick to a new<br>
one, the thin-pool provides copy-on-write semantics.<br>
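To make the sharing concrete, the LVM operations underneath look roughly like this. The volume-group and LV names are invented for illustration, and the commands need root and an existing VG, so this is a transcript sketch rather than something to paste verbatim:

```shell
# Illustrative only -- vg_gluster, tp_brick1 and brick1 are made-up names.
# Create a thin-pool, then a thin LV inside it for the original brick:
lvcreate --size 100G --thinpool tp_brick1 vg_gluster
lvcreate --virtualsize 500G --thin vg_gluster/tp_brick1 --name brick1

# A snapshot of a thin LV lives in the same thin-pool and only stores
# blocks that diverge from the origin (copy-on-write), which is why
# cloning is fast and why the pool's space is shared:
lvcreate --snapshot vg_gluster/brick1 --name brick1_clone
```

Because `brick1_clone` allocates from `tp_brick1` as it diverges, every clone adds to the pressure on the one pool that was sized when the original volume was created.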
<br>
Unfortunately it can be rather difficult to estimate how large the<br>
thin-pool should be when the initial Gluster Volume is created.<br>
Over-allocation is likely needed, but by how much? It may not be clear<br>
how many clones will be made, nor what percentage of data will change<br>
on each of them.<br>
<br>
A wrong estimate can easily cause the thin-pool to become full. When<br>
that happens, the filesystem on the bricks goes read-only. Mounting<br>
the filesystem read-write may not be possible at all. I've even seen<br>
/dev entries for the LV getting removed. This makes for a horrible<br>
Gluster experience, and it can be tricky to recover from it.<br>
<br>
In order to make thin-provisioning more stable in Gluster, I would like<br>
to see integrated monitoring of (thin) LVs and some form of acting on<br>
crucial events. One idea would be to make the Gluster Volume read-only<br>
when it detects that a brick is almost out-of-space. This is close to<br>
what local filesystems do when their block-device is having issues.<br>
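As a sketch of what such monitoring could look like: `lvs` can report thin-pool data usage as JSON, and a small watcher could flag pools crossing a threshold and then react (for example by switching the Gluster volume read-only). The threshold value and the reaction are assumptions for illustration, not an agreed design:

```python
import json
import subprocess

# Illustrative threshold; the right value is exactly what needs discussing.
FULL_THRESHOLD = 90.0


def pools_near_full(lvs_json: str, threshold: float = FULL_THRESHOLD):
    """Return names of LVs whose thin-pool data usage is at/above threshold.

    Expects the JSON produced by:
        lvs --reportformat json -o lv_name,data_percent
    """
    report = json.loads(lvs_json)
    near_full = []
    for section in report.get("report", []):
        for lv in section.get("lv", []):
            pct = lv.get("data_percent", "")
            if pct and float(pct) >= threshold:
                near_full.append(lv["lv_name"])
    return near_full


def check_system():
    # Query LVM for current usage (needs the lvm2 tools installed).
    out = subprocess.run(
        ["lvs", "--reportformat", "json", "-o", "lv_name,data_percent"],
        capture_output=True, text=True, check=True,
    ).stdout
    for name in pools_near_full(out):
        # Placeholder for the real reaction -- e.g. flipping the Gluster
        # volume read-only; the actual mechanism is the open question here.
        print(f"WARNING: thin-pool {name} is almost full")
```

The parsing half is trivially unit-testable against canned `lvs` output, which would keep the "detect" and "react" sides of the feature cleanly separated.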
<br>
The 'dmeventd' process already monitors LVM, and by default writes to<br>
'dmesg'. Checking dmesg for warnings is not really a nice solution, so<br>
maybe we should write a plugin for dmeventd. Possibly something already<br>
exists that we can use, or take inspiration from.<br>
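One thing that does exist already: dmeventd can auto-extend a monitored thin-pool when usage crosses a threshold, driven by settings in lvm.conf. That only helps while the VG still has free extents, but it is prior art worth looking at. The values below are illustrative, not recommendations:

```
# /etc/lvm/lvm.conf (illustrative values)
activation {
    # dmeventd monitors thin-pools when monitoring is enabled
    monitoring = 1
    # auto-extend a thin-pool once it is 70% full...
    thin_pool_autoextend_threshold = 70
    # ...growing it by 20% of its current size each time
    thin_pool_autoextend_percent = 20
}
```

A Gluster-aware dmeventd plugin could hook the same event stream but react at the Gluster layer instead of (or in addition to) extending the pool.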
<br>
Please provide ideas, thoughts and any other comments. Thanks!<br></blockquote><div><br></div><div>For the oVirt-Gluster integration, where Gluster volumes are managed and consumed as a VM image store by oVirt, a feature was added to monitor and report the guaranteed capacity of bricks, as opposed to the size reported when they are created on thin-provisioned LVs/VDO devices. The feature page provides some details - <a href="https://ovirt.org/develop/release-management/features/gluster/gluster-multiple-bricks-per-storage/">https://ovirt.org/develop/release-management/features/gluster/gluster-multiple-bricks-per-storage/</a>. Also adding Denis, the feature owner.<br></div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<span class="gmail-HOEnZb"><font color="#888888">Niels<br>
</font></span></blockquote></div><br></div></div>