[Gluster-users] Bricks as BTRFS

James purpleidea at gmail.com
Fri Sep 26 19:40:04 UTC 2014


On Fri, Sep 26, 2014 at 3:15 PM, Ric Wheeler <rwheeler at redhat.com> wrote:
> On 09/26/2014 01:58 PM, James wrote:
>>
>> On Thu, Sep 25, 2014 at 2:53 AM, Venky Shankar <vshankar at redhat.com> wrote:
>>>
>>> Hey folks,
>>>
>>> Wanted to check if anyone out here uses BTRFS (and is willing to share their
>>> experiences[1]) as the backend filesystem for GlusterFS. We're planning to
>>> explore some of its features and put it to use for GlusterFS. This was
>>> discussed briefly during the weekly meeting on #gluster-meeting[2].
>>>
>>> To start with, we plan to explore data/metadata checksumming (+ scrubbing)
>>> and subvolumes to "offload" the work to BTRFS. The mentioned features would
>>> help us with BitRot detection[3] and OpenStack Manila use cases respectively
>>> (though there are various other nifty things one would want to do with
>>> them).
>>>
>>> Thanks in advance!
>>
>>
>> Hey,
>>
>> I couldn't make the meeting, but I am interested in BTRFS. I added
>> this in puppet-gluster a bunch of months ago as a feature branch.
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1094860
>>
>> I just pushed it to git master.
>>
>>
>> https://github.com/purpleidea/puppet-gluster/commit/6c962083d8b100dcaeb6f11dbe61e6071f3d13f0
>>
>> The reason I want btrfs support is that I want glusterfs to eventually
>> be able to support reflinks across gluster volumes. There is a strong
>> use case for this feature.
>>
>> Let me know if this helps!
>> Cheers,
>> James
>>
>
> Reflinks in btrfs (or ocfs2) need to be between files in the same linux
> kernel instance of btrfs.  Effectively, we have two inodes backed by the
> same physical blocks.
>
> It won't, in general, be useful for reflinks across volumes....
>
> Regards,
>
> Ric


Agreed... Which is why this isn't a trivial thing for GlusterFS to do,
but we've discussed certain mechanisms to emulate this behaviour
across a Gluster volume. For example:

* If the reflink would place the new file on the same brick as the
original, just reflink.
* If the reflink would place the new file on a different brick, then
reflink in place on the original brick, and put a pointer to that
brick on the brick where the new file would otherwise live.
* If we want to reflink across volumes, then it's tricky, because fuse
would have to pass this information through and down to the
filesystem.
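
As a rough sketch of the three cases above (not GlusterFS code; the
Location type, the Strategy names and choose_strategy are all
hypothetical, and it assumes we already know which brick the
destination name would hash to):

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    # Source and destination land on the same brick: a plain reflink works.
    LOCAL_REFLINK = "local_reflink"
    # Destination belongs on another brick: reflink next to the source,
    # and leave a pointer to the source brick at the destination.
    REFLINK_PLUS_POINTER = "reflink_plus_pointer"
    # Different volumes: fuse would have to pass the request down.
    CROSS_VOLUME = "cross_volume"


@dataclass(frozen=True)
class Location:
    """Hypothetical description of where a file lives (or would live)."""
    volume: str
    brick: str


def choose_strategy(src: Location, dst: Location) -> Strategy:
    """Pick which of the three cases applies to a reflink from src to dst."""
    if src.volume != dst.volume:
        return Strategy.CROSS_VOLUME
    if src.brick == dst.brick:
        return Strategy.LOCAL_REFLINK
    return Strategy.REFLINK_PLUS_POINTER
```

For example, choose_strategy(Location("vol0", "brick1"),
Location("vol0", "brick2")) picks Strategy.REFLINK_PLUS_POINTER. This
only covers the decision; the actual reflink and pointer handling is
the hard part.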

The winning use case for this feature is that someone could back up
and restore petabytes of data "virtually instantly". This is already
possible on a single local volume, but I'd like to scale this to a
distributed-replicated data store.

