<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, May 1, 2017 at 11:20 PM, Shyam <span dir="ltr"><<a href="mailto:srangana@redhat.com" target="_blank">srangana@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 05/01/2017 01:13 PM, Pranith Kumar Karampuri wrote:<br>
</span><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">
<br>
<br>
On Mon, May 1, 2017 at 10:42 PM, Pranith Kumar Karampuri<br></span><span class="">
<<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a> <mailto:<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>> wrote:<br>
<br>
<br>
<br>
On Mon, May 1, 2017 at 10:39 PM, Gandalf Corvotempesta<br>
<<a href="mailto:gandalf.corvotempesta@gmail.com" target="_blank">gandalf.corvotempesta@gmail.c<wbr>om</a><br></span><span class="">
<mailto:<a href="mailto:gandalf.corvotempesta@gmail.com" target="_blank">gandalf.corvotempesta@<wbr>gmail.com</a>>> wrote:<br>
<br>
2017-05-01 18:57 GMT+02:00 Pranith Kumar Karampuri<br></span>
<<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a> <mailto:<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>>:<div><div class="h5"><br>
>>>>> Yes, this is precisely what the other SDSes with metadata servers
>>>>> do: they keep a map, on a metadata server, of which servers a
>>>>> particular file/blob is stored on.

>>>> Not exactly. Other SDSes have some servers dedicated to metadata and,
>>>> personally, I don't like that approach.

>>>>> GlusterFS doesn't do that. In GlusterFS, which bricks replicate each
>>>>> other is given up front, and the distribute layer on top of these
>>>>> replica sets does the job of distributing and fetching the data.
>>>>> Because replication happens at the brick level and not at the file
>>>>> level, and distribution happens on top of replication and not per
>>>>> file, there isn't much metadata that needs to be stored per file.
>>>>> Hence there is no need for separate metadata servers.
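
To illustrate the point in that last quoted paragraph, a toy sketch (the
brick names are made up; roughly speaking, the real DHT hashes names against
per-directory layout ranges kept in xattrs rather than a simple modulo, but
the principle is the same): because the replica sets are fixed, every client
can compute a file's location from its name alone, with no metadata database
to consult.

# Toy sketch only -- NOT GlusterFS's actual DHT code.
import hashlib

REPLICA_SETS = [
    ["server1:/bricks/b1", "server2:/bricks/b1"],   # replica pair 1
    ["server3:/bricks/b1", "server4:/bricks/b1"],   # replica pair 2
]

def locate(filename):
    # A deterministic hash of the name picks the replica pair.
    h = int(hashlib.md5(filename.encode()).hexdigest(), 16)
    return REPLICA_SETS[h % len(REPLICA_SETS)]

print(locate("photos/cat.jpg"))   # same answer on every client
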
>>>> And this is great; that's why I'm talking about embedding a sort of
>>>> database stored on all nodes: no metadata servers, only a mapping
>>>> between files and servers.

>>>>> If you know the path of the file, you can always find out where the
>>>>> file is stored using pathinfo; see Method 2 in the following link:
>>>>> https://gluster.readthedocs.io/en/latest/Troubleshooting/gfid-to-path/

>>>>> You don't need any db.
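
As a concrete illustration of that pathinfo query: it is just a read of a
virtual xattr on the FUSE mount. A minimal sketch (the mount point and file
path below are made up):

# Linux-only: read the trusted.glusterfs.pathinfo virtual xattr on a
# FUSE-mounted GlusterFS volume.
import os

info = os.getxattr("/mnt/glustervol/some/file.txt",
                   "trusted.glusterfs.pathinfo")
print(info.decode())   # shows the backend brick location(s) of the file
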
>>>> For the current Gluster, yes. I'm talking about a different thing.

>>>> In a RAID array, you have data stored somewhere on the array, with
>>>> metadata defining how this data should be written or read. Obviously,
>>>> the RAID metadata must be stored at a fixed position, or you won't be
>>>> able to read it.

>>>> Something similar could be added to Gluster (I don't know whether it
>>>> would be hard): you store a file mapping at a fixed position in
>>>> Gluster, and then every Gluster client can find out where a file is by
>>>> looking at this "metadata" stored at the fixed position.

>>>> Like the ".gluster" directory: Gluster already uses some "internal"
>>>> directories for internal operations (".shards", ".gluster", ".trash").
>>>> Would a ".metadata" directory with the file mapping be hard to add?
>>>>> Basically what you want, if I understood correctly, is: if we add a
>>>>> 3rd node with just one disk, the data should automatically rearrange
>>>>> itself, splitting into 3 categories (assuming replica 2):
>>>>> 1) Files that are present on Node1, Node2
>>>>> 2) Files that are present on Node2, Node3
>>>>> 3) Files that are present on Node1, Node3

>>>>> As you can see, we arrive at a contradiction: every node would need at
>>>>> least 2 bricks, but there is only 1 disk per node. We can't do what
>>>>> you are asking without brick splitting, i.e. we need to split each
>>>>> disk into 2 bricks.
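
The counting argument in that quoted explanation can be spelled out in a few
lines of Python (node names are placeholders):

# With 3 nodes and replica 2, covering all node pairs means every node
# appears in 2 replica pairs, i.e. needs 2 bricks -- impossible with a
# single unsplit disk per node.
from itertools import combinations

nodes = ["Node1", "Node2", "Node3"]
pairs = list(combinations(nodes, 2))
bricks_needed = {n: sum(n in pair for pair in pairs) for n in nodes}

print(pairs)          # [('Node1', 'Node2'), ('Node1', 'Node3'), ('Node2', 'Node3')]
print(bricks_needed)  # {'Node1': 2, 'Node2': 2, 'Node3': 2}
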

> Splitting the bricks need not be a post-factum decision; we can start
> with a larger brick count for a given node/disk count, and then spread
> these bricks to newer nodes/disks as they are added.

Let's say we have one disk; we format it with, say, XFS, and at the moment
that becomes the brick. Just curious: what will the relationship between
brick and disk be in this case (leaving LVM out of this example)?
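
A rough sketch of how I read Shyam's suggestion (all node and brick names are
made up, and I'm assuming the brick moves would be done with something like
replace-brick): whole bricks migrate to the new node, rather than files being
re-split.

# Made-up illustration: start with more bricks than nodes (replica 2), then
# move whole bricks to a new node instead of re-splitting data.
# Day 1: two nodes, three bricks each; every pair spans both nodes.
replica_pairs = [
    ["node1:/b1", "node2:/b1"],
    ["node1:/b2", "node2:/b2"],
    ["node1:/b3", "node2:/b3"],
]

# Node 3 arrives: migrate one brick from each of two pairs onto it;
# the remaining pair is untouched.
replica_pairs[1][0] = "node3:/b1"   # was node1:/b2
replica_pairs[2][1] = "node3:/b2"   # was node2:/b3

for pair in replica_pairs:
    print(pair)   # each node now hosts 2 bricks; every pair still spans 2 nodes
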
> If I understand the Ceph PG count correctly, it works on a similar notion,
> until the cluster grows beyond the initial PG count (set for the pool), at
> which point there is a lot more data movement (as the PG count has to be
> increased, and hence existing PGs need to be further partitioned). (I'm
> just using Ceph as an example; a similar approach exists in OpenStack
> Swift with its partition power setting.)
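
A toy version of that fixed-partition idea (not actual Ceph or Swift code;
the numbers are arbitrary):

# Objects hash into a fixed number of partitions; growing the cluster only
# reassigns whole partitions to nodes. Only when the partition count itself
# has to be raised does every partition get re-split, which is the large
# data movement mentioned above.
import hashlib

PARTITIONS = 16   # fixed up front, like a pool's PG count or partition power

def partition(obj):
    return int(hashlib.md5(obj.encode()).hexdigest(), 16) % PARTITIONS

nodes = ["node1", "node2", "node3"]
placement = {p: nodes[p % len(nodes)] for p in range(PARTITIONS)}

p = partition("photos/cat.jpg")
print(p, placement[p])
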

>>>> I don't think so.
>>>> Let's assume replica 2:
>>>>
>>>> S1B1 + S2B1
>>>>
>>>> 1TB each, thus 1TB available (2TB/2).
>>>>
>>>> Adding a third 1TB disk should increase the available space to
>>>> 1.5TB (3TB/2).

>>> I agree it should. The question is how? What will the resulting
>>> brick map be?

>> I don't see any solution that we can do without at least 2 bricks on
>> each of the 3 servers.
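
One concrete layout that gets to 1.5TB usable (the obvious chained
arrangement, with each 1TB disk split into two 0.5TB bricks):

# Three 1TB disks, each split into two 0.5TB bricks; replica-2 pairs are
# chained so that every pair spans two different servers.
pairs = [
    ("S1B1", "S2B1"),
    ("S2B2", "S3B1"),
    ("S3B2", "S1B2"),
]
usable_tb = 0.5 * len(pairs)   # each pair contributes one brick's worth
print(pairs)
print(usable_tb)               # 1.5
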
>>> --
>>> Pranith

>> --
>> Pranith
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users

--
Pranith