[Gluster-devel] How does Gluster work - Locating files after changes in cluster

Fri Sep 6 03:55:39 UTC 2019

On Thu, 5 Sep 2019 at 18:33, Raghavendra Talur <rtalur at redhat.com> wrote:

>
>
> On Wed, Sep 4, 2019 at 5:01 AM Barak Sason Rofman <bsasonro at redhat.com>
> wrote:
>
>> Hello everyone,
>>
>> I'm about to post several threads with questions regarding how Gluster
>> handles different scenarios.
>> I'm looking for answers on the architecture/design/"this is the idea" level,
>> and not specifically implementation (however, it would be nice to know
>> where the relevant code is).
>>
>> In this thread I want to focus on the "adding servers/bricks" scenario.
>> From what I know at this point, every file that's created is given a
>> 32-bit value based on its name, and this hashing function is fixed and
>> independent of any factors.
>> Next, there is a function (a routing method), located on the client side,
>> that *is* dependent on outside factors, such as the number of servers (or
>> bricks) in the system, which determines on which server a particular file is
>> located.
>>
>> Let's examine the following case:
>> Assume (for simplicity's sake) that the hashing function assigns values to
>> files in the 1-100 range (instead of 32-bit) and currently there are 4 servers
>> in the cluster.
>> In this case, files 1-25 would be located on server 1, 26-50 on server 2
>> and so on.
>> Now, if a 5th server is added to the cluster, then the ranges will
>> change: files 1-20 will be located on server 1, 21-40 on server 2 and so on.
>>
>> The questions regarding this scenario are as follows:
>> 1 - Do the servers update the clients that an additional server (or
>> brick) has been added to the cluster? If not, how does this happen?
>>
>
> Yes, a brick is added through a gluster CLI command that updates the
> volume info in glusterd. The glusterd instances (the one that updated the
> config, and the other peers) then update clients about this change.
>
>> 2 - Do the servers also know which files *should* be located on them? If
>> so, do the servers create a link file (which specifies the "real"
>> location of the file) for the files that are supposed to be moved (e.g.
>> files 21-25) or actually move the data right away? Maybe this works in a
>> completely different manner?
>>
>
> Adding a brick includes a step that updates the xattrs on the bricks,
> which mark the hash range assigned to each of them. Link files are created
> lazily: when clients don't find a file on the brick where it is supposed to
> be (called the hashed brick), they look it up on all bricks. The brick where
> they find the file is called the cached brick, and a link file is created
> on the hashed brick, pointing to the cached brick.
>
To add to this, directories which were created before the bricks were
added will not include the new bricks in the layout until a rebalance or
fix-layout is run. Directories created after the add-brick will include the
newly added bricks in the range.
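
To make the above concrete, here is a rough sketch in Python. It is a toy
model and not the actual DHT code: it uses the 1-100 hash space from the
example earlier in the thread, splits it into equal contiguous ranges, and
keeps one layout per directory. The names split_ranges, ToyDir and
fix_layout are made up for illustration, and the real layout assignment
need not produce these exact ranges.

```python
HASH_MAX = 100  # toy hash space 1..100, as in the example in this thread


def split_ranges(bricks):
    """Divide 1..HASH_MAX into contiguous, near-equal ranges, one per brick."""
    layout = {}
    start = 1
    for i, brick in enumerate(bricks):
        end = HASH_MAX * (i + 1) // len(bricks)
        layout[brick] = (start, end)
        start = end + 1
    return layout


class ToyDir:
    """One directory; its layout stands in for the per-directory xattrs."""

    def __init__(self, bricks):
        self.layout = split_ranges(bricks)  # fixed at creation time
        self.data = {}    # brick -> set of file hashes whose data lives there
        self.links = {}   # brick -> {file hash: cached brick}

    def hashed_brick(self, file_hash):
        for brick, (low, high) in self.layout.items():
            if low <= file_hash <= high:
                return brick
        raise ValueError("hash outside layout")

    def create(self, file_hash):
        brick = self.hashed_brick(file_hash)
        self.data.setdefault(brick, set()).add(file_hash)
        return brick

    def fix_layout(self, bricks):
        """Recompute ranges over the current bricks; no data is moved."""
        self.layout = split_ranges(bricks)

    def lookup(self, file_hash, all_bricks):
        hashed = self.hashed_brick(file_hash)
        if file_hash in self.data.get(hashed, set()):
            return hashed                      # found where it hashes to
        cached = self.links.get(hashed, {}).get(file_hash)
        if cached is not None:
            return cached                      # follow an existing link file
        for brick in all_bricks:               # fall back to all bricks
            if file_hash in self.data.get(brick, set()):
                # lazily leave a link on the hashed brick
                self.links.setdefault(hashed, {})[file_hash] = brick
                return brick
        return None


bricks = ["b1", "b2", "b3", "b4"]
old_dir = ToyDir(bricks)
old_dir.create(23)                 # lands on b1 (range 1-25)

bricks.append("b5")                # add-brick
new_dir = ToyDir(bricks)           # created afterwards: layout has 5 ranges

before = old_dir.lookup(23, bricks)   # old layout still in effect
old_dir.fix_layout(bricks)            # 23 now hashes to b2 (range 21-40)
after = old_dir.lookup(23, bricks)    # found on b1, link left on b2
```

In this sketch old_dir keeps its 4-brick layout until fix_layout is called,
while new_dir gets a range for b5 right away. After the fix-layout, file 23
hashes to b2 but its data is still on b1, so the lookup falls back to all
bricks and records a link on b2.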

> For more information on the distribute mechanism, refer to
> https://docs.gluster.org/en/latest/Quick-Start-Guide/Architecture/#dhtdistributed-hash-table-translator
> For more information on how clients get updates from glusterd, refer to
> https://www.youtube.com/watch?v=Gq-yBYq8Gjg
>
>
>> I have additional questions regarding this, but they are dependent on the
>> answers to these questions.
>>
>> Thank you all for your help.
>> --
>> *Barak Sason Rofman*
>>
>> Gluster Storage Development
>>
>> Red Hat Israel <https://www.redhat.com/>
>>
>> 34 Jerusalem rd. Ra'anana, 43501
>>
>> bsasonro at redhat.com <adi at redhat.com>    T: *+972-9-7692304*
>> M: *+972-52-4326355*
>> <https://red.ht/sig>
>> _______________________________________________
>>
>> Community Meeting Calendar:
>>
>> APAC Schedule -
>> Every 2nd and 4th Tuesday at 11:30 AM IST
>> Bridge: https://bluejeans.com/836554017
>>
>> NA/EMEA Schedule -
>> Every 1st and 3rd Tuesday at 01:00 PM EDT
>> Bridge: https://bluejeans.com/486278655
>>
>> Gluster-devel mailing list
>> Gluster-devel at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>
>> _______________________________________________