[Gluster-devel] EHT / DHT

Jan H Holtzhausen janh at holtztech.info
Wed Nov 26 07:48:24 UTC 2014


I could tell you… 
But Symantec wouldn’t like it…..

From:  Poornima Gurusiddaiah <pgurusid at redhat.com>
Date:  Wednesday 26 November 2014 at 7:16 AM
To:  Jan H Holtzhausen <janh at holtztech.info>
Cc:  <gluster-devel at gluster.org>
Subject:  Re: [Gluster-devel] EHT / DHT

Out of curiosity, what back end and deduplication solution are you using?

Regards,
Poornima

From: "Jan H Holtzhausen" <janh at holtztech.info>
To: "Anand Avati" <avati at gluster.org>, "Shyam" <srangana at redhat.com>, 
gluster-devel at gluster.org
Sent: Wednesday, November 26, 2014 3:43:36 AM
Subject: Re: [Gluster-devel] EHT / DHT

Yes, we have deduplication at the filesystem layer.

BR
Jan

From:  Anand Avati <avati at gluster.org>
Date:  Wednesday 26 November 2014 at 12:11 AM
To:  Jan H Holtzhausen <janh at holtztech.info>, Shyam <srangana at redhat.com>, 
<gluster-devel at gluster.org>
Subject:  Re: [Gluster-devel] EHT / DHT

Unless there is some sort of de-duplication under the covers happening in 
the brick, or the files are hardlinks to each other, there is no cache 
benefit whatsoever by having identical files placed on the same server.

Thanks,
Avati

On Tue Nov 25 2014 at 12:59:25 PM Jan H Holtzhausen <janh at holtztech.info> 
wrote:
As to the why.
Filesystem cache hits.
Files with the same name tend to be the same files.

Regards
Jan




On 2014/11/25, 8:42 PM, "Jan H Holtzhausen" <janh at holtztech.info> wrote:

>So in a distributed cluster, the GFID tells all bricks what a file's
>preceding directory structure looks like?
>Where the physical file is saved is a function of the filename ONLY.
>Therefore my requirement should be met by default, or am I being dense?
>
>BR
>Jan
>
>
>
>On 2014/11/25, 8:15 PM, "Shyam" <srangana at redhat.com> wrote:
>
>>On 11/25/2014 03:11 PM, Jan H Holtzhausen wrote:
>>> STILL doesn’t work … exact same file ends up on 2 different bricks …
>>> I must be missing something.
>>> All I need is for:
>>> /directory1/subdirectory2/foo
>>> And
>>> /directory2/subdirectoryaaa999/foo
>>>
>>>
>>> To end up on the same brick….
>>
>>This is not possible, which is what I was attempting to state in the
>>previous mail. The regex filter is not for this purpose.
>>
>>The hash is always based on the name of the file, but the location is
>>based on the distribution/layout of the directory, which is different
>>for each directory based on its GFID.
>>
>>So there are no options in the code to enable what you seek at present.
>>
>>Why is this needed?
>>
>>Shyam
>
>_______________________________________________
>Gluster-devel mailing list
>Gluster-devel at gluster.org
>http://supercolony.gluster.org/mailman/listinfo/gluster-devel
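
Illustration of Shyam's point above (the hash comes from the file name, but
the brick comes from the parent directory's layout). This is only a minimal
sketch under stated assumptions, not Gluster's actual code: CRC32 stands in
for the real DHT hash function, and the helper names (name_hash,
directory_layout, locate), the brick names, and the rule that rotates hash
ranges by the directory GFID are all hypothetical. It only shows why two
files named "foo" in different directories can land on different bricks.

```python
import uuid
import zlib

HASH_SPACE = 2 ** 32  # the sketch assumes a 32-bit hash space


def name_hash(filename):
    """Hash the file name only. CRC32 is a stand-in, not Gluster's hash."""
    return zlib.crc32(filename.encode("utf-8")) & 0xFFFFFFFF


def directory_layout(dir_gfid, bricks):
    """Build a per-directory layout: a list of (start, end, brick) ranges.

    Assumption for illustration: the hash space is split evenly, and the
    brick that owns each range is rotated by a value derived from the
    directory's GFID, so different directories get different layouts.
    """
    n = len(bricks)
    step = HASH_SPACE // n
    offset = dir_gfid.int % n
    layout = []
    for i in range(n):
        start = i * step
        end = HASH_SPACE - 1 if i == n - 1 else start + step - 1
        layout.append((start, end, bricks[(i + offset) % n]))
    return layout


def locate(filename, layout):
    """Return the brick whose hash range contains the file name's hash."""
    h = name_hash(filename)
    for start, end, brick in layout:
        if start <= h <= end:
            return brick
    raise ValueError("layout does not cover the hash space")


bricks = ["brick0", "brick1", "brick2", "brick3"]  # hypothetical names

# Two directories with different (made-up) GFIDs get different layouts.
layout_a = directory_layout(uuid.UUID(int=1), bricks)
layout_b = directory_layout(uuid.UUID(int=2), bricks)

# Same file name, same hash, but the owning brick depends on the directory.
brick_a = locate("foo", layout_a)
brick_b = locate("foo", layout_b)
print(name_hash("foo"), brick_a, brick_b)
```

In this sketch the hash of "foo" is identical in both directories, yet the
two lookups return different bricks because each directory maps the hash
ranges to bricks differently, which is the behaviour described in the thread.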




