[Gluster-devel] regarding GF_CONTENT_KEY and dht2 - perf with small files
Shyam
srangana at redhat.com
Thu Feb 4 06:04:04 UTC 2016
On 02/04/2016 09:38 AM, Vijay Bellur wrote:
> On 02/03/2016 11:34 AM, Venky Shankar wrote:
>> On Wed, Feb 03, 2016 at 09:24:06AM -0500, Jeff Darcy wrote:
>>>> Problem is with workloads which know the files that need to be read
>>>> without readdir, like hyperlinks (webserver), swift objects etc. These
>>>> are two I know of which will have this problem, which can't be improved
>>>> because we don't have metadata, data co-located. I have been trying to
>>>> think of a solution for past few days. Nothing good is coming up :-/
>>>
>>> In those cases, caching (at the MDS) would certainly help a lot. Some
>>> variation of the compounding infrastructure under development for Samba
>>> etc. might also apply, since this really is a compound operation.
Compounding can help in this case, but without a cache the read still
has to go to the DS; with compounding, the MDS rather than the client
would reach out to the DS for the data. That is another possibility,
depending on what we decide on as the cache mechanism.
>>
>> When a client is done modifying a file, MDS would refresh its size, mtime
>> attributes by fetching it from the DS. As part of this refresh, DS could
>> additionally send back the content if the file size falls in range, with
>> MDS persisting it, sending it back for subsequent lookup calls as it does
>> now. The content (on MDS) can be zapped once the file size crosses the
>> defined limit.
Venky, when you say persisting, I assume you mean on disk, is that right?
If so, the MDS storage size requirements would increase (based on the
amount of file data that needs to be stored). As of now it holds only
inodes, or a record once we move to a db. In this case we may have
*fatter* MDS partitions. Any comments/thoughts on that?
As with memory, I would assume some form of eviction of data from the
MDS is a possibility, to control space utilization here.
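
To make the on-disk variant concrete, here is a rough Python sketch of the
refresh/zap flow Venky describes. Everything in it is hypothetical: the
class name, the gfid-keyed file layout, and the 64k limit (borrowed from
the quick-read default mentioned in this thread) are assumptions for
illustration, not existing Gluster code.

```python
import os
import tempfile

SMALL_FILE_LIMIT = 64 * 1024  # assumption: mirrors quick-read's 64k default


class MdsContentStore:
    """Hypothetical on-disk store of small-file content, kept on the MDS."""

    def __init__(self, root, limit=SMALL_FILE_LIMIT):
        self.root = root
        self.limit = limit
        os.makedirs(root, exist_ok=True)

    def _path(self, gfid):
        return os.path.join(self.root, gfid)

    def refresh(self, gfid, size, content=None):
        """Called when the MDS refreshes size/mtime from the DS."""
        if size <= self.limit and content is not None:
            # Write to a temp file, then rename, so a concurrent lookup
            # never sees a partially written copy.
            fd, tmp = tempfile.mkstemp(dir=self.root)
            with os.fdopen(fd, "wb") as f:
                f.write(content)
            os.replace(tmp, self._path(gfid))
        else:
            # File crossed the limit (or DS sent no content): zap the copy.
            self.zap(gfid)

    def zap(self, gfid):
        try:
            os.remove(self._path(gfid))
        except FileNotFoundError:
            pass

    def lookup(self, gfid):
        """Return persisted content for a lookup, or None on a miss."""
        try:
            with open(self._path(gfid), "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None
```

The sketch leaves out eviction entirely; the space used under the store
root grows with the number of small files, which is the *fatter*
partition concern raised above.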
>>
>
> I like the idea. However the memory implications of maintaining content
> in MDS is something to watch out for. quick-read is interested in files
> of size 64k by default and with a reasonable number of files in that
> range, we might end up consuming significant memory with this scheme.
Vijay, I think what Venky proposes is to stash the file on local
storage and not in memory. If it were in memory, brick process restarts
would nuke the cache, and we would either need mechanisms to rebuild/warm
the cache or just start caching afresh.
If we were caching in memory, then yes, the concern is valid, and one
possibility is some form of LRU to keep memory consumption in check.
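
For the in-memory case, one way to picture "some form of LRU" is a cache
bounded by total bytes rather than by entry count. The Python sketch
below is only an illustration; the class name and the byte budget are
assumptions, not anything that exists in the code base.

```python
from collections import OrderedDict


class ByteBoundedLRU:
    """LRU cache bounded by total bytes held, not by entry count."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self._entries = OrderedDict()  # key -> bytes, least recent first

    def get(self, key):
        data = self._entries.get(key)
        if data is not None:
            self._entries.move_to_end(key)  # mark as most recently used
        return data

    def put(self, key, data):
        if key in self._entries:
            # Replacing an entry: give back the bytes it was holding.
            self.used -= len(self._entries.pop(key))
        if len(data) > self.max_bytes:
            return  # too large to cache at all
        self._entries[key] = data
        self.used += len(data)
        while self.used > self.max_bytes:
            # Drop least recently used entries until back under budget.
            _, evicted = self._entries.popitem(last=False)
            self.used -= len(evicted)
```

With 64k files, a budget of N bytes holds roughly N / 64k entries in the
worst case, which gives a quick way to size the memory cost.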
Overall I would steer away from memory for this use case and use the
disk, as we do not know which files to cache (true in either case, but
disk offers more space, which lets us possibly punt on that issue). For
files where the cache is missing and the file is small enough, either
perform an async read from the client (gaining some overlap time with the
app) or just let it be, since we would get the open/read anyway, though
that would slow things down.
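
The async read option on a cache miss could look roughly like the Python
sketch below. The function and its parameters are hypothetical stand-ins:
`lookup` models the MDS reply, `read_from_ds` models an open/read against
the DS, and the 64k limit is assumed from the quick-read default.

```python
from concurrent.futures import ThreadPoolExecutor

SMALL_FILE_LIMIT = 64 * 1024  # assumption: same threshold as quick-read


def lookup_with_prefetch(name, lookup, read_from_ds, pool,
                         limit=SMALL_FILE_LIMIT):
    """Return (size, get_data), where get_data is a callable or None.

    lookup(name) -> (size, content or None) models the MDS reply.
    read_from_ds(name) -> bytes models an open/read against the DS.
    """
    size, content = lookup(name)
    if content is not None:
        # Cache hit: the data arrived with the lookup reply.
        return size, lambda: content
    if size <= limit:
        # Cache miss on a small file: start the read now so it overlaps
        # with whatever the application does before it asks for the data.
        future = pool.submit(read_from_ds, name)
        return size, future.result
    # Large file: leave it to the normal open/read path.
    return size, None
```

A caller would create one `ThreadPoolExecutor`, call
`lookup_with_prefetch(...)`, and invoke the returned `get_data()` only
when the application actually issues its read; any overlap gained is the
time between the lookup and that call.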
>
> -Vijay
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel