[Gluster-devel] lseek

Xavier Hernandez xhernandez at datalab.es
Mon May 14 11:47:10 UTC 2012


Hello Ian,

I hadn't thought about statfs. In this special case things are a bit
harder for a compression translator. I think it's impossible to return
accurate data without a considerable amount of work.

Maybe an estimate of the available space based on the mean compression
ratio achieved so far would be sufficient, but it will never be accurate.
With more work you could even say exactly how much space has been used,
but the best you can do for the remaining space is an estimate.
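
A rough sketch of that kind of estimate follows. The function name and
its inputs are hypothetical (nothing here is GlusterFS API); it only
scales the physical free space by the mean ratio achieved so far, so the
result is a guess about future data, not a guarantee.

```c
#include <stdint.h>

/*
 * Hypothetical helper: estimate how many uncompressed bytes still fit.
 *   physical_free - free bytes on the backing store (could be taken from
 *                   statvfs() as f_bavail * f_frsize)
 *   logical_used  - total uncompressed size of the data stored so far
 *   physical_used - bytes that data actually occupies after compression
 */
uint64_t estimate_logical_free(uint64_t physical_free,
                               uint64_t logical_used,
                               uint64_t physical_used)
{
    /* Nothing stored yet: no ratio is known, so assume 1:1. */
    if (physical_used == 0)
        return physical_free;

    /* Mean compression ratio achieved so far (e.g. 3.0 means 3:1). */
    double ratio = (double)logical_used / (double)physical_used;

    return (uint64_t)((double)physical_free * ratio);
}
```

With 300 logical bytes stored in 100 physical bytes, 1000 free physical
bytes would be reported as roughly 3000 free logical bytes.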

Regarding lseek, it does not map to lookup. I probably didn't explain
that as well as I wanted.

There are basically two kinds of user-mode calls: those that take a
string containing a filename to operate on (stat, unlink, open, creat,
...), and those that take a file descriptor (fstat, read, write, ...).
The kernel does not handle files by name, so it has to translate each
name to an inode before it can work with the file. This means that any
call that takes a string needs to perform a "lookup" to get the
associated inode (the only exception is creat, which creates a new inode
without using lookup).

So every filename-based operation can generate a lookup request
(although caching may reduce the number of calls). Operations that work
with a file descriptor do not generate a lookup request, because the
file descriptor is already bound to an inode.

In your particular case, to do an lseek you must already have called
open (which would have generated a lookup request) or creat.
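
To make the sequence concrete, here is a minimal user-space sketch using
only POSIX calls. The two function names are made up for illustration.
In both cases the only name-based step is open(); lseek and fstat then
operate on the descriptor, which is already bound to an inode.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Size as seen through lseek(fd, 0, SEEK_END); returns -1 on error. */
off_t size_via_lseek(const char *path)
{
    /* Name-based call: the kernel must resolve the path to an inode. */
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Descriptor-based call: no name resolution is needed here. */
    off_t end = lseek(fd, 0, SEEK_END);

    close(fd);
    return end;
}

/* Size as seen through fstat() on the same kind of descriptor. */
off_t size_via_fstat(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    off_t size = (fstat(fd, &st) == 0) ? st.st_size : (off_t)-1;

    close(fd);
    return size;
}
```

On a regular file both functions should agree, so a translator that
reports a consistent uncompressed size would presumably make both return
the same value.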

I hope this explains better how the kernel and gluster fit together.

Xavi

On 05/14/2012 01:18 PM, Ian Latter wrote:
> Hello Xavier,
>
>
>     I don't have a problem with the principles, these
> were effectively how I was traveling (the notable
> difference is statfs which I want to pass-through
> unaffected, reporting the true file system capacity
> such that a du [stat] may sum to a greater value
> than a df [statfs]).  In 2009 I had a mostly-
> functional hashing write function and a dubious
> read function (I stumbled when I had to open a
> file from within a fop).
>
>    But I think what you're telling/showing me is that
> I have no deep understanding of the mapping of
> the system calls to their Fuse->Gluster fops -
> which is expected :)  And, this is a better outcome
> than learning that Gluster has gaps in its
> framework with regard to my objective.  I.e. I
> didn't know that lseek mapped to lookup.  And
> the examples aren't comprehensive enough
> (rot-13 is the only one that really manipulates
> content, and it only plays with read and write,
> obviously because it has a 1:1 relationship with
> the data).
>
> This is the key, and not something that I was
> expecting;
>
>> In gluster there are a lot of fops that return a iatt
>> structure. You must guarantee that all these
>> functions return the correct size of the file in
>> the field ia_size to be sure that everything works
>> as expected.
> I'll do my best to build a comprehensive list of iatt
> returning fops from the examples ... but I'd say it'll
> take a solid peer review to get this hammered out
> properly.
>
> Thanks for steering me straight Xavi, appreciate
> it.
>
>
>
> ----- Original Message -----
>> From: "Xavier Hernandez"<xhernandez at datalab.es>
>> To: "Ian Latter"<ian.latter at midnightcode.org>
>> Subject:  Re: [Gluster-devel] lseek
>> Date: Mon, 14 May 2012 12:29:54 +0200
>>
>> Hello Ian,
>>
>> lseek calls are handled internally by the kernel and they never reach
>> the user land for fuse calls. lseek only updates the current file offset
>> that is stored inside the kernel file's structure. This value is what is
>> passed to read/write fuse calls as an absolute offset.
>>
>> There isn't any problem in this behavior as long as you hide all size
>> manipulations from fuse. If you write a translator that compresses a
>> file, you should do so in a transparent manner. This means, basically, that:
>> 1. Whenever you are asked to return the file size, you must return the
>>    size of the uncompressed file
>> 2. Whenever you receive an offset, you must translate that offset to the
>>    corresponding offset in the compressed file and work with that
>> 3. Whenever you are asked to read or write data, you must return the
>>    number of uncompressed bytes read or written (even if you have
>>    compressed the chunk of data to a smaller size and you have physically
>>    written less bytes).
>> 4. All read requests must return uncompressed data (this seems obvious
>>    though)
>>
>> This guarantees that your manipulations are not seen in any way by any
>> upper translator or even fuse, thus everything should work smoothly.
>> If you respect these rules, lseek (and your translator) will work as
>> expected.
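
One way rule 2 above could be sketched, under assumptions that are not
from this thread: the file is split into fixed-size logical chunks, each
chunk is compressed independently, and the translator keeps a table with
the starting position of every compressed chunk. CHUNK_SIZE, struct
chunk_pos and map_offset are hypothetical names, not GlusterFS API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE 4096u /* hypothetical size of one uncompressed chunk */

struct chunk_pos {
    size_t   index;    /* chunk that holds the requested logical offset */
    uint64_t phys_off; /* where that chunk starts in the compressed file */
    uint32_t skip;     /* bytes to skip inside the decompressed chunk */
};

/*
 * Translate an offset in the uncompressed file into a position in the
 * compressed file. chunk_start[i] is the offset in the compressed file
 * at which chunk i begins; nchunks is the number of table entries.
 */
struct chunk_pos map_offset(const uint64_t *chunk_start, size_t nchunks,
                            uint64_t logical_off)
{
    struct chunk_pos pos;

    pos.index = (size_t)(logical_off / CHUNK_SIZE);
    assert(pos.index < nchunks); /* offset must fall in a known chunk */

    pos.phys_off = chunk_start[pos.index];
    pos.skip = (uint32_t)(logical_off % CHUNK_SIZE);
    return pos;
}
```

A read at logical offset 5000 would then fetch the chunk starting at
chunk_start[1], decompress it, and return data from byte 904 onwards,
reporting the number of uncompressed bytes as rule 3 requires.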
>>
>> In particular, when a user calls lseek with SEEK_END, the kernel takes
>> the size of the file from the internal kernel inode's structure. This
>> size is obtained through a previous call to lookup or updated using the
>> result of write operations. If you respect points 1 and 3, this value
>> will be correct.
>>
>> In gluster there are a lot of fops that return a iatt structure. You
>> must guarantee that all these functions return the correct size of the
>> file in the field ia_size to be sure that everything works as expected.
>> Xavi
>>
>> On 05/14/2012 11:51 AM, Ian Latter wrote:
>>> Hello Xavi,
>>>
>>>
>>>     Ok - thanks.  I was hoping that this was how read
>>> and write were working (i.e. with absolute offsets
>>> and not just getting relative offsets from the current
>>> seek point), however what of the raw seek
>>> command?
>>>
>>>        len = lseek(fd, 0, SEEK_END);
>>>
>>>        Upon  successful completion, lseek() returns
>>>        the resulting offset location as measured in
>>>        bytes from the beginning of the  file.
>>>
>>>     Any idea on where the return value comes from?
>>> I will need to fake up a file size for this command ..
>>>
>>>
>>>
>>> ----- Original Message -----
>>>> From: "Xavier Hernandez"<xhernandez at datalab.es>
>>>> To:<gluster-devel at nongnu.org>
>>>> Subject:  Re: [Gluster-devel] lseek
>>>> Date: Mon, 14 May 2012 09:48:17 +0200
>>>>
>>>> Hello Ian,
>>>>
>>>> there is no such thing as an explicit seek in glusterfs. Each readv,
>>>> writev, (f)truncate and rchecksum have an offset parameter that tells
>>>> you the position where the operation must be performed.
>>>>
>>>> If you make something that changes the size of the file you must make it
>>>> in a way that it is transparent to upper translators. This means that
>>>> all offsets you will receive are "real" (in your case, offsets in the
>>>> uncompressed version of the file). You should calculate in some way the
>>>> equivalent offset in the compressed version of the file and send it to
>>>> the corresponding fop of the lower translators.
>>>>
>>>> In the same way, you must return in all iatt structures the real size of
>>>> the file (not the compressed size).
>>>>
>>>> I'm not sure what is the intended use of NONSEEKABLE, but I think it is
>>>> for special file types, like devices or similar that are sequential in
>>>> nature. Anyway, this is a fuse flag that you can't return from a regular
>>>> translator open fop.
>>>>
>>>> Xavi
>>>>
>>>> On 05/14/2012 03:22 AM, Ian Latter wrote:
>>>>> Hello,
>>>>>
>>>>>
>>>>>      I'm looking for a seek (lseek) implementation in
>>>>> one of the modules and I can't see one.
>>>>>
>>>>>      Do I need to care about seeking if my module
>>>>> changes the file size (i.e. compresses) in Gluster?
>>>>> I would have thought that I did except that I believe
>>>>> that what I'm reading is that Gluster returns a
>>>>> NONSEEKABLE flag on file open (fuse_kernel.h at
>>>>> line 149).  Does this mitigate the need to correct
>>>>> the user seeks?
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Ian Latter
>>>>> Late night coder ..
>>>>> http://midnightcode.org/
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-devel mailing list
>>>>> Gluster-devel at nongnu.org
>>>>> https://lists.nongnu.org/mailman/listinfo/gluster-devel




