[Gluster-devel] locking

Thu Mar 1 17:07:05 UTC 2012

Please find answers inline.

On Wed, Feb 29, 2012 at 7:16 AM, Xavier Hernandez <xhernandez at datalab.es> wrote:
> Hello gluster developers,
>
> I'm working on a new translator that needs to implement locking strategies
> over inodes and directory entries for some file operations. The translator
> uses multiple subvolumes.
>
> I've been searching the internet and browsing the source code, but there are
> some aspects that I can't completely understand.
>
> First of all, I've noticed that there are 3 kinds of locks. Two of them are
> easy to understand: inode and entry locks. But there is another kind of lock
> whose use and purpose I don't understand. It's called 'reserve lock'. Can you
> explain the rationale for this lock, how it's used, and what I have to do if
> I receive it from an upper translator?

Reserve lock was an experimental feature introduced to meet the
requirements of an initial lock self-healing design. It is unused now.

> Regarding inode locks, the structure gf_flock has basic fields that define
> the type and range of the lock (l_type, l_whence, l_start and l_len). There
> are two other fields (l_pid and l_owner) that I thought were used to
> identify the owner of the lock. However, they don't seem to be used in the
> features/locks translator. It uses the pid and owner taken from
> call_stack_t.pid and call_stack_t.lk_owner of the current frame. Also, the
> cluster/afr translator uses these fields when it creates locks.
>
> Are l_pid and l_owner of structure gf_flock really unused?

l_pid and l_owner are really useful only in the GETLK call to identify
the owner of a lock on a given region. For SETLK operations it is
frame->root->{pid,lk_owner} that gets used for the purpose of lock
granting.
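
As a rough analogy outside GlusterFS, plain POSIX fcntl() record locks
show the same split: l_pid in struct flock is ignored when a lock is set,
and is filled in by F_GETLK to name the holder of a conflicting lock. The
sketch below is only an illustration under that assumption (a POSIX
system with a writable /tmp); the function name
getlk_reports_lock_holder is made up here, and none of this is GlusterFS
code or uses gf_flock.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper for illustration only (not part of GlusterFS).
 * The parent takes a write lock with a bogus l_pid, then a child probes
 * the same range with F_GETLK.
 * Returns 1 if the probe reports the parent's PID as the lock holder,
 * 0 if it does not, and -1 on a setup error. */
int getlk_reports_lock_holder(void)
{
    char path[] = "/tmp/getlk-demo-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the open descriptor keeps the file usable */

    struct flock set = {0};
    set.l_type = F_WRLCK;
    set.l_whence = SEEK_SET;
    set.l_start = 0;
    set.l_len = 10;
    set.l_pid = 12345; /* deliberately bogus: F_SETLK ignores this field */
    if (fcntl(fd, F_SETLK, &set) < 0) {
        close(fd);
        return -1;
    }

    pid_t parent = getpid();
    pid_t child = fork();
    if (child < 0) {
        close(fd);
        return -1;
    }
    if (child == 0) {
        /* The child inherits the descriptor but not the parent's lock. */
        struct flock probe = {0};
        probe.l_type = F_WRLCK;
        probe.l_whence = SEEK_SET;
        probe.l_start = 0;
        probe.l_len = 10;
        if (fcntl(fd, F_GETLK, &probe) < 0)
            _exit(2);
        /* On conflict, F_GETLK overwrites probe with the blocking lock. */
        _exit((probe.l_type != F_UNLCK && probe.l_pid == parent) ? 0 : 1);
    }

    int status = 0;
    if (waitpid(child, &status, 0) < 0) {
        close(fd);
        return -1;
    }
    close(fd);
    if (!WIFEXITED(status))
        return -1;
    int code = WEXITSTATUS(status);
    return code == 0 ? 1 : (code == 1 ? 0 : -1);
}
```

Calling getlk_reports_lock_holder() from a small main() is enough to try
it. The owner reported to the child is the real lock holder, not the
value that was written into l_pid before F_SETLK.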

> Another unrelated question... there are two very similar fops: readdir and
> readdirp. The only difference between them is an additional dict_t argument.
> It seems to contain parameters, but I don't know for what purpose.
> While running my translator I only receive readdirp requests from upper
> translators, but the dict_t argument is always NULL. Is readdir really
> functionally equivalent to readdirp with this argument set to NULL? Do I
> need any special handling for this argument?

The main difference between the two is that readdir returns entries
(just names), while readdirp returns handles and attributes along with
the entry names. Setting the dict_t parameter to NULL makes no difference
to this behaviour. You can compare the posix_readdir/p functions in
xlators/storage/posix/src/posix.c to see the actual difference between
the two calls.
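
To make the names-only versus names-plus-attributes distinction concrete,
here is a minimal sketch using plain POSIX calls. It is not the code from
posix.c: struct entry_plus, list_names and list_entries_plus are
hypothetical names invented for this example, and the "plus" variant
simply stats each entry as it is read.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical entry type for this sketch; not GlusterFS's gf_dirent_t. */
struct entry_plus {
    char name[256];
    struct stat attrs;
    int have_attrs; /* 1 if attrs was filled in, 0 if the stat failed */
};

/* readdir-style listing: copies up to max entry names and nothing else.
 * Returns the number of entries copied, or -1 if the directory cannot
 * be opened. */
int list_names(const char *path, char names[][256], int max)
{
    DIR *dir = opendir(path);
    if (!dir)
        return -1;
    int n = 0;
    struct dirent *de;
    while (n < max && (de = readdir(dir)) != NULL) {
        snprintf(names[n], 256, "%s", de->d_name);
        n++;
    }
    closedir(dir);
    return n;
}

/* readdirp-style listing: same walk, but each name is returned together
 * with its attributes, saving the caller a separate stat per entry. */
int list_entries_plus(const char *path, struct entry_plus *out, int max)
{
    DIR *dir = opendir(path);
    if (!dir)
        return -1;
    int n = 0;
    struct dirent *de;
    while (n < max && (de = readdir(dir)) != NULL) {
        snprintf(out[n].name, sizeof(out[n].name), "%s", de->d_name);
        out[n].have_attrs =
            (fstatat(dirfd(dir), de->d_name, &out[n].attrs,
                     AT_SYMLINK_NOFOLLOW) == 0);
        n++;
    }
    closedir(dir);
    return n;
}
```

Both functions walk the directory the same way; the only difference is
what each returned entry carries, which mirrors the distinction described
above.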

Thanks
Avati