[Gluster-devel] Upcalls Infrastructure

Soumya Koduri skoduri at redhat.com
Thu Apr 16 09:56:09 UTC 2015


Hi,

We have made a few changes to the design of the lease-locks as per the
latest discussion (thanks to everyone involved). Please find the
details in the section linked below and provide your comments.

http://www.gluster.org/community/documentation/index.php/Features/Upcall-infrastructure#delegations.2Flease-locks

In addition (thanks to the AFR team), we have formulated an approach
to filter out duplicate callback notifications.

http://www.gluster.org/community/documentation/index.php/Features/Upcall-infrastructure#Dependencies

<<<<
Filter out duplicate notifications
     In case of replica bricks maintained by AFR/EC, the upcall state
is maintained and processed on all the replica bricks. This results in
duplicate notifications being sent by all those bricks in case of
non-idempotent fops. Similarly, for distributed volumes, cache
invalidation notifications on a directory entry will be sent by all
the bricks that are part of that volume. Hence we need support to
filter out such duplicate callback notifications.

     The approach we plan to take to address this:

     - Add a new xlator on the client side to track all the fops; it
       could create a unique transaction id and send it to the server.
     - The server stores this transaction id in the client info as
       part of the upcall state.
     - While sending any notification, the server includes this
       transaction id in the request.
     - The client (the new xlator) filters out duplicate notifications
       based on the transaction ids received.
>>>>
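
To make the above approach a bit more concrete, here is a minimal,
self-contained sketch of the client-side filtering step. This is not
actual xlator code; the names (txn_cache_t, txn_cache_seen,
handle_upcall_notification) are hypothetical, and the real
implementation would bound the cache by the notification lifetime and
key it per client/inode rather than use a fixed ring buffer.

    /* Minimal sketch of client-side duplicate filtering by transaction
     * id. Hypothetical, simplified code -- not the actual upcall
     * xlator. Assumes real transaction ids are non-zero. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdbool.h>

    #define TXN_CACHE_SIZE 1024

    typedef struct {
        uint64_t ids[TXN_CACHE_SIZE];  /* recently seen transaction ids */
        size_t   next;                 /* ring-buffer insertion point   */
    } txn_cache_t;

    /* Returns true if txn_id was seen before; otherwise records it. */
    static bool
    txn_cache_seen (txn_cache_t *cache, uint64_t txn_id)
    {
        for (size_t i = 0; i < TXN_CACHE_SIZE; i++) {
            if (cache->ids[i] == txn_id)
                return true;
        }
        cache->ids[cache->next] = txn_id;
        cache->next = (cache->next + 1) % TXN_CACHE_SIZE;
        return false;
    }

    /* Called for every upcall notification received from a brick. */
    static void
    handle_upcall_notification (txn_cache_t *cache, uint64_t txn_id,
                                const char *gfid)
    {
        if (txn_cache_seen (cache, txn_id))
            return;   /* duplicate sent by another replica brick -- drop */

        printf ("deliver upcall for gfid %s (txn %llu)\n",
                gfid, (unsigned long long) txn_id);
    }

    int
    main (void)
    {
        txn_cache_t cache = { .ids = {0}, .next = 0 };

        handle_upcall_notification (&cache, 42, "gfid-1"); /* delivered */
        handle_upcall_notification (&cache, 42, "gfid-1"); /* filtered  */
        return 0;
    }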

Due to time constraints, the support to filter out duplicate
notifications may be added only in subsequent releases of the 3.7
series.

The link below contains the list of TODO items related to this
feature:

http://www.gluster.org/community/documentation/index.php/Features/Upcall-infrastructure#TODO

Let me know if any of you are interested in working on any of those
items. I would be happy to assist you :)

Thanks,
Soumya

On 01/22/2015 02:31 PM, Soumya Koduri wrote:
> Hi,
>
> I have updated the feature page with more design details and the
> dependencies/limitations this support has.
>
> http://www.gluster.org/community/documentation/index.php/Features/Upcall-infrastructure#Dependencies
>
>
> Kindly check the same and provide your inputs.
>
> A few of them which may be addressed for the 3.7 release are:
>
> *AFR/EC*
>      - In case of replica bricks maintained by AFR, the upcall state
> is maintained and processed on all the replica bricks. This will
> result in duplicate notifications being sent by all those bricks in
> case of non-idempotent fops.
>      - Hence we need support in AFR to filter out such duplicate
> callback notifications. Similar support is needed for EC as well.
>      - One of the approaches suggested by the AFR team is to cache
> the upcall notifications received for around 1 min (their current
> lifetime) to detect and filter out the duplicate notifications sent
> by the replica bricks.
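
As a rough illustration of this 1-minute-cache idea (hypothetical,
simplified code; notif_is_duplicate and the fixed-size cache are made
up for illustration, not AFR/EC internals), a notification would be
delivered only if an identical (gfid, event) pair has not already been
seen within the notification lifetime:

    /* Rough sketch of the "cache notifications for ~1 min" idea.
     * Hypothetical, simplified code -- not AFR/EC internals. */
    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>
    #include <time.h>

    #define NOTIF_LIFETIME  60    /* seconds, matching the ~1 min lifetime */
    #define NOTIF_CACHE_MAX 256

    typedef struct {
        char   gfid[40];      /* textual gfid of the affected inode */
        int    event;         /* upcall event type                  */
        time_t received_at;   /* when the first copy arrived        */
    } notif_entry_t;

    static notif_entry_t notif_cache[NOTIF_CACHE_MAX];
    static int           notif_count;

    /* Returns true if an identical notification was already seen within
     * NOTIF_LIFETIME seconds, i.e. it is a duplicate from another brick. */
    static bool
    notif_is_duplicate (const char *gfid, int event)
    {
        time_t now  = time (NULL);
        int    kept = 0;

        /* drop expired entries while compacting the cache */
        for (int i = 0; i < notif_count; i++) {
            if (now - notif_cache[i].received_at >= NOTIF_LIFETIME)
                continue;
            notif_cache[kept++] = notif_cache[i];
        }
        notif_count = kept;

        for (int i = 0; i < notif_count; i++) {
            if (notif_cache[i].event == event &&
                strcmp (notif_cache[i].gfid, gfid) == 0)
                return true;   /* duplicate */
        }

        if (notif_count < NOTIF_CACHE_MAX) {   /* remember this one */
            snprintf (notif_cache[notif_count].gfid,
                      sizeof (notif_cache[notif_count].gfid), "%s", gfid);
            notif_cache[notif_count].event       = event;
            notif_cache[notif_count].received_at = now;
            notif_count++;
        }
        return false;
    }

    int
    main (void)
    {
        bool first  = notif_is_duplicate ("gfid-1", 1);   /* false */
        bool second = notif_is_duplicate ("gfid-1", 1);   /* true  */
        return (!first && second) ? 0 : 1;
    }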
>
>
> *Cleanup during network disconnect - protocol/server*
>     - At present, in case of a network disconnect between the
> glusterfs server and the client, protocol/server looks up the fd
> table associated with that client and sends a 'flush' op for each of
> those fds to clean up the locks associated with them.
>
>     - We need similar support to flush the lease-locks taken. Hence,
> while granting a lease-lock, we plan to associate that upcall_entry
> with the corresponding fd_ctx or inode_ctx so that it can be easily
> tracked when it needs to be cleaned up. This will also help in faster
> lookup of the upcall entries while processing fops on the same
> fd/inode.
>
> Note: The above cleanup is done only for the upcall state associated
> with lease-locks. For the other entries maintained (e.g., for
> cache invalidation), the reaper thread (which will be used to clean
> up the expired entries in this xlator) will clean up those states as
> well once they expire.
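
To illustrate what associating the upcall_entry with the fd_ctx buys
us on disconnect, here is a minimal sketch with made-up types
(upcall_entry_t, fd_ctx_t, flush_leases_on_disconnect); the real code
would hang the entry off the GlusterFS fd_ctx/inode_ctx rather than
these simplified structs:

    /* Sketch of tying a lease upcall entry to its fd so it can be
     * flushed on client disconnect. Hypothetical, simplified types. */
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct upcall_entry {
        char client_uid[64];    /* which client holds the lease         */
        int  lease_type;        /* e.g. read or write lease             */
    } upcall_entry_t;

    typedef struct fd_ctx {
        upcall_entry_t *lease;  /* lease granted on this fd, if any     */
        struct fd_ctx  *next;   /* next fd belonging to the same client */
    } fd_ctx_t;

    /* Called from the disconnect path (where protocol/server today
     * sends 'flush' for each fd): release the lease tied to every fd
     * of the disconnected client. */
    static void
    flush_leases_on_disconnect (fd_ctx_t *client_fds)
    {
        for (fd_ctx_t *fdc = client_fds; fdc; fdc = fdc->next) {
            if (!fdc->lease)
                continue;
            printf ("releasing lease type %d held by %s\n",
                    fdc->lease->lease_type, fdc->lease->client_uid);
            free (fdc->lease);
            fdc->lease = NULL;
        }
    }

    int
    main (void)
    {
        upcall_entry_t *lease = calloc (1, sizeof (*lease));
        snprintf (lease->client_uid, sizeof (lease->client_uid),
                  "client-1");
        lease->lease_type = 1;

        fd_ctx_t fd1 = { .lease = lease, .next = NULL };
        fd_ctx_t fd0 = { .lease = NULL,  .next = &fd1 };

        /* simulate the disconnect path for this client's fd list */
        flush_leases_on_disconnect (&fd0);
        return 0;
    }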
>
> *Replay of the lease-locks state*
>    - At present, replay of locks by the client xlator (after a
> network disconnect and reconnect) seems to have been disabled.
>    - But when it is re-enabled, we need to add support to replay the
> lease-locks taken as well.
>    - Till then, this will be considered a limitation and will be
> documented, as suggested by KP.
>
> Thanks,
> Soumya
>
>
> On 12/16/2014 09:36 AM, Krishnan Parthasarathi wrote:
>>>>
>>>> - Is there a new connection from glusterfsd (upcall xlator) to
>>>>   a client accessing a file? If so, how does the upcall xlator
>>>>   reuse connections when the same client accesses multiple files,
>>>>   or does it?
>>>>
>>> No. We are using the same connection which the client initiates to
>>> send in fops. Thanks for pointing me initially to the 'client_t'
>>> structure. As these connection details are available only in the
>>> server xlator, I am passing them to the upcall xlator by storing
>>> them in 'frame->root->client'.
>>>
>>>> - In the event of a network separation (i.e., a partition) between
>>>>   a client and a server, how does the client discover or detect
>>>>   that the server has 'freed' up its previously registered upcall
>>>>   notification?
>>>>
>>> The rpc connection details of each client are stored based on its
>>> client-uid. So in case of a network partition, when the client
>>> comes back online, IMO it re-initiates the connection (along with a
>>> new client-uid).
>>
>> How would a client discover that a server has purged its upcall entries?
>> For instance, a client could assume that the server would notify it about
>> changes as before (while the server has purged the client's upcall
>> entries)
>> and assume that it still holds the lease/lock. How would you avoid that?
>>
>>> Please correct me if that's not the case. So there will be new
>>> entries created/added in this xlator. However, we still need to
>>> decide on how to clean up the old timed-out and stale entries:
>>>     * either clean up the entries as and when we find an expired or
>>> stale entry (in case a notification fails),
>>>     * or spawn a new thread which periodically scans through this
>>> list and cleans up those entries.
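
For illustration, a minimal sketch of the second option, a reaper
thread that periodically scans the list and frees expired entries.
This is hypothetical, simplified code; the structs, lifetime and
interval values are placeholders, not the actual upcall xlator:

    /* Sketch of a reaper thread that periodically scans the upcall
     * entries and drops the expired ones. Hypothetical code. */
    #include <pthread.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>

    #define ENTRY_LIFETIME 60   /* seconds an entry stays valid  */
    #define REAP_INTERVAL  10   /* how often the reaper wakes up */

    typedef struct upcall_entry {
        time_t               access_time;  /* last time client was seen */
        struct upcall_entry *next;
    } upcall_entry_t;

    static upcall_entry_t *upcall_list;
    static pthread_mutex_t upcall_lock = PTHREAD_MUTEX_INITIALIZER;

    static void *
    upcall_reaper (void *arg)
    {
        (void) arg;
        for (;;) {
            sleep (REAP_INTERVAL);
            pthread_mutex_lock (&upcall_lock);
            upcall_entry_t **pp = &upcall_list;
            while (*pp) {
                if (time (NULL) - (*pp)->access_time >= ENTRY_LIFETIME) {
                    upcall_entry_t *dead = *pp;
                    *pp = dead->next;          /* unlink expired entry */
                    free (dead);
                } else {
                    pp = &(*pp)->next;
                }
            }
            pthread_mutex_unlock (&upcall_lock);
        }
        return NULL;
    }

    int
    main (void)
    {
        pthread_t tid;
        pthread_create (&tid, NULL, upcall_reaper, NULL);
        pthread_join (tid, NULL);   /* the sketch just runs forever */
        return 0;
    }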
>>
>> There are a couple of aspects to resource cleanup in this context:
>> 1) When to clean up; e.g., on expiry of a timer.
>> 2) The order of cleaning up; this involves clearly establishing the
>>    relationships among inode, upcall entry and client_t(s). We
>>    should document this.
>>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel

