[Gluster-devel] RFC on posix locks migration to new graph after a switch

Wed Jun 20 19:37:11 UTC 2012

Avati,

We had relied on posix lock-healing (hereafter, "locks" refers to posix locks) done by protocol/client to migrate locks to the new graph. Lock healing is a feature of protocol/client which simply reacquires all the granted locks stored in the fd context after a reconnect to the server. We leverage it for migration as follows: we migrate fds to the new graph by opening a new fd on the same file in the new graph (with the fd context copied from the old graph), and protocol/client then reacquires all the granted locks in that fd context. But this solution has the following issues:

1. If we open fds in the new graph before the old-transport is cleaned up, the lock requests sent by protocol/client as part of healing will conflict with the locks still held on the old-transport and hence will fail. (Note that with a client-side-only graph switch, a single inode on the server corresponds to two inodes on the client, one in each of the old and new graphs.) As a result, locks are not migrated. The problem could have been solved if protocol/client issued SETLKW requests instead of SETLK (the lock requests issued as part of healing would be granted when the old-transport eventually disconnects), but that has a different set of issues. Even then, it is not a foolproof solution, since there might already be other conflicting lock requests in the lock wait queue when protocol/client starts lock healing, resulting in failure of the lock-heal.

2. If we open fds in the new graph after the old-transport is cleaned up, there is a window of time between old-transport cleanup and lock-heal in the new graph during which potentially conflicting lock requests could be granted, thereby causing the lock requests sent as part of lock healing to fail.
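To make the two failure modes concrete, here is a minimal toy model in Python. It is only a sketch under my own assumptions: the Lock and LockTable names are hypothetical, only exclusive locks are modelled, and the transport strings stand in for whatever the server really uses to tell connections apart. It is not the actual features/locks code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lock:
    transport: str  # hypothetical identifier of the client transport
    owner: int      # lock-owner
    start: int      # first byte of the range
    end: int        # last byte of the range (inclusive)


class LockTable:
    """Toy per-inode table of granted exclusive locks on the server."""

    def __init__(self):
        self.granted = []

    def setlk(self, req):
        # Non-blocking SETLK: refuse if an overlapping lock is held by a
        # different (transport, owner) pair.
        for held in self.granted:
            overlap = held.start <= req.end and req.start <= held.end
            same_holder = (held.transport, held.owner) == (req.transport, req.owner)
            if overlap and not same_holder:
                return False
        self.granted.append(req)
        return True

    def cleanup_transport(self, transport):
        # Drop every lock held through a disconnected transport.
        self.granted = [l for l in self.granted if l.transport != transport]


# Issue 1: heal on the new transport while the old transport still holds the lock.
table = LockTable()
table.setlk(Lock("old", 1, 0, 99))
heal_before_cleanup = table.setlk(Lock("new", 1, 0, 99))

# Issue 2: old transport is cleaned up first, and another client gets in
# before the heal request arrives.
table2 = LockTable()
table2.setlk(Lock("old", 1, 0, 99))
table2.cleanup_transport("old")
other_granted = table2.setlk(Lock("other", 2, 50, 60))
heal_after_cleanup = table2.setlk(Lock("new", 1, 0, 99))

print(heal_before_cleanup, other_granted, heal_after_cleanup)
```

In both orderings the heal request is refused, in the first case by the application's own lock on the old transport and in the second by a third party that was granted a lock in the window.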

One solution I can think of is to bring in a SETLK_MIGRATE lock command. SETLK_MIGRATE takes a transport identifier as a parameter along with the usual arguments SETLK/SETLKW take (lock range, lock-owner, etc.). It migrates a lock from the transport passed as the parameter to the transport on which the request came in, provided the two locks conflict only because they came from two different transports (all else, such as lock range and lock-owner, being the same). In the absence of any live locks, SETLK_MIGRATE behaves like the SETLK command.
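The intended server-side semantics could be sketched roughly as below. This is a toy model, not a proposed implementation: the Lock type, the list of granted locks, and the function names are all hypothetical, only exclusive locks are considered, and atomicity comes for free here because the model is single-threaded.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Lock:
    transport: str  # hypothetical identifier of the client transport
    owner: int      # lock-owner
    start: int
    end: int        # inclusive


def setlk(granted, req):
    """Plain non-blocking SETLK against a list of granted exclusive locks."""
    for held in granted:
        overlap = held.start <= req.end and req.start <= held.end
        same_holder = (held.transport, held.owner) == (req.transport, req.owner)
        if overlap and not same_holder:
            return False
    granted.append(req)
    return True


def setlk_migrate(granted, req, old_transport):
    """If old_transport holds a lock identical to req in everything but the
    transport, re-home that lock onto req.transport. Otherwise behave like
    a plain SETLK."""
    twin = replace(req, transport=old_transport)
    for i, held in enumerate(granted):
        if held == twin:
            granted[i] = req  # the lock never leaves the granted list
            return True
    return setlk(granted, req)


granted = [Lock("old", 1, 0, 99)]
migrated = setlk_migrate(granted, Lock("new", 1, 0, 99), "old")
fresh = setlk_migrate(granted, Lock("new", 1, 200, 299), "old")
blocked = setlk_migrate(granted, Lock("other", 2, 50, 60), "old")
print(migrated, fresh, blocked, granted)
```

Since the lock is swapped in place rather than released and reacquired, there is no window in which a third party's conflicting request can be granted.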

protocol/client can use this SETLK_MIGRATE command in the lock requests it sends as part of lock heal during the open fop, to migrate locks to the new graph. Assuming the old-transport has not been cleaned up at the time of lock-heal, SETLK_MIGRATE atomically migrates the locks from old-transport to new-transport (on the server). The difficulty is in getting the identifier of the old-transport on the server through which the locks are currently held. This can be solved if we store the peer transport identifier in the lk-context on the client (it can be easily obtained from an lk reply). We can then pass the same transport identifier to the server during healing.
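On the client side, the bookkeeping might look something like the following sketch. All names here (FdLockContext, on_lk_reply, heal_requests, the dictionary keys, and the "T-old" identifier) are made up for illustration and do not correspond to existing protocol/client structures or wire formats.

```python
from dataclasses import dataclass, field


@dataclass
class GrantedLock:
    owner: int
    start: int
    end: int
    peer_transport: str  # server-side id of the transport holding the lock


@dataclass
class FdLockContext:
    """Hypothetical lk-context kept in the fd context on the client."""
    locks: list = field(default_factory=list)

    def on_lk_reply(self, owner, start, end, peer_transport):
        # Remember each granted lock together with the transport identifier
        # the server reported in its lk reply.
        self.locks.append(GrantedLock(owner, start, end, peer_transport))

    def heal_requests(self):
        # Build one SETLK_MIGRATE request per stored lock, naming the
        # transport the lock should be migrated from.
        return [
            {
                "cmd": "SETLK_MIGRATE",
                "owner": lk.owner,
                "start": lk.start,
                "end": lk.end,
                "from_transport": lk.peer_transport,
            }
            for lk in self.locks
        ]


ctx = FdLockContext()
ctx.on_lk_reply(owner=1, start=0, end=99, peer_transport="T-old")
requests = ctx.heal_requests()
print(requests)
```

After a successful heal the client would presumably need to overwrite peer_transport with the identifier of the new transport, so that a later graph switch names the right source; that step is left out here.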

I haven't yet completely thought through some issues, such as whether protocol/client can unconditionally use SETLK_MIGRATE in all the lock requests it sends as part of healing, or whether it should use SETLK_MIGRATE only during the first attempt at healing after a graph-switch. However, even if protocol/client needs to make such a distinction, it can be easily worked out (either by fuse setting a special "migrate" key in the xdata of the open calls it sends as part of fd-migration, or by some different mechanism).

Please let me know your thoughts on this.

regards,
Raghavendra.