[Gluster-devel] Adding ALUA support for Gluster-Block
skoduri at redhat.com
Mon Oct 29 17:36:56 UTC 2018
On 10/29/18 4:15 PM, Susant Palai wrote:
> I would be interested to know if you can use leases/delegations to solve
> the issue. If you cannot, can leases/delegations be extended instead of
> proposing a new API?
> From what I understand, Block-D keeps all the files open before the
> session begins (exporting files as block devices), which I guess won't
> work with leases, since (please correct me if I'm wrong) an open
> request breaks the existing lease. Certainly, with the self-heal
> daemon the lease will be released. Hence, mandatory locks fit here IMO.
Right. Leases are mostly useful when data caching is needed for the
single-writer or multiple-readers case. Here, IIUC, the issue being solved
is avoiding data corruption after failover/failback of the switch.
> @Kalever, Prasanna (pkalever at redhat.com): please give your
> feedback here.
> In theory, the highly available NFS-Ganesha and Samba services should
> have solved similar problems already.
> From what I understand, the multipath layer does not have any control
> over restarting tcmu-runner on the Gluster side (if that is how NFS-Ganesha
> and Samba provide blacklisting for their clients).
> targetcli performs certain tasks only on a failover switch, such as
> taking the mandatory lock and opening a session, as mentioned in the
> design doc. Hence, there is no control over data cached at the
> Gluster-client layer being replayed in the event of a disconnection.
NFS servers solve this by putting servers into grace and allowing
clients to reclaim their lost state post failover and failback.
Internally, since NFS-Ganesha stacks on top of gfapi, it too would
need reclaim-lock support in gfapi to acquire lost state from another
NFS server (but certainly not the way the current implementation is
being done). I have left comments on the patch. The current approach
would make all gfapi applications vulnerable and, as Amar mentioned, it
could lead to a potential CVE.
To solve this, gluster-block could agree upon some common lk-owner (unique
to that initiator). That way gfapi need not fetch it, and other
non-trusted clients can be prevented from acquiring that lock by force.
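
To make the lk-owner idea concrete, here is a rough toy model in Python.
It is only a sketch and not gfapi code: LockTable, lk_owner_for and the
IQN strings are hypothetical names, and deriving the owner by hashing the
initiator name is my assumption, not something the design doc specifies.

```python
import hashlib


def lk_owner_for(initiator_iqn: str) -> bytes:
    """Derive a stable lk-owner from the initiator name (assumed scheme)."""
    return hashlib.sha256(initiator_iqn.encode()).digest()[:8]


class LockTable:
    """Toy stand-in for the lock state kept for a single file."""

    def __init__(self):
        self.holder = None  # (client_id, lk_owner) or None when unlocked

    def acquire(self, client_id, lk_owner):
        """Normal lock request: granted only if free or already held by caller."""
        if self.holder is None:
            self.holder = (client_id, lk_owner)
            return True
        return self.holder == (client_id, lk_owner)

    def reclaim(self, client_id, lk_owner):
        """Forced takeover: allowed only when the caller presents the
        lk-owner already recorded on the lock."""
        if self.holder is None or self.holder[1] != lk_owner:
            return False
        self.holder = (client_id, lk_owner)
        return True


# Hypothetical failover: nodeA holds the lock, nodeB takes over for the
# same initiator, and an unrelated client is refused.
owner = lk_owner_for("iqn.2018-10.com.example:initiator1")
table = LockTable()
table.acquire("nodeA", owner)
table.reclaim("nodeB", owner)
table.reclaim("rogue", lk_owner_for("iqn.2018-10.com.example:other"))
```

In this model only a node that already knows the agreed lk-owner can take
the lock over, so nothing has to be fetched through gfapi.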
Coming to the other problem quoted in the design doc (replaying of fops
by gfapi clientA (nodeA) after nodeB relinquishes the lock), I have a
couple of questions about who shall be responsible for replaying those
fops. I have commented on the doc.
Once gfapi_client/nodeA disconnects and reconnects, and if tcmu-runner
replays the fop, wouldn't it need to reopen the fd, since there was a
network disconnect and the old fd had gone stale? If it reopens the fd,
it will get a new generation no./epoch time, and that will allow it to
replay old pending fops, right?
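
The reopen concern can be sketched with a small Python model. This is an
assumption about how the generation check might behave, not the design
doc's actual mechanism: Brick, StaleFd and lock_handover are hypothetical
names, and I assume an fd is simply stamped with the current generation
when it is opened.

```python
class StaleFd(Exception):
    """Raised when a write arrives on an fd from an older generation."""


class Brick:
    """Toy server side: one file guarded by a generation counter."""

    def __init__(self):
        self.generation = 0
        self.data = []

    def open(self):
        # The fd is modelled as just the generation seen at open time.
        return self.generation

    def lock_handover(self):
        # Assumed: the generation is bumped whenever the lock changes hands.
        self.generation += 1

    def write(self, fd_generation, payload):
        if fd_generation != self.generation:
            raise StaleFd("fd generation %d != %d"
                          % (fd_generation, self.generation))
        self.data.append(payload)


brick = Brick()
fd_a = brick.open()          # nodeA opens, then loses connectivity
brick.lock_handover()        # lock moves to nodeB
fd_b = brick.open()
brick.write(fd_b, "new")     # nodeB writes fresh data

try:
    brick.write(fd_a, "old")  # replay on the stale fd is fenced off
except StaleFd:
    pass

fd_a2 = brick.open()         # nodeA reopens and gets the new generation
brick.write(fd_a2, "old")    # the old pending fop is now accepted
```

Under these assumptions the stale fd is rejected, but the same pending fop
goes through after a reopen and lands after nodeB's newer write, which is
the case I am asking about.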