[Gluster-devel] mount.glusterfs health check troubles - help appreciated

Fri May 6 10:34:02 UTC 2016

I'm currently trying to straighten out the encrypted transport
(SSL/TLS socket) code, and make it more robust, and work well with
IPv6 in particular [1]. When testing the changes, the mount.glusterfs
script caused some trouble.

The mount script tries to check if the mount is online by performing a
stat on the mount point after the glusterfs command returns, and
umounts if the stat fails. This check is racy and doesn't always
do the right thing.
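
To make the behaviour concrete, here is a minimal sketch of the check
as described above. It is not the actual mount.glusterfs code (which is
a shell script); the names health_check and unmount are hypothetical,
and unmount is passed in as a callback so the sketch can be run without
touching real mounts.

```python
import os


def health_check(mount_point, unmount):
    """Return True if stat on mount_point succeeds.

    If stat fails, call unmount(mount_point) and return False,
    mirroring the "umount if the stat fails" behaviour described
    for the mount script. Hypothetical helper, not Gluster code.
    """
    try:
        os.stat(mount_point)
    except OSError:
        # stat failed: treat the mount as dead and tear it down.
        unmount(mount_point)
        return False
    return True
```

In a real script the callback would presumably invoke umount on the
path; here it can be any callable, such as a list's append method for
recording what would have been unmounted.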

The check is racy because it can run before the client
translators have been able to connect to the bricks. The following
sequence of events takes place during a mount, and helps explain
the race.

- mount script runs the glusterfs command
- mount process fetches the volfile
- mount process initializes the graph. The client xlator is also
initialized now, but the connections aren't started.
- mount process sends a PARENT_UP event to the graph. The client now
begins the connection process (portmap first, followed by connecting
to the brick). There is no guarantee yet that the connection has
been established.
- mount process returns
- mount script does a stat on mount point to check health
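
The sequence above can be modelled with a toy simulation. This is only
a sketch under simplifying assumptions: FakeMountProcess and
stat_would_succeed are made-up names, the network work is replaced by a
threading.Event, and the health check is assumed to pass only once
CHILD_UP has been reached.

```python
import threading


class FakeMountProcess:
    """Toy stand-in for the glusterfs mount process (not Gluster code)."""

    def __init__(self):
        # Stands in for the portmap query plus the brick connection.
        self.network_ready = threading.Event()
        # Set once the client xlator is usable (CHILD_UP).
        self.child_up = threading.Event()
        self._worker = None

    def _connect(self):
        # Runs in the background after PARENT_UP and blocks until the
        # simulated network round trips have completed.
        self.network_ready.wait()
        self.child_up.set()

    def start(self):
        # Fetch volfile, init graph, send PARENT_UP, then return
        # without waiting for CHILD_UP.
        self._worker = threading.Thread(target=self._connect, daemon=True)
        self._worker.start()

    def finish_connecting(self):
        # Let the simulated connection complete and wait for CHILD_UP.
        self.network_ready.set()
        self._worker.join()


def stat_would_succeed(proc):
    # In this toy model the health check passes only after CHILD_UP.
    return proc.child_up.is_set()
```

Calling start() and then stat_would_succeed() right away returns False,
which corresponds to the mount script checking too early; after
finish_connecting() the same check returns True.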

In an environment like the one I'm testing in, the connection can't
be completed by the time the health check is done. In my environment,
the client connection sequence is as follows:
- the portmap connection is started
 - the first address returned for the hostname is an IPv6 address. With
the IPv6 change that was merged recently, name lookups are done with
AF_UNSPEC, which can return IPv6 addresses. My environment returns v6
addresses first for getaddrinfo calls (which I think is the default
for a lot of environments)
 - the connection fails as glusterd doesn't listen on IPv6 addresses
(it listens on 0.0.0.0, which is v4 only)
 - a reconnection is made with the next address. This takes a while
because of the encrypted transports.
 - portmap query is done after connection is established and the port
is obtained
- the client xlator now reconnects to the obtained port.
 - (the same connection/reconnection cycle as above happens)
- once connection is established, handshakes are done
- CHILD_UP event is sent

After this point the client xlator becomes usable.

But this is not reached before the mount script does the health check
in my environment. So the mount ends up being terminated.
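
The address-ordering part of the connection sequence above can be
sketched as follows. This illustrates the general pattern only: the
function names connect_in_order and resolve are hypothetical, and the
inline loop is a simplification, since the Gluster client moves to the
next address through its reconnection logic rather than a loop like
this one.

```python
import socket


def resolve(host, port):
    # AF_UNSPEC asks for both IPv4 and IPv6 results; the order of the
    # returned entries is decided by the system resolver.
    return socket.getaddrinfo(host, port, socket.AF_UNSPEC,
                              socket.SOCK_STREAM)


def connect_in_order(addrinfos, timeout=1.0):
    """Try each getaddrinfo-style entry in order.

    Returns (connected_socket, failures), where failures is a list of
    (sockaddr, exception) pairs for the entries that did not work.
    The socket is None if every entry failed.
    """
    failures = []
    for family, socktype, proto, _canonname, sockaddr in addrinfos:
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock, failures
        except OSError as exc:
            # For example, connection refused because the server
            # only listens on IPv4.
            failures.append((sockaddr, exc))
            if sock is not None:
                sock.close()
    return None, failures
```

With a server listening only on an IPv4 address and a resolver that
lists the IPv6 entry first, the first attempt fails and the second one
succeeds, so every connection pays for one failed attempt before it
can proceed.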

Now the simplest solution would be to sleep for some time before doing
the check to give the xlators time to get ready. But this is
non-deterministic and isn't something I'm very fond of.
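
For comparison, a common variation on a fixed sleep is to poll the
check until a deadline. The sketch below is illustrative only, and
wait_until is a hypothetical helper, not something in the mount
script. It still relies on a timeout, so it narrows the problem
rather than making the check deterministic.

```python
import time


def wait_until(check, timeout, interval=0.1):
    """Call check() repeatedly until it returns True or timeout expires.

    Returns True as soon as check() succeeds, or False if the deadline
    passes first. check() is always called at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Compared to a plain sleep, this returns as soon as the check passes,
so a fast environment is not slowed down by a delay sized for a slow
one.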

This is turning out to be problematic in my very simple environment,
and I think it's gonna be a bigger problem in larger, more complex
environments. My environment is,
- single node
- single brick volume
- client is the same node
- IO transport encryption is on
- Management transport encryption is on
- IPv6 enabled in kernel, no actual IPv6 network is in place
(disabling IPv6 in kernel causes the problem to stop, but I want to
test with IPv6)

Does anyone else have ideas on how to fix this? (For now I've disabled
this check in the script).

~kaushal

[1]: https://review.gluster.org/#/q/status:open+project:glusterfs+branch:master+topic:bug-1333317