[Gluster-devel] glustershd status
Harshavardhana
harsha at harshavardhana.net
Thu Jul 17 04:54:49 UTC 2014
So here is what I found. It is a long email, so please bear with me.
It looks like the management daemon and the other daemons (e.g.
brick, nfs-server, gluster self-heal daemon) work in a non-blocking
manner: each daemon notifies the Gluster management daemon when it
becomes available and when it goes away. This is done through a
notify() callback mechanism.
A registered notify() handler is supposed to call setter() functions,
which update the state of the notified instance within the Gluster
management daemon.
Taking the self-heal daemon as an example:
"conf->shd->online" ---> is the primary value that should be set
during this notify callback, where the self-heal daemon informs the
Gluster management daemon of its availability. This happens on an
RPC_CLNT_CONNECT event.
During this event glusterd_nodesvc_set_online_status() sets the
necessary state to online or offline.
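To make the mechanism above concrete, here is a minimal Python sketch of
the notify-then-set pattern. It is only a model, not glusterd code:
RpcEvent, NodeService, set_online_status() and notify() are hypothetical
names chosen for illustration, loosely mirroring the connect event and
glusterd_nodesvc_set_online_status() mentioned above.

```python
from dataclasses import dataclass
from enum import Enum, auto


class RpcEvent(Enum):
    # Hypothetical stand-ins for the RPC client connect/disconnect events.
    CONNECT = auto()
    DISCONNECT = auto()


@dataclass
class NodeService:
    # Hypothetical mirror of the per-daemon state kept by the management
    # daemon (the role played by conf->shd->online in the text).
    name: str
    online: bool = False


def set_online_status(svc: NodeService, status: bool) -> None:
    # Setter in the spirit of glusterd_nodesvc_set_online_status().
    svc.online = status


def notify(svc: NodeService, event: RpcEvent) -> None:
    # Registered callback: translate an RPC event into a state update.
    if event is RpcEvent.CONNECT:
        set_online_status(svc, True)
    elif event is RpcEvent.DISCONNECT:
        set_online_status(svc, False)


shd = NodeService("glustershd")
notify(shd, RpcEvent.CONNECT)
print(shd.name, "online:", shd.online)
```

In this model the daemon is only reported online once a connect event
reaches notify(); if that event is never delivered, the state stays at
its initial "offline" value.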
What happens on FreeBSD/NetBSD is that this notify event doesn't occur
at all for some odd reason. There is in fact a first notify() event, but
it sets the value to "offline", i.e. status == 0 (gf_boolean_t
== _gf_false).
In fact this is true on Linux as well, only within a smaller time
window. Observe the output below, taken by running 'volume status'
immediately after a 'volume start':
# gluster volume status
Status of volume: repl
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick 127.0.1.1:/d/backends/brick1              49152   Y       29082
Brick 127.0.1.1:/d/backends/brick2              49153   Y       29093
NFS Server on localhost                         N/A     N       N/A
Self-heal Daemon on localhost                   N/A     N       N/A
Task Status of Volume repl
------------------------------------------------------------------------------
There are no active volume tasks
The two commands were run 1 second apart:
# gluster volume status
Status of volume: repl
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick 127.0.1.1:/d/backends/brick1              49152   Y       29082
Brick 127.0.1.1:/d/backends/brick2              49153   Y       29093
NFS Server on localhost                         2049    Y       29115
Self-heal Daemon on localhost                   N/A     Y       29110
Task Status of Volume repl
------------------------------------------------------------------------------
There are no active volume tasks
So the change does happen on Linux, but sadly it doesn't happen on
non-Linux platforms. My general speculation is that this is related to
poll()/epoll(); I have to debug this further.
In fact, restarting the Gluster management daemon fixes these issues,
which is understandable :-)
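For anyone who wants to check for this window mechanically, here is a
rough Python sketch that scans 'gluster volume status' text for
processes still reported offline. It assumes the row layout shown in
the output above (name, then Port, Online and Pid as the last three
whitespace-separated fields); offline_processes() and SAMPLE are
hypothetical names used only for illustration.

```python
from typing import List

# Row prefixes taken from the status output shown in this mail.
ROW_PREFIXES = ("Brick ", "NFS Server ", "Self-heal Daemon ")


def offline_processes(status_output: str) -> List[str]:
    """Return the names of processes whose Online column is 'N'."""
    offline = []
    for line in status_output.splitlines():
        if not line.startswith(ROW_PREFIXES):
            continue
        # Split off the last three fields: Port, Online, Pid.
        # "N/A" has no whitespace, so it stays a single field.
        parts = line.rsplit(None, 3)
        if len(parts) != 4:
            continue
        name, _port, online, _pid = parts
        if online == "N":
            offline.append(name)
    return offline


SAMPLE = """\
Brick 127.0.1.1:/d/backends/brick1 49152 Y 29082
NFS Server on localhost N/A N N/A
Self-heal Daemon on localhost N/A N N/A
"""

print(offline_processes(SAMPLE))
```

Feeding it the output of repeated 'volume status' runs would show
whether the list ever becomes empty. On the affected platforms it
presumably would not, until the management daemon is restarted.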
On Wed, Jul 16, 2014 at 9:41 AM, Emmanuel Dreyfus <manu at netbsd.org> wrote:
> Harshavardhana <harsha at harshavardhana.net> wrote:
>
>> Its pretty much the same on FreeBSD, i didn't spend much time debugging
>> it. Let me do it right away and let you know what i find.
>
> Right. Once you will have this one, I have Linux-specific truncate and
> md5csum replacements to contribute. I do not send them now since I
> cannot test them.
>
>
> --
> Emmanuel Dreyfus
> http://hcpnet.free.fr/pubz
> manu at netbsd.org
--
Religious confuse piety with mere ritual, the virtuous confuse
regulation with outcomes