[Gluster-devel] Some updates on the eventing framework for Gluster

Samikshan Bairagya sbairagy at redhat.com
Wed Dec 2 07:34:55 UTC 2015



On 12/02/2015 10:31 AM, Aravinda wrote:
> Hi Samikshan,
>
> Thanks for the updates. Looks like the current design is very much
> tied to dbus. dbus should be one of the notification mechanisms, but
> the eventing infrastructure should support multiple notification
> mechanisms in the future (like email, running scripts, websockets,
> etc.). We should have a plugin kind of architecture for data
> collection and the notification system.
>
> We may not need Kafka for storing cluster-wide events; we can maintain
> the state table in the Glusterd 2.0 distributed store. (The history of
> events can't be stored in the Glusterd 2.0 distributed store, but
> cluster-wide state can be maintained.)
>
>
> Sharing my thoughts about Eventing Infrastructure for Gluster,
>

Hi Aravinda,

Thanks a lot for your inputs. My responses can be found inline.

> Data Collection Plugins
> -----------------------
> Plugins to collect and post-process data from CLI commands, log files
> or any other sources.
>
> For example, Volume life cycle events can be captured using Hook
> scripts; we can get the state of a Volume, like a new Volume being
> created, Started, Stopped, Deleted, etc.
>

That is how storaged currently gets notified about volume life cycle
events. A dbus method call tells storaged to reload its state:

dbus-send --system                               \
--dest=org.storaged.Storaged --type=method_call  \
/org/storaged/Storaged/Manager                   \
org.storaged.Storaged.Manager.GlusterFS.Reload

All property changes are subsequently sent out as dbus signals. However,
this currently reloads the entire glusterfs state maintained by storaged
instead of reloading only the state of the volume that has changed. Is
there a way to send the volume name as a parameter of the dbus method
call?
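
Something like the following dbus-python sketch is what I have in mind;
note that the Reload signature taking a volume name is purely
hypothetical, since the current method takes no arguments:

import dbus

bus = dbus.SystemBus()
manager = bus.get_object('org.storaged.Storaged',
                         '/org/storaged/Storaged/Manager')
glusterfs = dbus.Interface(manager,
                           'org.storaged.Storaged.Manager.GlusterFS')

# Hypothetical: the current Reload() takes no arguments; if it grew
# a volume-name parameter, the call could look like this and storaged
# would refresh only that volume's state.
glusterfs.Reload('testvol')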


> Brick Health can be captured by watching Glusterd log files to get
> Brick down/up notifications.
>
> Geo-replication failures can be captured by analyzing Geo-rep logs.
>
> As part of writing plugins we need to identify missing information in
> log files. For example, the Volume set command logs the key and value
> as debug messages but does not log which Volume the command was issued
> on.
>

Appropriate hook scripts, as with the gluster volume life cycle events,
can be used to notify storaged that volume options have been set.
storaged can then process the XML output of "gluster volume get VOLNAME
OPTION --xml" to update the properties of the respective volume object.

> Gluster Eventing runs as a daemon (systemd service); when started, it
> rebuilds the state table using CLI commands and then starts watching
> log files or hooks.
>

I'm not sure if I'm misunderstanding, but storaged can already hold
gluster state information (obtained through the gluster CLI and hooks)
in the exported dbus objects.
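
For instance, any consumer on the node could query that state over the
bus with the standard properties interface; the volume object path and
interface name below are only illustrative, the actual layout being
whatever the storaged glusterfs module exports:

import dbus

bus = dbus.SystemBus()
# Illustrative object path for a volume named 'testvol'.
vol = bus.get_object('org.storaged.Storaged',
                     '/org/storaged/Storaged/glusterfs/volume/testvol')
props = dbus.Interface(vol, 'org.freedesktop.DBus.Properties')
# Illustrative interface name for the volume objects.
status = props.Get('org.storaged.Storaged.GlusterFS.Volume', 'Status')
print(status)  # 0 = Created, 1 = Started, 2 = Stopped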

> Distributed Store as State Table
> --------------------------------
> When data is collected from the plugins described above, it should be
> written to a cluster-wide state table. This is required to avoid
> duplicate notifications from the same node and from multiple nodes.
>

I would assume that redundant notifications would come from other nodes
when they are updated with the cluster-wide state. In such cases the
producers sending notifications to whatever handles the cluster-wide
messaging (Kafka/RabbitMQ/whatever) need some intelligence: they must
take into account the previous state that was logged.

As an example, when a gluster volume is started, the node where the
command is executed will send out a dbus signal citing a property change
in a particular dbus object. The producer running on this node can then
publish a corresponding message (key='status' and value=1) to a topic
named after the volume. Consumers running on other nodes would then get
this information and update their gluster state accordingly, which would
in turn send out dbus signals since dbus properties have changed.
However, the producers running on these other nodes should check the
last message under the topic and decide whether or not to publish a new
message under it.
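
A minimal sketch of that producer-side check, using the kafka-python
client; rather than reading the last message back from Kafka, it
approximates the check with a local cache of the last value published
per (topic, key):

from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers='localhost:9092')
last_published = {}  # (topic, key) -> value we last sent

def maybe_publish(topic, key, value):
    # Skip publishing if this is the value we last sent for this
    # (topic, key); avoids re-echoing a state change that merely
    # mirrors an update originating on another node.
    if last_published.get((topic, key)) == value:
        return
    producer.send(topic, key=key.encode(), value=value.encode())
    last_published[(topic, key)] = value

# e.g. volume 'testvol' was started on this node:
maybe_publish('testvol', 'status', '1')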

> For example, during each heartbeat a brick-down notification repeats
> in the log till that brick comes up. If we consume the logs and send a
> notification each time, we end up sending too many notifications. With
> a state table we can check the previous state; if the current state
> differs from the previous state, then send a notification.
>

For notifications regarding brick failures, it might be possible to get
that information through storaged if the path of the brick can be mapped
to the dbus object corresponding to the disk drive where the brick
resides. Once that mapping is done, notifications regarding failures can
be sent as mentioned in my previous message.
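
As a starting point for that mapping, the brick directory's device
number can be resolved through sysfs (a sketch; LVM/device-mapper
stacking would need further resolution, and the brick path is
illustrative):

import os

def block_device_for(path):
    # st_dev of the brick directory identifies the backing
    # filesystem's device; /sys/dev/block maps its major:minor
    # back to the kernel block device name.
    st = os.stat(path)
    link = os.readlink('/sys/dev/block/%d:%d'
                       % (os.major(st.st_dev), os.minor(st.st_dev)))
    return os.path.basename(link)  # e.g. 'sda1' or 'dm-3'

print(block_device_for('/bricks/brick1'))  # illustrative brick path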

> A local state db is sufficient for the above example, but in many
> cases the same event repeats on all the nodes of the cluster. A
> distributed store is used to prevent duplicate notifications across
> nodes.
>
> For POC we can use consul/etcd as distributed store, once Glusterd 2.0
> implements distributed store we can migrate to that store.
>
> In case of Node Event, (For Example, Brick process going down)
>
>      if node_event && node_event_status != prev_status ->
>          get_more_info() # Using CLI command
>          send_notification()
>
> In case of Cluster Event, (For example Volume Start/Stop/Create/Delete)
>
>      if cluster_event && cluster_event_status != prev_status ->
>          get_more_info()
>          LOCK_STORE_OR_WAIT
>          st = get_status_from_store()
>          if st != cluster_event_status ->
>              set_store_status()
>              send_notification()
>          LOCK_RELEASE
>
> Note: If consumers expect cluster event notifications from all nodes,
> then we may not need a distributed store; local state may be
> sufficient.
>
> Notification Plugins
> --------------------
> Notifications can be dbus events, email notifications or websocket
> notifications.
>
> Notifications can be divided into different channels so that consumers
> can subscribe to the channels they are interested in.
>
> My previous experiments with monitoring Gluster:
> 1. Effective GlusterFs monitoring using hooks -
> http://aravindavk.in/blog/effective-glusterfs-monitoring-using-hooks/
> 2. Introducing gdash - GlusterFS Dashboard -
> http://aravindavk.in/blog/introducing-gdash/
>
> regards
> Aravinda
>
> On 12/02/2015 06:47 AM, Nagaprasad Sathyanarayana wrote:
>> Any specific reasons for going with Kafka? What is the advantage of
>> using Kafka over RabbitMQ?
>>
>> Thanks
>> Naga
>>
>>
>>> On Dec 2, 2015, at 6:09 AM, Samikshan Bairagya <@redhat.com> wrote:
>>>
>>> Hi,
>>>
>>> The updates for the eventing framework for gluster can be divided
>>> into the following two parts.
>>>
>>> 1. Bubbling out notifications through dbus signals from every gluster
>>> node.
>>>
>>> * The 'glusterfs' module in storaged [1] exports objects on the
>>> system bus for every gluster volume. These objects hold the following
>>> properties:
>>> - Name
>>> - Id
>>> - Status (0 = Created, 1 = Started, 2 = Stopped)
>>> - Brickcount
>>>
>>> * A singleton dbus object corresponding to glusterd is also exported
>>> by storaged on the system bus. This object holds properties to track
>>> the state of glusterd (LoadState and ActiveState).
>>>
>>> 2. Aggregating all these signals from each node over an entire cluster.
>>>
>>> * Using Kafka [2] for messaging over a cluster: Implementing a (dbus
>>> signal) listener in python that converts these dbus signals from
>>> objects to 'keyed messages' in Kafka under a particular 'topic'.
>>>
>>> For example, if a volume 'testvol' is started, a message is published
>>> under topic 'testvol', with 'status' as the 'key' and the changed
>>> status ('1' in this case) as the 'value'.
>>>
>>>
>>> *** Near term plans:
>>> - Export dbus objects corresponding to bricks.
>>> - Figure out how to map the brick directory path to the block
>>> device and consequently the drive object. The 'SmartFailing'
>>> property from the org.storaged.Storaged.Drive.Ata [3] interface can
>>> then be used to track brick failures.
>>> - Make the framework work over a multi-node cluster with possibly a
>>> multi-broker kafka setup to identify redundancies as well as to keep
>>> consistent information across the cluster.
>>>
>>> Views/feedback/queries are welcome.
>>>
>>> [1] https://github.com/samikshan/storaged/tree/glusterfs
>>> [2] http://kafka.apache.org/documentation.html#introduction
>>> [3]
>>> http://storaged-project.github.io/doc/latest/gdbus-org.storaged.Storaged.Drive.Ata.html#gdbus-property-org-storaged-Storaged-Drive-Ata.SmartFailing
>>>
>>>
>>> Thanks and Regards,
>>>
>>> Samikshan
>>> _______________________________________________
>>> Gluster-devel mailing list
>>> Gluster-devel at gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>
>>>
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>

Thanks and Regards,

Samikshan

