[Gluster-devel] [RFC] Enable message IDs in Gluster logs

Tue Dec 3 12:09:59 UTC 2013

Problem statement: Currently there are quite a slew of logs in Gluster that do not lend themselves to trivial analysis by various tools that help collect and monitor logs. 

This FEAT is to make this _tooling_ better by giving logs some form of message IDs so that the tools do not have to do complex log parsing to break it down to problem areas and suggest troubleshooting options.

This RFE also intends to generate a catalog of such IDs and messages, which can then be elaborated with various troubleshooting and resolution options.

The idea behind the catalog is that, at each release point, we have a catalog for that release (updated during further maintenance releases), which can help point to some documentation on common errors and how to recover from them. So the message IDs,
- do not have to be unique across releases
- may need a little reconciliation with the catalog on an ongoing basis, based on string changes
- do not have to be unique for each message, as the documentation can cover collisions to provide clarity

Proposed solution: Add a hash of the message format to the log message, to serve as the message ID.

IOW, the FMT in gf_log(dom, levl, fmt...) (and equivalents) is used to generate a 32-bit hash, and the log is printed with the hash. As an example,
I [socket.c:3533:socket_init][29004409] 0-testv-client-2: SSL support is NOT enabled
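To make the idea concrete, here is a minimal sketch of hashing the format string (not the expanded message) and prefixing the result to the output. The hash function is not fixed by this proposal; FNV-1a is used here purely as an assumption for illustration, and msgid_hash() and format_with_id() are hypothetical names, not existing Gluster functions.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* 32-bit FNV-1a over the raw format string.
 * Assumption: any reasonable 32-bit string hash would do; FNV-1a is
 * only an illustrative choice. */
uint32_t msgid_hash(const char *fmt)
{
    uint32_t h = 2166136261u;               /* FNV offset basis */
    const unsigned char *p;

    assert(fmt != NULL);
    for (p = (const unsigned char *)fmt; *p != '\0'; p++) {
        h ^= *p;
        h *= 16777619u;                     /* FNV prime */
    }
    return h;
}

/* Hypothetical helper: writes "[<id>] <expanded message>" into buf.
 * The ID is computed from fmt before argument expansion, so every
 * instance of the same message carries the same ID.
 * Returns the number of characters that would have been written,
 * or a negative value on error (same convention as snprintf). */
int format_with_id(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n, m;

    n = snprintf(buf, len, "[%08" PRIx32 "] ", msgid_hash(fmt));
    if (n < 0 || (size_t)n >= len)
        return n;

    va_start(ap, fmt);
    m = vsnprintf(buf + n, len - (size_t)n, fmt, ap);
    va_end(ap);

    return (m < 0) ? m : n + m;
}
```

Because the hash is taken over fmt and not over the final string, two log lines that differ only in their arguments (volume names, error codes, etc.) get the same ID, which is what the catalog lookup needs.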

Advantages:
- Minimal code change (only the gf_log_* needs to change to further compute a hash of the message and add it to the log)
- No developers need to remember to generate a message ID before adding any messages etc.
- Cost of hashing during logging, as measured using an example, proved to be negligible (almost no impact whatsoever for about 10K messages printed)

Disadvantages:
- Hash collision, hence 2 different messages having the same ID (but not an issue due to the catalog/documentation as above)
- Hash changes due to changes in messages
  - could be minor string edits leading to better messages, in which case the hash may change and we need to retain the same documentation (reconciliation of catalog between releases)
  - could be major string changes or parameter additions, in which case both hash and the corresponding catalog documentation should change (hence not an issue)

Catalogue generation would have to parse the code to extract all used format strings (which is an interesting problem in itself).
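As a rough sketch of what that extraction could look like, the function below scans a source buffer for the next gf_log-style call, skips the first two arguments, and copies out the format string, joining adjacent string literals. This is only an illustration under simplifying assumptions: next_gf_log_fmt() is a hypothetical name, the format is assumed to be the third argument, and comments, character literals, macros and formats passed via variables are not handled. A real catalog generator would more likely build on a proper C parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Walk over a string literal starting at its opening quote and return a
 * pointer just past the closing quote. If out is non-NULL, the raw
 * contents (escape sequences kept as written) are appended to it. */
static const char *copy_literal(const char *p, char *out, size_t outlen,
                                size_t *used)
{
    p++;                                    /* skip opening quote */
    while (*p != '\0' && *p != '"') {
        if (*p == '\\' && p[1] != '\0') {   /* keep escapes verbatim */
            if (out && *used + 1 < outlen)
                out[(*used)++] = *p;
            p++;
        }
        if (out && *used + 1 < outlen)
            out[(*used)++] = *p;
        p++;
    }
    return (*p == '"') ? p + 1 : p;
}

/* Find the next gf_log* call in src and copy its format string to out.
 * Returns a pointer from which scanning can resume, or NULL when no
 * further call is found. out is left empty if the third argument is
 * not a string literal. */
const char *next_gf_log_fmt(const char *src, char *out, size_t outlen)
{
    size_t used = 0;
    int depth = 0, commas = 0;
    const char *p;

    assert(src != NULL && out != NULL);
    p = strstr(src, "gf_log");
    if (p == NULL || outlen == 0)
        return NULL;
    out[0] = '\0';

    p += strlen("gf_log");
    while (isalnum((unsigned char)*p) || *p == '_')  /* gf_log_xyz variants */
        p++;
    while (isspace((unsigned char)*p))
        p++;
    if (*p != '(')
        return p;                           /* not a call; resume here */
    p++;

    /* Skip the first two top-level arguments (domain and level). */
    while (*p != '\0' && commas < 2) {
        if (*p == '"') {
            p = copy_literal(p, NULL, 0, NULL);
            continue;
        }
        if (*p == '(') {
            depth++;
        } else if (*p == ')') {
            if (depth == 0)
                return p + 1;               /* call ended early */
            depth--;
        } else if (*p == ',' && depth == 0) {
            commas++;
        }
        p++;
    }

    /* Collect the format, joining adjacent literals ("a" "b"). */
    for (;;) {
        while (isspace((unsigned char)*p))
            p++;
        if (*p != '"')
            break;
        p = copy_literal(p, out, outlen, &used);
    }
    out[used] = '\0';
    return p;
}
```

Calling this in a loop over each source file, feeding the returned pointer back in as src until it returns NULL, would yield the raw format strings; each one can then be hashed with the same function the logger uses to produce the catalog entries.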

Think of this a la systemd/journald, without the message ID requirements of the same.

Shyam