[Gluster-devel] Logging framework in Gluter 4.0

Tue Nov 10 16:14:58 UTC 2015

I think most of my responses seem to indicate that we should stick to the current logging mechanism, so let me clarify :)

1) Message ID as one of the KV pairs of the message helps keep this as an invariant and also helps messages worded similarly to disambiguate themselves using the message ID

2) We do need transaction ID (or some such) as another KV pair, I agree to this

3) We need extensible KV pairs possibly to address such needs in the future

4) We need to separate message string with its other elements, like errno (which I believe we already have)

5) We need pluggable logging frameworks, so that we can use what suits best in an environment. For example some may want centralized logging, others may be content with how we log now.

Now, I believe, from a code development POV, we need to keep this simple, and have other complexity inside the logging framework. Adding more variables to gf_msg is one way forward, or providing a free string like sd_journal_send [1] is another. I would very much like to enforce some parameters, say in the form of, must send errno, message ID, etc. but can send additional fields like sd_journal_send, hence not go all the way like sd_journal_send, but add that capability to gf_msg (or the likes).

Overall, if we shift to adding mode KV pairs as "KEY=value" at the end of (say) gf_msg then we can change code within logging.c to split this up as fields.

Hope this explains my stance better.

Shyam

[1] http://www.freedesktop.org/software/systemd/man/sd_journal_send.html

----- Original Message -----
> From: "Shyam" <srangana at redhat.com>
> To: "Kaushal M" <kshlmster at gmail.com>, "Atin Mukherjee" <atin.mukherjee83 at gmail.com>
> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> Sent: Tuesday, November 10, 2015 9:56:32 AM
> Subject: Re: [Gluster-devel] Logging framework in Gluter 4.0
> 
> On 11/10/2015 01:36 AM, Kaushal M wrote:
> > On Tue, Nov 10, 2015 at 12:05 PM, Kaushal M <kshlmster at gmail.com> wrote:
> >> On Fri, Nov 6, 2015 at 10:45 PM, Atin Mukherjee
> >> <atin.mukherjee83 at gmail.com> wrote:
> >>> -Atin
> >>> Sent from one plus one
> >>>
> >>>
> >>> On Nov 6, 2015 7:50 PM, "Shyam" <srangana at redhat.com> wrote:
> >>>>
> >>>> On 11/06/2015 06:58 AM, Atin Mukherjee wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 11/06/2015 01:30 PM, Aravinda wrote:
> >>>>>>
> >>>>>>
> >>>>>> regards
> >>>>>> Aravinda
> >>>>>> http://aravindavk.in
> >>>>>>
> >>>>>> On 11/06/2015 12:28 PM, Avra Sengupta wrote:
> >>>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> As almost all the components targeted for Gluster 4.0 have moved from
> >>>>>>> design phase to implementation phase on some level or another, I feel
> >>>>>>> it's time to get some consensus on the logging framework we are going
> >>>>>>> to use. Are we going to stick with the message ID formatted logging
> >>>>>>> framework in use today, or shall we move on to a better solution.
> >>>>
> >>>>
> >>>> I would prefer to think that we extend the log messages with what we
> >>>> want.
> >>>> Message IDs are there for a purpose as outlined in earlier mails on the
> >>>> same
> >>>> topic, so we [w|s]hould stick with message IDs and add more is how I
> >>>> think
> >>>> of the problem.
> >>>>
> >>>> If we take inspiration from other frameworks, I would say we improve
> >>>> what
> >>>> we log.
> >>>>
> >>>> Where we log is made pluggable in the current framework, although as we
> >>>> add more logging frameworks (like rsyslog or lumberjack etc.) the
> >>>> abstraction provided for plugging in these could improve. This is a
> >>>> contained change in the logging.c file though.
> >>>>
> >>>>
> >>>>>>>
> >>>>>>> Some good to have features I would like to have in the logging
> >>>>>>> framework are:
> >>>>>>> 1. Specific distinction between logs of various transactions (or
> >>>>>>> operations), that would make it a lot easier to make sense than
> >>>>>>> looking at some serialized log dump
> >>>>>>
> >>>>>> Externally grouping Message IDs may help in this situation.
> >>>>>
> >>>>> Transaction Id is definitely a *must have* requirement when 4.0's theme
> >>>>> is scalability where we are talking about number of servers been
> >>>>> thousands in the configuration with out which analysis of an issue will
> >>>>> be a nightmare. GlusterD 2.0 is definitely going to have logs with txn
> >>>>> ids.
> >>>>
> >>>>
> >>>> +1
> >>>>
> >>>> I thought this was discussed in another thread on the lists, not able to
> >>>> find it, a link would help.
> >>>>
> >>>> The question is, should this be automated by the logging framework, or
> >>>> something each xlator should do?
> >>>>
> >>>> I prefer automating this, but how? (frame pointers? at least on the IO
> >>>> xlator stack) What is glusterd doing for the same?
> >>> As of now we have started using logrus pkg in golang which logs every
> >>> message with its txn id. How it does is what I need to find out looking
> >>> at
> >>> the pkg code (if you are looking to borrow the similar idea)
> >>>
> >>
> >> The logrus[1] package provides structured logging and ability to build
> >> incremental log contexts. This makes it possible to do things like
> >> message-ids. But IMO the advantage of using structured logging is that
> >> it would kind of remove the need for message-ids.
> >>
> >> In structured logging, it encouraged to keep the log string itself
> >> constant and keep the variable portions as metadata attached to the
> >> log, generally as key-value pairs. This allows us to implement things
> >> like message-ids, but it is not really required. As the log string is
> >> kept constant, it becomes easier to search for individual logs and
> >> document them. For example (not really the best example), currently we
> >> log socket failures in the following format
> >>
> >> "readv on some.socket failed (No data available)"
> >>
> >> This is hard to search for or parse, as the log string itself changes
> >> when the socket or error changes. If using structured logging, this
> >> log would be in the format
> >>
> >> "readv on socket failed socket=some.socket errno=ENODATA"
> >>
> >> This is much more easier to search and parse.
> >>
> >> Using log-contexts allows attaching a certain metadata to all logs in
> >> a context, for example, attaching transaction-ids to all logs of a
> >> transaction. If we were to use log-contexts above example we could
> >> attach the transaction-id to the log,
> >>
> >> "readv on socket failed socket=some.socket errno=ENODATA
> >> txn-id=<uuid-string-here>"
> >>
> >> allowing us to easily track during which transaction did the read fail.
> >>
> >
> > (Missed the link)
> > [1]: https://github.com/sirupsen/logrus
> 
> The intention of message IDs was to prevent any confusion arising due to
> message string changes. I.e when the message string changes between
> releases, mapping this back to the original trouble shot (metadata)
> provided requires manual intervention, that the message ID takes away.
> 
> Now, mandating the message string a constant is another way to generate
> the same. Actually initially the message ID for the catalog was to be
> generated using hashes of message strings, but we were not able to
> assure that the message strings would not change and found duplicate
> message strings (quite a few, I had metrics on this at some point but
> not now).
> 
> Also, in gluster there are strings with the same message, but at times
> providing a different context for the message, in these situations again
> the message ID helps.
> 
> IMHO the message ID is invariant, the message is not (as we better it
> from time to time) and hence a catalog of the same is very easy to
> maintain and update.
> 
> As a side note, the addition of errno to the log API was also to prevent
> the example of what is seen above (the socket error), i.e message string
> is constant, and errno just defines the actual error.
> 
> Further, I agree with the <key,value> approach, the intention of masking
> the API (in this case gf_msg) as a series of inargs and not key value
> pairs, was to internally generate any KV pairs as needed. For example
> when doing centralized logging, addition of host details and deamon
> details as KV pairs (say to lumberjack, which I think is JSON format
> log) is possible, and is handled by the framework and not by each
> individual call to the logging.
> 
> We can change the API to a KV pair, or make it take an additional KV
> pair list, which has additional contextual information (say transaction
> ID/#) which is transparent to the logging framework, as it does not
> bother with the same.
> 
> >
> >>>>
> >>>>
> >>>>>>
> >>>>>>> 2. We often encounter issues, and then find ourselves wishing the
> >>>>>>> process was running in debug mode so that we could have gotten some
> >>>>>>> debug logs. If there is any solution that enables me tracability of
> >>>>>>> the process's flow, without compromising the space constraints of the
> >>>>>>> logs that would be great to have.
> >>>>>>
> >>>>>> I think after each debugging session, we should revise the log
> >>>>>> messages
> >>>>>> to get following answers.
> >>>>>> 1. Is something missing in log, which could have reduced the time
> >>>>>> spent
> >>>>>> to root cause.
> >>>>>> 2. Logging is available but imcomplete details. (For example, on
> >>>>>> volume
> >>>>>> set, debug message will have key and value but no Volume name)
> >>>>>> 3. Is something redundant or not required. (For example,
> >>>>>> Geo-replication
> >>>>>> worker logs sync engine as rsync or tar mode. If we have multiple
> >>>>>> workers then all workers prints this log message. Instead if we move
> >>>>>> this log to monitor process instead of worker process we could save
> >>>>>> some
> >>>>>> space in log files)
> >>>>>>
> >>>>>> With this practice, we can improve our logs over time.
> >>>>>
> >>>>> I do agree here with Aravinda, we can do a far better job even at INFO
> >>>>> log level than what we've been doing now.
> >>>>>>>
> >>>>>>>
> >>>>>>> This is kind of an open floor question I am putting across to the
> >>>>>>> community, with no specific solution in mind at this point in time,
> >>>>>>> but a concern that we should rather decide on something sooner in the
> >>>>>>> development cycle than later.
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> Avra
> >>>>>>> _______________________________________________
> >>>>>>> Gluster-devel mailing list
> >>>>>>> Gluster-devel at gluster.org
> >>>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
> >>>>>>
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> Gluster-devel mailing list
> >>>>>> Gluster-devel at gluster.org
> >>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
> >>>>>
> >>>>> _______________________________________________
> >>>>> Gluster-devel mailing list
> >>>>> Gluster-devel at gluster.org
> >>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
> >>>>>
> >>>> _______________________________________________
> >>>> Gluster-devel mailing list
> >>>> Gluster-devel at gluster.org
> >>>> http://www.gluster.org/mailman/listinfo/gluster-devel
> >>>
> >>> -Atin
> >>> Sent from one plus one
> >>>
> >>>
> >>> _______________________________________________
> >>> Gluster-devel mailing list
> >>> Gluster-devel at gluster.org
> >>> http://www.gluster.org/mailman/listinfo/gluster-devel
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>