[Gluster-devel] Logging framework in Gluter 4.0

Tue Nov 10 14:56:32 UTC 2015

On 11/10/2015 01:36 AM, Kaushal M wrote:
> On Tue, Nov 10, 2015 at 12:05 PM, Kaushal M <kshlmster at gmail.com> wrote:
>> On Fri, Nov 6, 2015 at 10:45 PM, Atin Mukherjee
>> <atin.mukherjee83 at gmail.com> wrote:
>>> -Atin
>>> Sent from one plus one
>>>
>>>
>>> On Nov 6, 2015 7:50 PM, "Shyam" <srangana at redhat.com> wrote:
>>>>
>>>> On 11/06/2015 06:58 AM, Atin Mukherjee wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 11/06/2015 01:30 PM, Aravinda wrote:
>>>>>>
>>>>>>
>>>>>> regards
>>>>>> Aravinda
>>>>>> http://aravindavk.in
>>>>>>
>>>>>> On 11/06/2015 12:28 PM, Avra Sengupta wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> As almost all the components targeted for Gluster 4.0 have moved from
>>>>>>> design phase to implementation phase on some level or another, I feel
>>>>>>> it's time to get some consensus on the logging framework we are going
>>>>>>> to use. Are we going to stick with the message ID formatted logging
>>>>>>> framework in use today, or shall we move on to a better solution.
>>>>
>>>>
>>>> I would prefer to think that we extend the log messages with what we want.
>>>> Message IDs are there for a purpose as outlined in earlier mails on the same
>>>> topic, so we [w|s]hould stick with message IDs and add more is how I think
>>>> of the problem.
>>>>
>>>> If we take inspiration from other frameworks, I would say we improve what
>>>> we log.
>>>>
>>>> Where we log is made pluggable in the current framework, although as we
>>>> add more logging frameworks (like rsyslog or lumberjack etc.) the
>>>> abstraction provided for plugging in these could improve. This is a
>>>> contained change in the logging.c file though.
>>>>
>>>>
>>>>>>>
>>>>>>> Some good to have features I would like to have in the logging
>>>>>>> framework are:
>>>>>>> 1. Specific distinction between logs of various transactions (or
>>>>>>> operations), that would make it a lot easier to make sense than
>>>>>>> looking at some serialized log dump
>>>>>>
>>>>>> Externally grouping Message IDs may help in this situation.
>>>>>
>>>>> Transaction Id is definitely a *must have* requirement when 4.0's theme
>>>>> is scalability where we are talking about number of servers been
>>>>> thousands in the configuration with out which analysis of an issue will
>>>>> be a nightmare. GlusterD 2.0 is definitely going to have logs with txn
>>>>> ids.
>>>>
>>>>
>>>> +1
>>>>
>>>> I thought this was discussed in another thread on the lists, not able to
>>>> find it, a link would help.
>>>>
>>>> The question is, should this be automated by the logging framework, or
>>>> something each xlator should do?
>>>>
>>>> I prefer automating this, but how? (frame pointers? at least on the IO
>>>> xlator stack) What is glusterd doing for the same?
>>> As of now we have started using logrus pkg in golang which logs every
>>> message with its txn id. How it does is what I need to find out looking at
>>> the pkg code (if you are looking to borrow the similar idea)
>>>
>>
>> The logrus[1] package provides structured logging and ability to build
>> incremental log contexts. This makes it possible to do things like
>> message-ids. But IMO the advantage of using structured logging is that
>> it would kind of remove the need for message-ids.
>>
>> In structured logging, it encouraged to keep the log string itself
>> constant and keep the variable portions as metadata attached to the
>> log, generally as key-value pairs. This allows us to implement things
>> like message-ids, but it is not really required. As the log string is
>> kept constant, it becomes easier to search for individual logs and
>> document them. For example (not really the best example), currently we
>> log socket failures in the following format
>>
>> "readv on some.socket failed (No data available)"
>>
>> This is hard to search for or parse, as the log string itself changes
>> when the socket or error changes. If using structured logging, this
>> log would be in the format
>>
>> "readv on socket failed socket=some.socket errno=ENODATA"
>>
>> This is much more easier to search and parse.
>>
>> Using log-contexts allows attaching a certain metadata to all logs in
>> a context, for example, attaching transaction-ids to all logs of a
>> transaction. If we were to use log-contexts above example we could
>> attach the transaction-id to the log,
>>
>> "readv on socket failed socket=some.socket errno=ENODATA
>> txn-id=<uuid-string-here>"
>>
>> allowing us to easily track during which transaction did the read fail.
>>
>
> (Missed the link)
> [1]: https://github.com/sirupsen/logrus

The intention of message IDs was to prevent any confusion arising due to 
message string changes. I.e when the message string changes between 
releases, mapping this back to the original trouble shot (metadata) 
provided requires manual intervention, that the message ID takes away.

Now, mandating the message string a constant is another way to generate 
the same. Actually initially the message ID for the catalog was to be 
generated using hashes of message strings, but we were not able to 
assure that the message strings would not change and found duplicate 
message strings (quite a few, I had metrics on this at some point but 
not now).

Also, in gluster there are strings with the same message, but at times 
providing a different context for the message, in these situations again 
the message ID helps.

IMHO the message ID is invariant, the message is not (as we better it 
from time to time) and hence a catalog of the same is very easy to 
maintain and update.

As a side note, the addition of errno to the log API was also to prevent 
the example of what is seen above (the socket error), i.e message string 
is constant, and errno just defines the actual error.

Further, I agree with the <key,value> approach, the intention of masking 
the API (in this case gf_msg) as a series of inargs and not key value 
pairs, was to internally generate any KV pairs as needed. For example 
when doing centralized logging, addition of host details and deamon 
details as KV pairs (say to lumberjack, which I think is JSON format 
log) is possible, and is handled by the framework and not by each 
individual call to the logging.

We can change the API to a KV pair, or make it take an additional KV 
pair list, which has additional contextual information (say transaction 
ID/#) which is transparent to the logging framework, as it does not 
bother with the same.

>
>>>>
>>>>
>>>>>>
>>>>>>> 2. We often encounter issues, and then find ourselves wishing the
>>>>>>> process was running in debug mode so that we could have gotten some
>>>>>>> debug logs. If there is any solution that enables me tracability of
>>>>>>> the process's flow, without compromising the space constraints of the
>>>>>>> logs that would be great to have.
>>>>>>
>>>>>> I think after each debugging session, we should revise the log messages
>>>>>> to get following answers.
>>>>>> 1. Is something missing in log, which could have reduced the time spent
>>>>>> to root cause.
>>>>>> 2. Logging is available but imcomplete details. (For example, on volume
>>>>>> set, debug message will have key and value but no Volume name)
>>>>>> 3. Is something redundant or not required. (For example, Geo-replication
>>>>>> worker logs sync engine as rsync or tar mode. If we have multiple
>>>>>> workers then all workers prints this log message. Instead if we move
>>>>>> this log to monitor process instead of worker process we could save some
>>>>>> space in log files)
>>>>>>
>>>>>> With this practice, we can improve our logs over time.
>>>>>
>>>>> I do agree here with Aravinda, we can do a far better job even at INFO
>>>>> log level than what we've been doing now.
>>>>>>>
>>>>>>>
>>>>>>> This is kind of an open floor question I am putting across to the
>>>>>>> community, with no specific solution in mind at this point in time,
>>>>>>> but a concern that we should rather decide on something sooner in the
>>>>>>> development cycle than later.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Avra
>>>>>>> _______________________________________________
>>>>>>> Gluster-devel mailing list
>>>>>>> Gluster-devel at gluster.org
>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Gluster-devel mailing list
>>>>>> Gluster-devel at gluster.org
>>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-devel mailing list
>>>>> Gluster-devel at gluster.org
>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>>>
>>>> _______________________________________________
>>>> Gluster-devel mailing list
>>>> Gluster-devel at gluster.org
>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>
>>> -Atin
>>> Sent from one plus one
>>>
>>>
>>> _______________________________________________
>>> Gluster-devel mailing list
>>> Gluster-devel at gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-devel