[Gluster-users] [Gluster-devel] GlusterFS and the logging framework
Sahina Bose
sabose at redhat.com
Tue May 13 11:30:39 UTC 2014
On 05/02/2014 09:48 AM, Nithya Balachandran wrote:
> Thanks Joe. Assuming you like approach#2, please let us know of anything else that you would find helpful in the gluster logs.
I may be jumping in a little late here...but is it possible for the
messages to follow a dictionary format.
So for example "Usage is above soft limit: 30.0GB used by /test/ ".
If the message was in a format:{"msgid":xxx,"msg": "Usage is above soft
limit: 300.0KB used by /test/ ","volume": "test-vol", "dir":"/test"}
This helps the applications that parse the logs to identify affected
enitities. Otherwise we need to resort to pattern matching which is
kinda flaky. (currently we monitor logs for a nagios monitoring plugin).
Is this doable?
>
> Thanks,
> Nithya
>
> ----- Original Message -----
> From: "Joe Julian" <joe at julianfamily.org>
> To: "Nithya Balachandran" <nbalacha at redhat.com>, gluster-devel at gluster.org
> Cc: "gluster-users" <gluster-users at gluster.org>
> Sent: Wednesday, 30 April, 2014 7:04:14 PM
> Subject: Re: [Gluster-devel] GlusterFS and the logging framework
>
> From my perspective as someone who supports GlusterFS on IRC and frequently has to read logs in order to determine what went wrong, I like this proposal.
>
>
>
> On April 30, 2014 12:06:26 AM PDT, Nithya Balachandran <nbalacha at redhat.com> wrote:
>> Hi,
>>
>> I have attached some DHT files to demonstrate the 2 logging approaches.
>> (*_1 is the original approach, *_2 is the proposed approach).I
>> personally think the 2 approach leads to better code readability and
>> propose that we follow approach 2. Please let me know of any concerns
>> with this.
>>
>>
>> To consolidate all the points raised in the earlier discussions:
>>
>>
>> What are we trying to solve?
>> Improving gluster logs to make end user debugging easier by providing a
>> sufficient information and a consistent logging mechanism and message
>> format .
>>
>> The new logging framework already logs the function name and line,
>> msgid and strerror, which improves the log messages and debug-ability.
>> However, there are some potential issues with the way it is getting
>> used. Please note - there are no changes being proposed to the
>> underlying logging framework.
>>
>>
>> Current approach (approach 1):
>>
>> Define message_ids for each log message (except Trace and Debug) and
>> associate both id and string with a msg_id macro
>> Replace all calls to gf_log with gf_msg passing in the message_id for
>> the message. This message_id will be printed as part of the log
>> message.
>> Document each log string with details of what caused it/how to fix it.
>>
>>
>>
>> Issues:
>> 1. Code readability - It becomes difficult to figure out what the
>> following is actually printing and can cause issues with incorrect
>> params being passed or params being passed in the wrong order:
>> gf_msg ("dht", GF_LOG_ERROR, 0, dht_msg_23, param1, param2, param3);
>>
>> 2.Code Redundancy -multiple messages for the same thing differing in
>> small details can potentially use up a large chunk of allocated ids as
>> well as making it difficult for end users - they will need to search
>> for multiple string formats/msgids as they could all refer to more or
>> less the same thing. For example:
>>
>> dht_msg_1 123, "Failed to get cached subvol for %s"
>> dht_msg_2 124, "Failed to get cached subvol for %s on %s"
>>
>>
>>
>> 3. Documentation redundancy -
>>
>> The proposed format for documenting these messages is as follows:
>>
>> Msg ID
>> Message format string
>> Cause
>> Recommended action
>>
>> This could potentially lead to documentation like:
>>
>> Msg ID : 123
>> Message format string : Failed to get cached subvol for <path>
>> Cause : The subvolume might not be reachable etc etc
>> Recommended action : Check network connection etc etc
>>
>> Msg ID : 124
>> Message format string : Failed to get cached subvol for <path> on
>> <path2>
>> Cause : The subvolume might not be reachable etc etc
>> Recommended action : Check network connection etc etc
>>
>> The end user now has to search for multiple msgids and string formats
>> to find all instances of this error.
>>
>> NOTE: It may be possible to consolidate all these strings into a single
>> one, say, "Failed to get cached subvol for %s on %s" and mandate that
>> it be used in all calls which are currently using variations of the
>> string. However, this might not be possible in all scenarios - some
>> params might not be available or might not be meaningful in a
>> particular case or a developer might want to provide additional info in
>> a particular scenario.
>>
>>
>>
>> Proposed approach (approach 2):
>> Define meaningful macros for message_ids for a class of message (except
>> Trace and Debug) without associating them to a message string. For
>> example
>> #define DHT_CACHED_SUBVOL_GET_FAILED 123
>> #define DHT_MEM_ALLOC_FAILED 124
>>
>>
>> Replace all calls to gf_log with gf_msg but pass in the msg id and
>> string separately. The string is defined by the developer based on an
>> agreed upon format.
>>
>> Define a log message format policy that all developers need to follow.
>> This will need to be enforced by reviews. For example, we could mandate
>> that all log messages must start with the name of the file on which the
>> operation is performed and end with the strerror if it exists.This can
>> also include rules as to sentence structure and wording - whether to
>> use "failed", "unable to", "could not" etc.
>>
>> Consolidate existing messages and reword them if necessary to make them
>> more meaningful. If a single message will work in multiple instances,
>> use that one everywhere.
>>
>> Add your documentation writer as a reviewer for all patches. S/he will
>> be responsible for ensuring that all newly introduced log messages are
>> meaningful, consistent and follow the agreed upon format.
>>
>> Devs will define new message classes ids as and when required.
>>
>> Ideally, common message classes like dict-set-failed or
>> memory-alloc-failed should be defined in a common file and included by
>> others - no point having each component define a memory_alloc_failed
>> id.
>>
>> With the proposed approach:
>>
>> #define DHT_CACHED_SUBVOL_GET_FAILED 123
>> #define DHT_HASHED_SUBVOL_GET_FAILED 124
>>
>> Calls would then look like:
>>
>> gf_msg ("dht", GF_LOG_ERROR, DHT_CACHED_SUBVOL_GET_FAILED, "Failed to
>> get cached subvolume for path %s", param1);
>> ...
>> gf_msg ("dht", GF_LOG_ERROR, DHT_CACHED_SUBVOL_GET_FAILED, "Failed to
>> get cached subvolume for path %s on %s", param1, param2);
>>
>>
>> Documentation would be as follows:
>>
>> Msg ID : 123
>> Description : Failed to get the cached subvolume for the specified path
>> Cause : The subvolume might not be reachable etc etc
>> Recommended action : Check network connection etc etc
>>
>> Admins could just search for [MSGID 123] and find all instances of
>> where an operation failed to get a cached sub volume.
>>
>> Issues raised with proposed approach:
>>
>> 1. Internationalization: Having the strings in a single file is
>> required to make L10N easy.
>> While i18n is very important for user tools, utils etc, log messages
>> are usually targeted at developers and sys admins who usually know
>> English. Plus it seems unlikely that log messages will be localized in
>> the near future. However, the document describing the msgid can be
>> localized so the msg id mapping information can still be used.
>>
>> 2. Having the strings in a header file makes it easier to change the
>> format later.
>> This is a valid point. However, IMHO, the code readability is more
>> important especially in the case where we pass arguments to the format
>> string.
>>
>> 3. Having a string defined in a single header file can make it easier
>> for a dev to reuse it if necessary
>> I would suggest searching on the message id instead and copy the string
> >from elsewhere if it suits his purpose as those will already have been
>> reviewed by the doc writer.
>>
>>
>> Regards,
>> Nithya
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel at gluster.org
>> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
More information about the Gluster-users
mailing list