[Gluster-devel] Logging in a multi-brick daemon

Shyam srangana at redhat.com
Thu Feb 16 14:36:13 UTC 2017


On 02/16/2017 05:27 AM, Rajesh Joseph wrote:
> On Thu, Feb 16, 2017 at 9:46 AM, Ravishankar N <ravishankar at redhat.com> wrote:
>> On 02/16/2017 04:09 AM, Jeff Darcy wrote:
>>>
>>> One of the issues that has come up with multiplexing is that all of the
>>> bricks in a process end up sharing a single log file.  The reaction from
>>> both of the people who have mentioned this is that we should find a way to
>>> give each brick its own log even when they're in the same process, and make
>>> sure gf_log etc. are able to direct messages to the correct one.  I can
>>> think of ways to do this, but it doesn't seem optimal to me.  It will
>>> certainly use up a lot of file descriptors.  I think it will use more
>>> memory.  And then there's the issue of whether this would really be better
>>> for debugging.  Often it's necessary to look at multiple brick logs while
>>> trying to diagnose a problem, so it's actually kind of handy to have them
>>> all in one file.  Which would you rather do?
>>>
>>> (a) Weave together entries in multiple logs, either via a script or in
>>> your head?
>>>
>>> (b) Split or filter entries in a single log, according to which brick
>>> they're from?
>>>
>>> To me, (b) seems like a much more tractable problem.  I'd say that what we
>>> need is not multiple logs, but *marking of entries* so that everything
>>> pertaining to one brick can easily be found.  One way to do this would be to
>>> modify volgen so that a brick ID (not name because that's a path and hence
>>> too long) is appended/prepended to the name of every translator in the
>>> brick.  Grep for that brick ID, and voila!  You now have all log messages
>>> for that brick and no other.  A variant of this would be to leave the names
>>> alone and modify gf_log so that it adds the brick ID automagically (based on
>>> a thread-local variable similar to THIS).  Same effect, other than making
>>> translator names longer, so I'd kind of prefer this approach.  Before I
>>> start writing the code, does anybody else have any opinions, preferences, or
>>> alternatives I haven't mentioned yet?
>>>
>>
>> My vote is for having separate log files per brick. Even with the separate
>> log files we have today, I find it difficult to mentally ignore irrelevant
>> messages in a single log file as I sift through it looking for errors
>> related to the problem at hand. Having entries from multiple bricks in one
>> file and then grepping it would only make things harder. I cannot think of
>> a case where having entries from all bricks in one file would be
>> particularly beneficial for debugging, since what happens in one brick is
>> independent of the other bricks (at least until we move client xlators to
>> the server side and run them in the brick process).
>> As for file descriptor count/memory usage, I think we should be okay, as
>> it is no worse than the non-multiplexed approach we have today.
>>
>> On a side note, I think the problem is not having too many log files but
>> having them spread across multiple nodes. A log-aggregation solution where
>> all messages are logged to a single machine (but still in separate files)
>> would make it easier to monitor and debug issues.
>> -Ravi
>>
>
> I believe the logs are not just from one volume but from all of them. In
> that case merging them into a single log file may not be great for
> debugging, especially in container use cases where there can be multiple
> volumes. Yes, with some tagging and scripting we can separate the logs and
> still live with it.

In the container world, *I believe* centralized logging (using something 
like an ELK/EFK stack) would be the way to go, rather than collecting logs 
from each gluster (or application mount) container/node. In these 
situations we are going to get logs from different volumes anyway, or at 
best a filtered list from whichever stack is used for centralized logging.

So I would think that, as described, we need enough identifiers in each 
log message that we can filter appropriately, and that should take care of 
the debugging concern.
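
To make that concrete, here is a minimal, self-contained sketch of the 
kind of tagging I have in mind (plain C for illustration only; brick_id_tls, 
set_brick_id and brick_log are made-up names, not the actual THIS/gf_log 
code, and __thread is the GCC-style thread-local):

#include <stdarg.h>
#include <stdio.h>

/* Hypothetical per-thread brick identifier, in the spirit of THIS. */
static __thread const char *brick_id_tls = "-";

static void
set_brick_id (const char *id)
{
        brick_id_tls = id;
}

/* Hypothetical logging wrapper: every message carries the brick ID,
 * so one combined log can later be filtered per brick. */
static void
brick_log (const char *domain, const char *fmt, ...)
{
        va_list ap;

        fprintf (stderr, "[%s] [%s] ", brick_id_tls, domain);
        va_start (ap, fmt);
        vfprintf (stderr, fmt, ap);
        va_end (ap);
        fputc ('\n', stderr);
}

int
main (void)
{
        /* Would be set when a request enters a given brick's graph. */
        set_brick_id ("testvol-brick-2");
        brick_log ("posix", "open failed on %s", "/some/path");
        return 0;
}

The point is that existing log call sites would not need to change; the 
identifier gets picked up from the thread-local context, which is roughly 
the variant Jeff says he prefers above.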

Of course, building these filter scripts out from the beginning, and 
possibly even shipping them with our RPMs, would help a great deal, rather 
than having to roll one out when we get into troubleshooting or debugging 
a setup.
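
As a sketch of the kind of helper we could ship (the "[brick-id]" marker 
format is just the assumption carried over from the sketch above, not an 
agreed format):

/* Read a combined brick log on stdin and keep only the lines tagged
 * with the given brick ID. */
#include <stdio.h>
#include <string.h>

int
main (int argc, char *argv[])
{
        char line[8192];
        char tag[256];

        if (argc != 2) {
                fprintf (stderr, "usage: %s <brick-id>\n", argv[0]);
                return 1;
        }
        snprintf (tag, sizeof (tag), "[%s]", argv[1]);

        while (fgets (line, sizeof (line), stdin))
                if (strstr (line, tag))
                        fputs (line, stdout);

        return 0;
}

A grep would obviously do the same job; the point is only that once every 
message carries the marker, per-brick filtering becomes trivial wherever 
the logs end up.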

>
> What about the log levels? Each volume can configure different log levels.
> Will you carve out a separate process in case log levels are changed for a
> volume? How is this handled here?
>
> -Rajesh
>

