[Gluster-devel] Logging in a multi-brick daemon
jdarcy at redhat.com
Thu Feb 16 13:21:42 UTC 2017
> Debugging will involve getting far more/bigger files from customers
> unless we have a script (?) to grep out only those messages pertaining
> to the volume in question. IIUC, this would just be grepping for the
> volname and then determining which brick each message pertains to
> based on the brick id, correct?
Correct. There would also be some possibly-interesting messages that
aren't specifically tied to any one brick, e.g. in protocol/server or
various parts of libglusterfs, so we'd probably always want those no
matter what brick(s) we're interested in.
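The filtering described above could be sketched roughly as follows. This is a minimal illustration, assuming a hypothetical log format in which brick-scoped lines carry a tag like "[volname-client-N]" and global lines (protocol/server, libglusterfs) carry no tag; it is not Gluster's actual message format.

```python
import re

# Assumed log format: brick-scoped lines are tagged "[volname-client-N]";
# untagged lines are global and should always be kept.
BRICK_TAG = re.compile(r"\[([\w-]+)-client-(\d+)\]")

def filter_log(lines, volname):
    """Keep lines for the given volume, plus untagged global lines."""
    kept = []
    for line in lines:
        m = BRICK_TAG.search(line)
        if m is None or m.group(1) == volname:
            kept.append(line)
    return kept

log = [
    "[myvol-client-0] posix: setattr failed",
    "[othervol-client-1] posix: lookup failed",
    "server: client connected",  # global message, no brick tag
]
print(filter_log(log, "myvol"))
```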
> Would brick ids remain constant across add/remove brick operations? An
> easy way would probably be just to use the client xlator number as the
> brick id which would make it easy to map the brick to client
Brick IDs should be constant across add/remove operations, which I
suppose means they'll need to be more globally unique than they are now
(client translator indices can clash across volumes).
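One obvious way to get global uniqueness is to qualify the client-translator index with the volume name. The "volname-client-N" format below is only an assumption for illustration, not the scheme glusterd actually uses.

```python
# Sketch: make brick IDs globally unique by combining the volume name
# with the client xlator index. Format is hypothetical.
def brick_id(volname, client_index):
    """Combine volume name and client xlator index into one ID."""
    return f"{volname}-client-{client_index}"

# Bare indices clash across volumes; qualified IDs do not.
ids = {brick_id(vol, 0) for vol in ("vol1", "vol2")}
print(sorted(ids))
```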
> With several brick processes all writing to the same log file, can
> there be problems with interleaving messages?
AFAICT the "atom" of logging is a line, so there shouldn't be problems
of interleaving within a line ("foo" + "bar" won't become "fboaro").
However, when code tries to log multiple lines together - e.g. DHT
layouts or AFR self-heal info - that could end up being interleaved with
another brick doing the same. They'd still be distinct according to
brick ID, but when looking at an unfiltered log it could look a bit
confusing.
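The interleaving behavior can be demonstrated with a toy example. The "[brick-N]" tag format below is an assumption for illustration: lines from two bricks interleave in the shared file, but because each line carries its brick ID, filtering recovers each multi-line block intact.

```python
# Two bricks each emit a multi-line block (e.g. a DHT layout dump).
brick0 = ["[brick-0] layout line 1", "[brick-0] layout line 2"]
brick1 = ["[brick-1] layout line 1", "[brick-1] layout line 2"]

# Line-level atomicity holds, but whole blocks interleave in the
# shared log file.
log = [brick0[0], brick1[0], brick0[1], brick1[1]]

def for_brick(log, tag):
    """Return only the lines tagged with the given brick ID."""
    return [line for line in log if line.startswith(tag)]

print(for_brick(log, "[brick-0]"))
```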
> Logrotate might kick in faster as well causing us to lose debugging
> data if only a limited number of files are saved, as all those files
> would now hold less log data per volume. The logrotate config options
> would need to be changed to keep more files.
> Having all messages for the bricks of the same volume in a single file
> would definitely be helpful. Still thinking through logging all
> messages for all bricks in a single file. :)
Something to keep in mind is that multiplexing everything into a single
process is only temporary. Very soon, we'll be multiplexing into
multiple processes, with the number of processes proportional to the
number of cores in the system. So, for a node with 200 bricks and 24
cores, we might have 24 processes each containing ~8 bricks. In that
case, it would make sense to keep bricks for the same volume separate as
much as possible. A process is a failure domain, and having multiple
related bricks in the same failure domain is undesirable (though
unavoidable in some cases). The consequence for logging is that you'd
still have to look at multiple files.
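The placement idea above can be sketched as follows. Per-volume round-robin is an assumed strategy here, not the actual glusterd algorithm: it spreads bricks across a fixed number of processes while keeping bricks of the same volume in different processes whenever the process count allows.

```python
from collections import defaultdict

def place_bricks(bricks, nprocs):
    """bricks: list of (volname, brickname); returns proc -> bricks.

    Round-robin per volume, so same-volume bricks land in distinct
    processes as long as a volume has at most nprocs bricks.
    """
    procs = defaultdict(list)
    cursor = defaultdict(int)  # per-volume round-robin position
    for vol, brick in bricks:
        p = cursor[vol] % nprocs
        procs[p].append((vol, brick))
        cursor[vol] += 1
    return procs

# Example: 3 volumes of 8 bricks each on a 24-core node.
bricks = [("vol%d" % v, "brick%d" % b) for v in range(3) for b in range(8)]
layout = place_bricks(bricks, 24)
```

With these numbers, no process ever holds two bricks of the same volume, so one process failure takes down at most one brick per volume.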