-
Notifications
You must be signed in to change notification settings - Fork 574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support canonical log lines & log level per request #583
Comments
I wonder of that could be implemented at a writer level where you create one logger per request with a pre-allocated (pooled) buffer writer. At the end of the request, if the request was an error, you flush the buffer (which writes to the destination writer), if not, you reset it then put it back to its pool. We could provide such implementation or just reference another package that implements this logic. |
I think the per-request logger/writer approach can work but the tricky thing is the logic for "At the end of the request, if the request was an error" - if you do not have to inspect what has been logged during the request to make this determination then this works. But if you do have to inspect (e.g., you want to flush the buffer if there was any log message at error level or above) what has been logged, then you have to parse what has been written. |
Make it implement |
You are right. Also I realized that I can easily convert log lines written to the writer into an JSON array using Thanks! I can implement this. Would you consider having a writer (or two) to support this added to this package? I can make a PR. But given the amount of open issues and PRs maybe it would be too much work for you reviewing it? (BTW, I could help do one pass over issues and PRs if you want. I do not have time to help long-term, but I went through many issues recently trying to better understand this package and found some which could be closed or commented upon. Similarly, I could leave some comments on few PRs. I would not merge anything, I might suggest to you do so though.) |
Why is it needed to put those messages in an array? Wouldn't a request id be sufficient to group them? |
I mean, the whole idea of canonical log lines is that you have one log line per HTTP request handling. So any debug logging during that handling should be nested inside that line. The idea is that you pre-group everything into one log message. It is kinda like optimization for the common case. My concrete proposal to implement for the zerolog package itself would be to implement the following writer: type BufferedLine struct {
Level Level
Line []byte
}
type BufferedLevelWriter struct {
lines []BufferedLine
levelCounts map[Level]int
}
// FlushTo Writes all buffered lines to writer and clears the buffer.
func (w *BufferedLevelWriter) FlushTo(writer LevelWriter) (int, error)
// LevelCount returns the count of lines at level in the buffer.
func (w *BufferedLevelWriter) LevelCount(level Level) int
// FlushLines returns all buffered lines and clears the buffer.
func (w *BufferedLevelWriter) FlushLines() []BufferedLine I think with such a writer one can then easily implement everything described above. Easily check for if any error logging has been made using |
The concept of a canonical log if I understand it is that you get all the information of a request in a single line so it has all the info needed for a request. The argument is that it's easier to query and aggregate. Merging multiple log lines into arrays and log those arrays of structured log lines is in my opinion going the opposite direction as arrays are way harder to query and aggregate in most logging systems I know. A canonical log would be way easier to implement by the caller. TBH, without knowing about the concept of canonical log lines, that is how I always logged requests: putting all info in a single line. I don't think this should be a logging library feature, but how you are using it. The log level per request idea though is interesting. Emitting some debug log in the request that are only written to the logger target if the canonical log line (or any log line really) emitted for the request is an error is an interesting concept. Here is an example of what I have in mind: type TriggerLevelWriter struct {
// Destination writer
io.Writer
// ConditionalLevel is the level at which lines are buffered until a trigger
// level line is emitted. Usually this is set to DebugLevel.
ConditionalLevel Level
// TriggerLevel is the level that triggers the sending of the conditional
// level lines. Usually this is set to ErrorLevel.
TriggerLevel Level
// TODO: handle thread safety
buf bytes.Buffer
triggered bool
}
func (w *TriggerLevelWriter) WriteLevel(l Level, p []byte) (n int, err error) {
// Unless triggered, we buffer everything below ConditionalLevel.
if l <= w.ConditionalLevel && !w.triggered {
w.buf.Write(p)
return len(p), nil
}
// At first trigger level log line, we flush the buffer and change the
// trigger state.
if l == w.TriggerLevel && !w.triggered {
_, err = w.Write(w.buf.Bytes())
w.buf.Reset()
w.triggered = true
if err != nil {
return 0, err
}
}
// Anything above ConditionalLevel is always passed through.
// Once triggered, everything is passed through.
return w.Write(p)
} |
I reread again everything I could find about canonical log lines and I think you are right. They do not mention that sub-loggers should nested inside the canonical log line. So I think this means that the pattern to do here is to add to request context:
Now the question is only which messages from that sub-logger we really store. The idea behind the "log level per request" is that this depends on how the request was handled/were there any errors in handling it or any errors logged. I think this can be done in the same middleware spot after request handling which writes out the canonical log line that it also decides to write out buffered log lines or discard them. Your
|
We could by good measure, but
It should stay a
Agreed |
OK. I can make a stab at this and make a PR. |
Note there is an error in my code, when flushing the buffer, we need to call one |
Yea, there are more errors. :-) Like returning 0 for written bytes when there is an error in the trigger case. In general I think we will have to store lines and levels, line by line, so then it is easy to loop over. |
I think it's better to store them in a (pooled) buffer so we can keep the zero allocation promise and minimal code complexity. In the case the writer is trigger (which should be the exception), we can easily read lines back from the buffer. The level can be stored as the first byte of each line and trimmed on write. |
So I am not so familiar with preventing allocations in Go, so please correct me, but my understanding is that the main allocation here is the line itself. So when So my proposal would be to simply have |
I realized this makes no sense. |
I just made PR with what we discussed above: #602 |
Canonical log lines
There are generally two ways to organize log lines which are all related to a particular request:
In a way the this is standard question of normalized vs. denormalized data.
There are some existing issues I could find which seems to talk about support for canonical log lines in zerolog: #362 and #408, but I think it is worth investigating if zerolog should&could support canonical log lines out of the box.
The support for request ID is easy with sub-loggers. And I think it could be the same for canonical log lines. In fact, it could also be organized as sub-loggers, maybe only with some additional
.Nested()
or some similar call, which would turn on canonical log lines feature. Then, when a sub-logger would be called to send the event, instead of writing it out to the writer, the log line would accumulate inside parent's logger array field associated with the level. Only when the parent logger then issues the event, the log line would be printed out. So the idea here is:.Nested()
or something)..Send()
or something, maybe even adding more metadata (like amount of data transferred inside the request handling, etc.).The structure could be something like:
And so on for different levels.
Log level per request
An interesting thing with canonical log lines is also how level filtering works. I propose that the canonical log line gets a level of the most severe event level logged into it. For example, if both error and warn levels were logged for the same request, then the canonical log level gets the error level.
With a canonical log lines, the sub-logger could log at a very detailed logging level (e.g., debug) but the whole line could be issues only if there was an error logged as well by setting minimal log level to error. This would allow developers to have debugging information together with all errors, but only with errors.
Such type of logging is often called "log level per request". To my understanding zerolog does not currently support that either. This could be supported for both canonical log lines and for regular log lines. This is where we can see that this is not just a question of de/normalization of log lines, but about processing of log events in batches.
Batching log events
Both canonical log lines and log level per request show a need for some "batching of log events" feature in zerolog. Where log events would be batched together until some later event decides to write them out or not (potentially modifying them). In the case of canonical log lines, batching would batch all log events for a sub-logger and then write them out as one log event. In the case of log level per request, after batch finishes, logic would decide to write it out or not.
(I am not sure if "batching" is the right term here. Maybe "batch processing" or something?)
Alternatives
Alternatives to both are possible by post-processing logs:
The text was updated successfully, but these errors were encountered: