20231018-access-logs: design doc #11
base: main
#### Scraping and annotating access logs

Three services generate access logs: Linksharing, Gateway-MT, and the Satellite API Pods. Each of these services has multiple instances. A logging agent will run next to each instance and scrape the generated access logs. The logging agent may filter the log entries or re-arrange their log format, but its main task is to annotate the log entries with the Tenant ID and push them to the Grafana Loki destination.
Will we end up with "duplicate" logs for linksharing and gateway requests? In other words, won't there be a satellite log for every linksharing/gateway request -- or is it somehow combined based on request ID?
There will be a different stream (or "label" in Grafana terms) of logs for Linksharing, Gateway-MT, and Satellite API. There should be a correlating log entry in the satellite stream with a matching request ID for each request in the edge services. However, this may not be the case in the future if we introduce edge caching.
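
As a rough illustration of the annotation step and the per-service streams discussed above, the sketch below builds a request in the shape accepted by Loki's HTTP push endpoint (`/loki/api/v1/push`). It assumes the Tenant ID maps to Loki's `X-Scope-OrgID` multi-tenancy header; the function name, the `service` label, and the sample values are hypothetical and not part of the design. Nothing is sent over the network.

```python
import json
import time


def build_push_request(tenant_id, service, lines, now_ns=None):
    """Return (headers, body) for a Loki push request (hypothetical helper).

    `service` becomes a stream label, so Linksharing, Gateway-MT, and the
    Satellite API would end up in separate streams.
    """
    if now_ns is None:
        now_ns = time.time_ns()
    headers = {
        "Content-Type": "application/json",
        # Assumption: the Tenant ID is carried in Loki's tenant header.
        "X-Scope-OrgID": tenant_id,
    }
    body = {
        "streams": [
            {
                "stream": {"service": service},
                # Each value is [timestamp in nanoseconds as a string, log line].
                "values": [[str(now_ns), line] for line in lines],
            }
        ]
    }
    return headers, json.dumps(body)
```

A logging agent could call this once per scraped batch, for example `build_push_request("tenant-a", "linksharing", batch)`, and POST the result to the write-only Loki server.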
Storj will operate the Loki server in write-only mode (with the `-target=write,compactor` flag), i.e. it won't allow querying the logs. We chose the write-only mode for easier operation.
Customers will query their access logs not through the Storj Loki server but with Loki client tooling ([LogCLI](https://grafana.com/docs/grafana-cloud/monitor-infrastructure/logs/export/query-exported-logs/#querying-the-archive-using-logcli) or a [read-only Loki server](https://grafana.com/docs/grafana-cloud/monitor-infrastructure/logs/export/query-exported-logs/#query-the-archive-using-loki-in-read-only-mode)) pointed directly at their target Storj bucket.
Forcing Loki client tooling seems like an unnecessary step compared with dumping a more readily readable file format... what would change if we needed to provide a non-proprietary log format?
The Loki format is open source and described in their docs: https://grafana.com/docs/loki/latest/operations/storage/#chunk-format.
The raw log chunks are compressed and have additional binary metadata. They can be converted to a readable text format with the Loki chunks-inspect tool: https://github.com/grafana/loki/tree/main/cmd/chunks-inspect.
Then, users can do whatever they want with them.
It's always nice to have human-readable logs without having to learn a new tool. I think people are going to want text logs readily available, but maybe that's worth testing in a beta to see if anybody cares.
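
To make the trade-off in this thread more concrete, here is a minimal sketch of flattening Loki query results into plain text lines. It assumes the JSON shape that Loki's query API returns for stream results (`data.result[].values` as `[timestamp_ns, line]` pairs); the function name and the output layout are hypothetical, and the sketch does not read raw chunk files.

```python
import json
from datetime import datetime, timezone


def to_text_lines(response_json):
    """Flatten a Loki 'streams' query response into sorted text lines."""
    parsed = json.loads(response_json)
    entries = []
    for stream in parsed["data"]["result"]:
        for ts_ns, line in stream["values"]:
            entries.append((int(ts_ns), line))
    entries.sort()  # oldest entry first, across all streams

    out = []
    for ts_ns, line in entries:
        # Sub-second precision is dropped in this illustrative layout.
        ts = datetime.fromtimestamp(ts_ns // 1_000_000_000, tz=timezone.utc)
        out.append(f"{ts.strftime('%Y-%m-%dT%H:%M:%SZ')} {line}")
    return out
```

Whether customers would want such a conversion shipped with the feature, or are content with LogCLI, is the point the beta could settle.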
### Open question
- Should we log linksharing requests other than those for raw content, such as listing buckets and prefixes, displaying the object map, etc.?
I think that would be useful. Maybe we could make it clear where these requests come from with the user agent.
#### Configuring a bucket for access logs
The customer will be able to turn on access logs per bucket. By default, a bucket does not generate access logs. The customer may decide later to turn off the access logs for the bucket.
Which part of the stack knows which buckets have logging enabled? How does that information make it down to the Loki server?
This is yet to be decided. It could be another column in the satellite's bucket_metainfo table, or a separate registry.
For the MVP, we can manually configure the respective components:
- Linksharing: the list of project-bucket pairs to generate access logs for.
- The Loki distribution job: the S3 credentials to the target customer bucket for each Tenant ID.
When we have some experience with the MVP, we'll know best how to improve the config and communicate it across the stack.
I updated the doc and added an MVP section at the end.
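
The manual MVP configuration described in this reply could be modeled roughly as follows. This is only a sketch: the type, field, and function names, as well as all sample values, are hypothetical, and the real components may store this configuration quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class S3Credentials:
    """Hypothetical S3 credentials for a customer's target bucket."""
    endpoint: str
    bucket: str
    access_key: str
    secret_key: str


# Linksharing side: project-bucket pairs to generate access logs for.
LOGGED_BUCKETS = {
    ("project-a", "photos"),
    ("project-b", "backups"),
}

# Loki distribution job side: target bucket credentials per Tenant ID.
DESTINATIONS = {
    "tenant-a": S3Credentials("https://gateway.example", "access-logs", "AK", "SK"),
}


def should_log(project_id, bucket, logged=LOGGED_BUCKETS):
    """Access logs are off by default; only listed pairs generate them."""
    return (project_id, bucket) in logged


def destination_for(tenant_id, destinations=DESTINATIONS):
    """Return the target credentials for a tenant, or None if unconfigured."""
    return destinations.get(tenant_id)
```

Keeping the two lookups separate mirrors the split above: Linksharing only needs to know which buckets are logged, while the distribution job only needs to know where each tenant's logs go.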