Missing log labels: instance_id, zone #493

Closed
b3nk3nobi opened this issue Mar 24, 2022 · 6 comments

@b3nk3nobi

In the Logs Explorer, log entries carry no instance_id or zone information:
[screenshot: vivaldi__2022_03_24_08-53-05]

I cannot filter logs by instance:
[screenshot: vivaldi__2022_03_24_09-38-13]

The VMs have no service account assigned, and that cannot be changed because they are in production (I can't stop them), so I'm using a file with the SA private key located at /etc/google/auth/application_default_credentials.json.

Using curl to get these values on the VM itself works fine:
[screenshot: WindowsTerminal__2022_03_24_09-00-20]
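
The curl screenshot is not reproduced in this text. As a rough equivalent, the sketch below reads the same two values in Go, assuming the standard Compute Engine metadata endpoints under http://metadata.google.internal/computeMetadata/v1/instance/ and the Metadata-Flavor: Google header that the metadata server requires. The helper names fetchMetadata and shortZone are illustrative and not part of the Ops Agent. It only returns useful output when run on a Compute Engine VM.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// metadataBase is the standard Compute Engine instance metadata endpoint.
const metadataBase = "http://metadata.google.internal/computeMetadata/v1/instance/"

// fetchMetadata (hypothetical helper) reads one instance metadata value,
// for example "id" or "zone".
func fetchMetadata(client *http.Client, base, key string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, base+key, nil)
	if err != nil {
		return "", err
	}
	// The metadata server rejects requests that lack this header.
	req.Header.Set("Metadata-Flavor", "Google")
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("metadata key %q: unexpected status %s", key, resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(body)), nil
}

// shortZone turns "projects/123/zones/europe-west1-b" into "europe-west1-b".
// A value without any slash is returned unchanged.
func shortZone(full string) string {
	return full[strings.LastIndex(full, "/")+1:]
}

func main() {
	client := &http.Client{Timeout: 2 * time.Second}
	for _, key := range []string{"id", "zone"} {
		value, err := fetchMetadata(client, metadataBase, key)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		if key == "zone" {
			value = shortZone(value)
		}
		fmt.Printf("%s = %s\n", key, value)
	}
}
```

If both values print, the metadata server is reachable from the VM, which matches what the screenshot reportedly showed. The labels are therefore missing for some other reason than connectivity.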

  • logging-module.log:
[2022/03/24 07:15:34] [ info] [engine] started (pid=192594)
[2022/03/24 07:15:34] [ info] [storage] version=1.1.5, initializing...
[2022/03/24 07:15:34] [ info] [storage] root path '/var/lib/google-cloud-ops-agent/fluent-bit/buffers'
[2022/03/24 07:15:34] [ info] [storage] normal synchronization mode, checksum enabled, max_chunks_up=128
[2022/03/24 07:15:34] [ info] [storage] backlog input plugin: storage_backlog.3
[2022/03/24 07:15:34] [ info] [cmetrics] version=0.2.2
[2022/03/24 07:15:34] [ info] [input:storage_backlog:storage_backlog.3] queue memory limit: 47.7M
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.0] metadata_server set to http://metadata.google.internal
[2022/03/24 07:15:34] [ info] [oauth2] HTTP Status=200
[2022/03/24 07:15:34] [ info] [oauth2] access token from 'www.googleapis.com:443' retrieved
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.1] metadata_server set to http://metadata.google.internal
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.0] worker #6 started
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.0] worker #7 started
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.0] worker #5 started
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.0] worker #4 started
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.0] worker #3 started
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.0] worker #2 started
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.0] worker #1 started
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.0] worker #0 started
[2022/03/24 07:15:34] [ info] [oauth2] HTTP Status=200
[2022/03/24 07:15:34] [ info] [oauth2] access token from 'www.googleapis.com:443' retrieved
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.1] worker #6 started
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.1] worker #7 started
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.1] worker #5 started
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.1] worker #4 started
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.1] worker #3 started
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.1] worker #2 started
[2022/03/24 07:15:34] [ info] [output:prometheus_exporter:prometheus_exporter.2] listening iface=0.0.0.0 tcp_port=20202
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.1] worker #1 started
[2022/03/24 07:15:34] [ info] [output:stackdriver:stackdriver.1] worker #0 started
[2022/03/24 07:15:34] [ info] [input:tail:tail.1] inotify_fs_add(): inode=2592 watch_fd=1 name=/var/log/syslog
[2022/03/24 07:15:34] [ info] [input:tail:tail.2] inotify_fs_add(): inode=534597 watch_fd=1 name=/var/log/google-cloud-ops-agent/subagents/logging-module.log
[2022/03/24 08:12:16] [ info] [oauth2] HTTP Status=200
[2022/03/24 08:12:16] [ info] [oauth2] access token from 'www.googleapis.com:443' retrieved
[2022/03/24 08:12:17] [ info] [oauth2] HTTP Status=200
[2022/03/24 08:12:17] [ info] [oauth2] access token from 'www.googleapis.com:443' retrieved
  • version: google-cloud-ops-agent 2.12.0~ubuntu20.04

Am I missing something or is it working as it should be?

@sophieyfang
Contributor

@b3nk3nobi
Author

Yes, I have followed the instructions. Without them, there are a lot of authorization errors in logging-module.log.

@logicbomb421

logicbomb421 commented Aug 26, 2022

I am also facing this issue.

I am in a similar situation to @b3nk3nobi: we don't run VMs with attached service accounts, and can't change that. I have followed the instructions in the "authenticating the agent" docs by creating a service account key, placing it in a secure location, and then setting GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account/key.json. The logging service logs show successful oauth2 communication, and logs do make it to Stackdriver, so I know the key is being picked up.

I am also able to receive the instance ID and zone via the metadata.google.internal HTTP call on the VM in question.

Without the resource.labels.instance_id field correctly set, logs don't properly associate with their originating VM, causing issues when creating dashboards, searching logs, etc.

Please let me know if more detail is needed. Thanks!


Update

Got clever and thought I'd found a workaround, but it didn't work. Unsure if this is related, or should be a separate issue (please let me know).

Attempting to use the modify_fields processor to set a static value here:

processors:
  fix_missing_labels:
    type: modify_fields
    fields:
      resource.labels.instance_id:
        static_value: my-instance-name

...results in the following error:

The agent config file is not valid. Detailed error: [18:36] "fields[resource.labels.instance_id]": 1:28: error: field "resource.labels.instance_id" not found

My understanding based on the docs is that the destination field (resource.labels.instance_id) needs to conform to the LogEntry object spec, which it does as far as I can tell.

I did notice that the docs for the exclude_logs processor say only httpRequest, jsonPayload, labels, operation, severity, and sourceLocation can be accessed. Perhaps this is what's going on for modify_fields as well?

If so, I would suggest updating the documentation to include this information under that section as well.

@hsmatulisgoogle
Contributor

These labels are missing because of how Fluent Bit operates when it is not authenticated through the metadata server: it only auto-populates these fields when using metadata_server_auth authentication.

If you just want to be able to identify VMs, as a workaround, versions after 2.15.0 include #544, which auto-populates the resource_name label.
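
To make the workaround concrete, here is a minimal sketch of a Logs Explorer filter built on that label. It assumes the full label key is compute.googleapis.com/resource_name, which the comment above does not spell out, so check the key on an actual log entry first. The function name filterByResourceName and the VM name are illustrative.

```go
package main

import (
	"fmt"
	"strconv"
)

// resourceNameLabel is an ASSUMPTION about the full key of the
// "resource_name" label mentioned above; verify it on a real log entry.
const resourceNameLabel = "compute.googleapis.com/resource_name"

// filterByResourceName (hypothetical helper) builds a Logging query that
// matches entries whose resource_name label equals the given VM name.
// Both the key and the value are quoted because the key contains "." and "/".
func filterByResourceName(vm string) string {
	return fmt.Sprintf("labels.%s=%s", strconv.Quote(resourceNameLabel), strconv.Quote(vm))
}

func main() {
	// Paste the printed line into the Logs Explorer query box.
	fmt.Println(filterByResourceName("my-instance-name"))
}
```

This filters on a log entry label rather than on resource.labels.instance_id, so it identifies the VM without fixing the missing monitored-resource labels.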

Replying to @logicbomb421 's attempted work-around:

  • It is true that the exclude_logs and modify_fields processors use the same validation logic.
  • That logic maps these fields to the underlying Fluent Bit keys with the corresponding meaning (these are custom to Fluent Bit).
    • In this case, logging.googleapis.com/monitored_resource is the underlying Fluent Bit key. Fluent Bit checks whether a log entry has this key and, if so, attempts to use it as a monitored resource spec.
    • There is an internal mapping of these fields that effectively behaves as a whitelist, and it does not map the monitored_resource key:
      var logEntryRootStructMapToFluentBit = map[string]string{
          "labels":         "logging.googleapis.com/labels",
          "operation":      "logging.googleapis.com/operation",
          "sourceLocation": "logging.googleapis.com/sourceLocation",
          // TODO: This needs to be the same as confgenerator.HttpRequestKey. Importing
          // that package here results in a circular import. That should move somewhere
          // better, and once it does we can use that here.
          "httpRequest": "logging.googleapis.com/httpRequest",
      }
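
The effect of that whitelist can be sketched as follows. This is a simplified illustration, not the agent's actual validation code: it assumes the lookup keys off the first path segment of the field, and lookupFluentBitKey is a hypothetical helper. It also ignores jsonPayload and severity, which the docs list as accessible but which do not appear in this map, so they are presumably handled elsewhere.

```go
package main

import (
	"fmt"
	"strings"
)

// Copied from the snippet quoted above.
var logEntryRootStructMapToFluentBit = map[string]string{
	"labels":         "logging.googleapis.com/labels",
	"operation":      "logging.googleapis.com/operation",
	"sourceLocation": "logging.googleapis.com/sourceLocation",
	"httpRequest":    "logging.googleapis.com/httpRequest",
}

// lookupFluentBitKey (hypothetical helper) resolves the root of a LogEntry
// field path to its Fluent Bit key. Roots that are not in the map are
// rejected, which is why the map behaves as a whitelist.
func lookupFluentBitKey(field string) (string, error) {
	root := strings.SplitN(field, ".", 2)[0]
	key, ok := logEntryRootStructMapToFluentBit[root]
	if !ok {
		return "", fmt.Errorf("field %q not found", field)
	}
	return key, nil
}

func main() {
	for _, field := range []string{"labels.env", "resource.labels.instance_id"} {
		key, err := lookupFluentBitKey(field)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", field, key)
	}
}
```

Under these assumptions, a field rooted at "resource" has no entry in the map, so the lookup fails in the same way as the "not found" error reported for the modify_fields config.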


github-actions bot commented Jan 7, 2025

This issue was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jan 7, 2025

Closed as inactive. Feel free to reopen if this issue is still relevant.

@github-actions github-actions bot closed this as not planned Jan 22, 2025