Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support config.d style directories for config merging #265

Open
jeremyvisser opened this issue Nov 8, 2021 · 3 comments
Open

Support config.d style directories for config merging #265

jeremyvisser opened this issue Nov 8, 2021 · 3 comments

Comments

@jeremyvisser
Copy link

jeremyvisser commented Nov 8, 2021

My team is planning an Ops Agent rollout, where we will make use of the default system metrics and logging, including some custom metrics and custom logging rules, but also anticipate that downstream teams will their own custom metrics and logging rules.

This would require merging:

  • Default configuration
  • Platform team’s base configuration
    • Custom metrics (organisation–wide)
    • Custom logging (organisation–wide)
  • Service-specific configuration
    • Custom metrics (per–service)
    • Custom logging (per–service)

Since different teams would be responsible for different parts of the configuration, the preferred way would be to have separate configuration files which are merged at runtime.

Fluent Bit (which Ops Agent uses) supports an include syntax for configuration which suits this use case well.

But I don't see an equivalent for Ops Agent, which expects a single configuration file for all rules contained within it.

Ideally, Ops Agent would support a config.d style directory, where each file is merged structurally. I expect this to be functionally similar to how the default config is merged in confgenerator/confmerger.go, except support an arbitrary number of inputs.

Until that's implemented, we need to look at workarounds:

  • We use Puppet which has a concat module available, and multiple teams could have their configs concatenated, but YAML cannot safely be merged in a text–based fashion, due to differing whitespace.
  • We could implement our own YAML–native merging prior to deployment (Hiera has this natively, or we could write our own)
@quentinmit
Copy link
Member

Part of the reason we haven't implemented this yet is that it's unclear what the merge semantics should be. (e.g., if a processor with the same name is defined in two files, which definition(s) should be used? What if that processor name is referenced by different pipelines in different files?)

I'd love to hear your thoughts on that. What's currently implemented in confmerger.go is a very simplistic algorithm that works for the built-in case but likely wouldn't meet all the usecases of a config.d directory.

One of the options we've considered is to support multiple config files where each file is completely independent. What do you think of that?

@quentinmit
Copy link
Member

BTW, for your more immediate usecase, I'll mention that you can also use JSON for your config file, if it's easier to emit JSON with Puppet. (JSON is a valid subset of YAML.)

@jeremyvisser
Copy link
Author

jeremyvisser commented Nov 10, 2021

Yeah, the merge semantics are important to get right. Since you asked for my thoughts I’ve provided a view, but I acknowledge that this problem is indeed tricker than I first thought.

My biased opinion is that alphabetical order should be used for precedence, so that (e.g.) a numeric prefix can be used to guarantee ordering.

I think the config should be merged into a single internal structure before it starts trying to action the config, which would resolve the problem of a processor being referenced across multiple files, but merging itself is not trivial.

It would probably need to be a deep merge, but you raise valid points and me simply stating “deep merge” is likely too vague. For example, let’s say we start with this config:

receivers:
  myreceiver:
    type: files
    include_paths:
    - /var/log/myservice/info/*.log
    - /var/log/myservice/debug/*.log

And this later config is encountered:

receivers:
  myreceiver:
    type: files
    include_paths:
    - /var/log/myservice/warn/*.log

Should we result with include_paths: ['/var/log/myservice/info/*.log', '/var/log/myservice/debug/*.log', '/var/log/myservice/warn/*.log'] or include_paths: ['/var/log/myservice/warn/*.log']? You raise a valid point, and there is probably no ”right” answer to this.

Currently, the documentation tells you how to remove the default logging pipeline like this:

logging:
  service:
    pipelines:
      default_pipeline:
        receivers: []

Which sounds like the current behaviour is that a list will overwrite an existing list (not merged).

But yeah, this is subject to edge cases like what if, for example, prior config files defined receivers: [{'a':...}, {'b':...}, {'c':...}], and a later config wanted to override this by deleting only receiver c but leaving a and b intact.

What I would say is that for our team’s usage, I expect that the config would be mostly additive. For example, the platform team would probably care about platform–specific pipelines, and service teams would probably care about their own service but not being able to override platform pipelines is probably okay.

But I humbly acknowledge that doesn’t resolve all the edge cases.

Regarding JSON, that probably suffers from the same problem of deciding how to merge keys if the same ones are defined. I haven’t tested what happens, but we appear to be using yaml.Strict():

if err := yaml.UnmarshalContext(ctx, input, &config, yaml.Strict(), yaml.Validator(v)); err != nil {

Which seems to set disallowDuplicateKey = true:

https://github.com/goccy/go-yaml/blob/5f46a66b01363dc92d9fca41d7d79822607b4192/option.go#L53-L59

So with the JSON approach (assuming no changes to current behaviour) we’d still need to be strict about what structure users are expected to customise, or do our own pre-merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants