feat(processors.regex): Allow batch transforms using named groups #13971

Merged (7 commits) on Sep 28, 2023

Member

@srebhan srebhan commented Sep 21, 2023

resolves #5505

This PR restructures and simplifies the regex processor and then adds the ability to transform multiple tags/fields using named capture groups. Along the way, it unifies wildcard support for tags and fields (previously only possible for tags) by using standard filter expressions.
Finally, it improves the README by explaining the behavior of the available operations and modes in more detail.

@telegraf-tiger telegraf-tiger bot added feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin plugin/processor labels Sep 21, 2023
@srebhan srebhan added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Sep 25, 2023
Contributor

@powersj powersj left a comment

As usual, the code looks fine, but I'll go through it again during a final review. My questions are around the config and ensuring we aren't breaking existing users.

I also question the addition of the append boolean. This seems like scope creep.

edit: I of course forgot to say, thank you for this. This was clearly a ton of work ;) The refactor makes this much easier to view in the future.

For metrics transforms, `key` denotes the element that should be
transformed. Furthermore, `result_key` allows control over the behavior applied
in case the resulting `tag` or `field` name already exists.

Contributor

You removed these comments? Are they covered below now?

Member Author

Yes, as in my view the behavior is now described in the sample.conf part...

plugins/processors/regex/README.md
apply the conversion. If any of the given criteria does not apply the conversion
is not applied to the metric.

The `replacement` option specifies the value of the resulting tag or field. It
Contributor

Should there be another header here for replacement?

Member Author

No, it describes how the replacement setting is interpreted in this setup. That's also why it repeats for every subsection...

to be named in the `pattern`. Additional non-capturing ones or other
expressions are allowed. Furthermore, neither `replacement` nor `result_key`
can be set as the resulting tag/field name is the name of the group and the
value corresponds to the group's content.
Contributor

Worth an example below here? Or even below each of these?

Member Author

There is an example in ### Named groups and I think also for each of the sections...

@telegraf-tiger
Contributor

Download PR build artifacts for linux_amd64.tar.gz, darwin_amd64.tar.gz, and windows_amd64.zip.
Downloads for additional architectures and packages are available below.

🥳 This pull request decreases the Telegraf binary size by 2.15 % for linux amd64 (new size: 197.7 MB, nightly size: 202.0 MB)

📦 Additional PR build artifacts

Artifact URLs

| DEB | RPM | TAR GZ | ZIP |
| --- | --- | --- | --- |
| amd64.deb | aarch64.rpm | darwin_amd64.tar.gz | windows_amd64.zip |
| arm64.deb | armel.rpm | darwin_arm64.tar.gz | windows_arm64.zip |
| armel.deb | armv6hl.rpm | freebsd_amd64.tar.gz | windows_i386.zip |
| armhf.deb | i386.rpm | freebsd_armv7.tar.gz | |
| i386.deb | ppc64le.rpm | freebsd_i386.tar.gz | |
| mips.deb | riscv64.rpm | linux_amd64.tar.gz | |
| mipsel.deb | s390x.rpm | linux_arm64.tar.gz | |
| ppc64el.deb | x86_64.rpm | linux_armel.tar.gz | |
| riscv64.deb | | linux_armhf.tar.gz | |
| s390x.deb | | linux_i386.tar.gz | |
| | | linux_mips.tar.gz | |
| | | linux_mipsel.tar.gz | |
| | | linux_ppc64le.tar.gz | |
| | | linux_riscv64.tar.gz | |
| | | linux_s390x.tar.gz | |

@powersj powersj merged commit d07701f into influxdata:master Sep 28, 2023
4 checks passed
@github-actions github-actions bot added this to the v1.29.0 milestone Sep 28, 2023
Successfully merging this pull request may close these issues.

Allow creation of multiple tags/fields with regex processor