feat: add pattern match line filter #12398

Merged
merged 6 commits on Apr 2, 2024

Contributor

@kolesnikovae kolesnikovae commented Mar 29, 2024

The PR introduces new line filter operators for pattern matching, |> and !>. Although pattern filtering covers only a subset of what the regex filter can express, we decided to extend the syntax explicitly.

The filter presumes that placeholders mask arbitrary non-empty literals. Thus, the "<_>" pattern matches any non-empty line, and the "" pattern matches only empty lines. The semantics are identical to those of the pattern parse stage, with the exception that named captures are not allowed: fields are never extracted during filtering.
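
A rough illustration of these semantics, not the matcher added by this PR: the sketch below translates a filter pattern into an anchored Go regular expression in which every `<_>` becomes `.+`. The function name `compilePatternFilter` is hypothetical, and whole-line anchoring is an assumption inferred from the statement that `""` only matches empty lines; the real implementation may behave differently in edge cases.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// compilePatternFilter is a hypothetical helper (not part of Loki).
// It splits the pattern on the unnamed placeholder "<_>", escapes the
// literal parts, and joins them with ".+" so that every placeholder
// must cover at least one character. The expression is anchored at
// both ends, which is an assumption about the filter semantics.
func compilePatternFilter(pattern string) *regexp.Regexp {
	literals := strings.Split(pattern, "<_>")
	for i, lit := range literals {
		literals[i] = regexp.QuoteMeta(lit)
	}
	// (?s) lets "." match newlines as well.
	return regexp.MustCompile(`(?s)^` + strings.Join(literals, `.+`) + `$`)
}

func main() {
	anyLine := compilePatternFilter("<_>")
	emptyOnly := compilePatternFilter("")
	durErr := compilePatternFilter("dur=<_> err=<_>")

	fmt.Println(anyLine.MatchString("hello"))         // true
	fmt.Println(anyLine.MatchString(""))              // false
	fmt.Println(emptyOnly.MatchString(""))            // true
	fmt.Println(emptyOnly.MatchString("hello"))       // false
	fmt.Println(durErr.MatchString("dur=1s err=nil")) // true
	fmt.Println(durErr.MatchString("dur= err=nil"))   // false
}
```

The translation to a regular expression is only possible because, as noted above, pattern filtering is a subset of what the regex filter can express.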

The PR does not include documentation. At this point, the new feature is somewhat experimental and should be used with caution. The same applies to the frontend query builder and syntax checker.

Fixes #12347

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • CHANGELOG.md updated
    • If the change is worth mentioning in the release notes, add add-to-release-notes label
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • For Helm chart changes bump the Helm chart version in production/helm/loki/Chart.yaml and update production/helm/loki/CHANGELOG.md and production/helm/loki/README.md. Example PR
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

ErrNoCapture = errors.New("at least one capture is required")
ErrInvalidExpr = errors.New("invalid expression")
ErrNoCapture = errors.New("at least one capture is required")
ErrCaptureNotAllowed = errors.New("named captures are not allowed")
Contributor Author

@kolesnikovae kolesnikovae Mar 29, 2024

This is debatable, but I think it makes sense: if a user tries to parse fields at the filter stage, it's better to return an explicit error rather than silently ignore the user's intent. In the future, the restriction can be removed.

Note that we're using an unnamed placeholder <_>. I think we could use a new capture identifier (such as <*>) to emphasize the difference with the pattern parse stage. However, it feels like a new syntax for pattern matching, which may confuse users. I would like to hear others' thoughts on this.

Contributor

I don't have strong opinions on using _ vs * but I definitely agree we should fail fast.

Collaborator

I would vote for <_> as the behavior between these two examples feels consistent to me:

|> "dur=<_> err=<_>"
| pattern "dur=<_> err=<_>"

I think of that as telling Loki to ignore any of the content in <_>

I'm not sure I see the reason to use <*> or maybe I'm misunderstanding something?

Contributor Author

This popped up in the Slack conversation, so I brought it here.

Thank you, guys, for sharing your thoughts. I'll keep <_>.

Contributor

@cyriltovena cyriltovena left a comment

Looks good to me !

@@ -539,6 +581,10 @@ func TestStringer(t *testing.T) {
in: `{app="foo"} !~ ip("127.0.0.1") or "foo"`,
out: `{app="foo"} !~ ip("127.0.0.1") !~ "foo"`,
},
{
in: `{app="foo"} !> "<_> foo <_>" or "foo <_>" !> "foo <_> baz"`,
Contributor

I think the or is a bug here and should not be supported, but I see it elsewhere. I didn't know about this. cc @kavirajk

Contributor Author

It looks legit to me: note De Morgan's law, as in the test case a few lines above (!(A || B) == !A && !B):

not match ("<_> foo <_>" or "foo <_>") is the same as
not match "<_> foo <_>" and not match "foo <_>"

@cyriltovena
Contributor

please add a changelog item btw !

@kolesnikovae kolesnikovae marked this pull request as ready for review April 1, 2024 05:22
@kolesnikovae kolesnikovae requested review from a team as code owners April 1, 2024 05:22
@cyriltovena cyriltovena merged commit 36c703d into main Apr 2, 2024
10 checks passed
@cyriltovena cyriltovena deleted the kolesnikovae-pattern-filter branch April 2, 2024 06:47
grafanabot added a commit that referenced this pull request Apr 2, 2024
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
Development

Successfully merging this pull request may close these issues.

Filter with the pattern
3 participants