Flag categorize labels on streams response (#10419)
We recently introduced support for ingesting and querying structured
metadata in Loki. This adds a new dimension to Loki's labels since now
we arguably have three categories of labels: _stream_, _structured
metadata_, and _parsed_ labels.

Depending on the origin of the labels, they should be used in LogQL
expressions differently to achieve optimal performance. _stream_ labels
should be added to stream matchers, _structured metadata_ labels should
be used in a filter expression before any parsing expression, and
_parsed_ labels should be placed after the parser expression extracting
them.
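
As a sketch (the label names here are illustrative, borrowed from the example below), that ordering looks like:

```logql
{cluster="us-central", namespace="loki"} | traceID="68810cf0c94bfcca" | logfmt | level="info"
```

Here `cluster` and `namespace` are _stream_ labels and belong in the matchers, `traceID` is _structured metadata_ filtered before any parser runs, and `level` is a _parsed_ label filtered after `logfmt` extracts it.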

The Grafana UI struggles with this same problem. Before
grafana/grafana#73955, the filtering
functionality in Grafana was broken since it could not distinguish
between _stream_ and _structured metadata_ labels. Also, as soon as a
parser expression was added to the query, filters added by Grafana would
be appended to the end of the query regardless of the label category.
The PR above implements a workaround for this problem, but a better
API on Loki's end is needed to cover all corner cases.

Loki currently returns the following JSON for log queries:
```json
...
{
  "stream": {
    "cluster": "us-central",
    "container": "query-frontend",
    "namespace": "loki",
    "level": "info",
    "traceID": "68810cf0c94bfcca"
  },
  "values": [
    [
      "1693996529000222496",
      "1693996529000222496 aaaaaaaaa.....\n"
    ],
    ...
},
{
  "stream": {
    "cluster": "us-central",
    "container": "query-frontend",
    "namespace": "loki",
    "level": "debug",
    "traceID": "a7116cj54c4bjz8s"
  },
  "values": [
    [
      "1693996529000222497",
      "1693996529000222497 bbbbbbbbb.....\n"
    ],
    ...
},
...
```

As can be seen, there is no way to distinguish the category of each
label.

This PR introduces a new flag `X-Loki-Response-Encoding-Flags:
categorize-labels` that makes Loki return categorized labels as follows:

```json
...
{
  "stream": {
    "cluster": "us-central",
    "container": "query-frontend",
    "namespace": "loki"
  },
  "values": [
    [
      "1693996529000222496",
      "1693996529000222496 aaaaaaaaa.....\n",
      {
        "structuredMetadata": {
          "traceID": "68810cf0c94bfcca"
        },
        "parsed": {
          "level": "info"
        }
      }
    ],
    [
      "1693996529000222497",
      "1693996529000222497 bbbbbbbbb.....\n",
      {
        "structuredMetadata": {
          "traceID": "a7116cj54c4bjz8s"
        },
        "parsed": {
          "level": "debug"
        }
      }
    ],
    ...
},
...
```
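
With the flag set, each entry in `values` gains an optional third element carrying the categorized labels. A minimal sketch of how a client could consume this shape (the type and function names here are mine for illustration, not Loki's internal types):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// categorizedEntry mirrors the optional third element of each value tuple
// when the categorize-labels encoding flag is set. Field names follow the
// JSON shown above.
type categorizedEntry struct {
	StructuredMetadata map[string]string `json:"structuredMetadata"`
	Parsed             map[string]string `json:"parsed"`
}

// decodeValue splits one ["<ts>", "<line>", {<labels>}] tuple into its parts.
// The third element is optional: responses without the flag carry only two.
func decodeValue(raw []byte) (ts, line string, cat *categorizedEntry, err error) {
	var parts []json.RawMessage
	if err = json.Unmarshal(raw, &parts); err != nil {
		return
	}
	if len(parts) < 2 {
		err = fmt.Errorf("expected at least [ts, line], got %d elements", len(parts))
		return
	}
	if err = json.Unmarshal(parts[0], &ts); err != nil {
		return
	}
	if err = json.Unmarshal(parts[1], &line); err != nil {
		return
	}
	if len(parts) == 3 {
		cat = &categorizedEntry{}
		err = json.Unmarshal(parts[2], cat)
	}
	return
}

func main() {
	raw := []byte(`["1693996529000222496", "a log line", {"structuredMetadata": {"traceID": "68810cf0c94bfcca"}, "parsed": {"level": "info"}}]`)
	ts, line, cat, err := decodeValue(raw)
	if err != nil {
		panic(err)
	}
	// Prints: 1693996529000222496 a log line 68810cf0c94bfcca info
	fmt.Println(ts, line, cat.StructuredMetadata["traceID"], cat.Parsed["level"])
}
```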

Note that this PR only supports log queries, not metric queries. From a
UX perspective, being able to categorize labels in metric queries
doesn't have any benefit yet. That said, supporting this for
metric queries would require some minor refactoring on top of what has
been implemented here. If we decide to do that, I think we should do it
in a separate PR to avoid making this one even larger.

I also decided to leave out support for Tail queries to avoid making
this PR even larger. Once this one gets merged, we can work to support
tailing.

---

**Note to reviewers**

This PR is large since we need to forward categorized labels all over the
codebase (from parsing logs all the way to marshaling). Fortunately,
many of the changes come from updating tests and refactoring iterators.

Tested out in a dev cell with the query `{stream="stdout"} | label_format
new="text"`.
- Without the new flag:
```
$ http http://127.0.0.1:3100/loki/api/v1/query_range\?direction\=BACKWARD\&end\=1693996529322486000\&limit\=30\&query\=%7Bstream%3D%22stdout%22%7D+%7C+label_format+new%3D%22text%22\&start\=1693992929322486000 X-Scope-Orgid:REDACTED
{
    "data": {
        "result": [
            {
                "stream": {
                    "new": "text",
                    "pod": "loki-canary-986bd6f4b-xqmb7",
                    "stream": "stdout"
                },
                "values": [
                    [
                        "1693996529000222496",
                        "1693996529000222496 pppppppppppp...\n"
                    ],
                    [
                        "1693996528499160852",
                        "1693996528499160852 pppppppppppp...\n"
                    ],
...
```

- With the new flag
```
$ http http://127.0.0.1:3100/loki/api/v1/query_range\?direction\=BACKWARD\&end\=1693996529322486000\&limit\=30\&query\=%7Bstream%3D%22stdout%22%7D+%7C+label_format+new%3D%22text%22\&start\=1693992929322486000 X-Scope-Orgid:REDACTED X-Loki-Response-Encoding-Flags:categorize-labels
{
    "data": {
        "encodingFlags": [
            "categorize-labels"
        ],
        "result": [
            {
                "stream": {
                    "pod": "loki-canary-986bd6f4b-xqmb7",
                    "stream": "stdout"
                },
                "values": [
                    [
                        "1693996529000222496",
                        "1693996529000222496 pppppppppppp...\n",
                        {
                            "parsed": {
                                "new": "text"
                            }
                        }
                    ],
                    [
                        "1693996528499160852",
                        "1693996528499160852 pppppppppppp...\n",
                        {
                            "parsed": {
                                "new": "text"
                            }
                        }
                    ],
...
```
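
The integration client changes below add a small variadic `Header` parameter so tests can opt in to the flag per request. A self-contained sketch of that pattern (the `newRequest` helper here is my own simplification of the client's `request` method):

```go
package main

import (
	"fmt"
	"net/http"
)

// Header mirrors the small name/value pair added to the integration client
// in this PR.
type Header struct {
	Name, Value string
}

// newRequest builds a GET request and applies any extra headers, following
// the variadic pattern used by the integration client's request helper.
func newRequest(url string, extraHeaders ...Header) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	for _, h := range extraHeaders {
		req.Header.Add(h.Name, h.Value)
	}
	return req, nil
}

func main() {
	req, err := newRequest(
		"http://127.0.0.1:3100/loki/api/v1/query_range",
		Header{Name: "X-Loki-Response-Encoding-Flags", Value: "categorize-labels"},
	)
	if err != nil {
		panic(err)
	}
	// Prints: categorize-labels
	fmt.Println(req.Header.Get("X-Loki-Response-Encoding-Flags"))
}
```

Callers that don't pass any headers are unaffected, which keeps the existing test suite source-compatible.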
salvacorts authored Oct 25, 2023
1 parent 60ea954 commit 52a3f16
Showing 46 changed files with 2,190 additions and 813 deletions.
2 changes: 1 addition & 1 deletion go.mod
@@ -123,7 +123,7 @@ require (
github.com/efficientgo/core v1.0.0-rc.2
github.com/fsnotify/fsnotify v1.6.0
github.com/gogo/googleapis v1.4.0
-	github.com/grafana/loki/pkg/push v0.0.0-20231017172654-cfc4f0e84adc
+	github.com/grafana/loki/pkg/push v0.0.0-20231023154132-0a7737e7c7eb
github.com/heroku/x v0.0.61
github.com/influxdata/tdigest v0.0.2-0.20210216194612-fc98d27c9e8b
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/prometheus v0.86.0
69 changes: 55 additions & 14 deletions integration/client/client.go
@@ -13,6 +13,7 @@ import (
"strings"
"time"

"github.com/buger/jsonparser"
"github.com/grafana/dskit/user"
"github.com/prometheus/prometheus/model/labels"
"go.opentelemetry.io/collector/pdata/pcommon"
@@ -335,10 +336,40 @@ func (c *Client) GetDeleteRequests() (DeleteRequests, error) {
return deleteReqs, nil
}

type Entry []string

func (e *Entry) UnmarshalJSON(data []byte) error {
if *e == nil {
*e = make([]string, 0, 3)
}

var parseError error
_, err := jsonparser.ArrayEach(data, func(value []byte, t jsonparser.ValueType, _ int, _ error) {
// The timestamp and the log line are strings. The labels are a JSON object,
// but we parse all of them as strings.
if t != jsonparser.String && t != jsonparser.Object {
parseError = jsonparser.MalformedStringError
return
}

v, err := jsonparser.ParseString(value)
if err != nil {
parseError = err
return
}
*e = append(*e, v)
})

if parseError != nil {
return parseError
}
return err
}

// StreamValues holds a label key value pairs for the Stream and a list of a list of values
type StreamValues struct {
Stream map[string]string
-	Values [][]string
+	Values []Entry
}

// MatrixValues holds a label key value pairs for the metric and a list of a list of values
@@ -377,17 +408,19 @@ func (a *VectorValues) UnmarshalJSON(b []byte) error {

// DataType holds the result type and a list of StreamValues
type DataType struct {
-	ResultType string
-	Stream     []StreamValues
-	Matrix     []MatrixValues
-	Vector     []VectorValues
+	ResultType    string
+	Stream        []StreamValues
+	Matrix        []MatrixValues
+	Vector        []VectorValues
+	EncodingFlags []string
}

func (a *DataType) UnmarshalJSON(b []byte) error {
// get the result type
var s struct {
-		ResultType string          `json:"resultType"`
-		Result     json.RawMessage `json:"result"`
+		ResultType    string          `json:"resultType"`
+		EncodingFlags []string        `json:"encodingFlags"`
+		Result        json.RawMessage `json:"result"`
}
if err := json.Unmarshal(b, &s); err != nil {
return err
@@ -410,6 +443,7 @@ func (a *DataType) UnmarshalJSON(b []byte) error {
return fmt.Errorf("unknown result type %s", s.ResultType)
}
a.ResultType = s.ResultType
a.EncodingFlags = s.EncodingFlags
return nil
}

@@ -434,12 +468,16 @@ type Rules struct {
Rules []interface{}
}

type Header struct {
Name, Value string
}

// RunRangeQuery runs a query and returns an error if anything went wrong
-func (c *Client) RunRangeQuery(ctx context.Context, query string) (*Response, error) {
+func (c *Client) RunRangeQuery(ctx context.Context, query string, extraHeaders ...Header) (*Response, error) {
ctx, cancelFunc := context.WithTimeout(ctx, requestTimeout)
defer cancelFunc()

-	buf, statusCode, err := c.run(ctx, c.rangeQueryURL(query))
+	buf, statusCode, err := c.run(ctx, c.rangeQueryURL(query), extraHeaders...)
if err != nil {
return nil, err
}
@@ -448,7 +486,7 @@ func (c *Client) RunRangeQuery(ctx context.Context, query string) (*Response, er
}

// RunQuery runs a query and returns an error if anything went wrong
func (c *Client) RunQuery(ctx context.Context, query string) (*Response, error) {
func (c *Client) RunQuery(ctx context.Context, query string, extraHeaders ...Header) (*Response, error) {
ctx, cancelFunc := context.WithTimeout(ctx, requestTimeout)
defer cancelFunc()

@@ -463,7 +501,7 @@ func (c *Client) RunQuery(ctx context.Context, query string) (*Response, error)
u.Path = "/loki/api/v1/query"
u.RawQuery = v.Encode()

-	buf, statusCode, err := c.run(ctx, u.String())
+	buf, statusCode, err := c.run(ctx, u.String(), extraHeaders...)
if err != nil {
return nil, err
}
@@ -617,18 +655,21 @@ func (c *Client) Series(ctx context.Context, matcher string) ([]map[string]strin
return values.Data, nil
}

-func (c *Client) request(ctx context.Context, method string, url string) (*http.Request, error) {
+func (c *Client) request(ctx context.Context, method string, url string, extraHeaders ...Header) (*http.Request, error) {
ctx = user.InjectOrgID(ctx, c.instanceID)
req, err := http.NewRequestWithContext(ctx, method, url, nil)
if err != nil {
return nil, err
}
req.Header.Set("X-Scope-OrgID", c.instanceID)
for _, h := range extraHeaders {
req.Header.Add(h.Name, h.Value)
}
return req, nil
}

-func (c *Client) run(ctx context.Context, u string) ([]byte, int, error) {
-	req, err := c.request(ctx, "GET", u)
+func (c *Client) run(ctx context.Context, u string, extraHeaders ...Header) ([]byte, int, error) {
+	req, err := c.request(ctx, "GET", u, extraHeaders...)
if err != nil {
return nil, 0, err
}
2 changes: 1 addition & 1 deletion integration/loki_micro_services_delete_test.go
@@ -408,7 +408,7 @@ func getMetricValue(t *testing.T, metricName, metrics string) float64 {
}

func pushRequestToClientStreamValues(t *testing.T, p pushRequest) []client.StreamValues {
-	logsByStream := map[string][][]string{}
+	logsByStream := map[string][]client.Entry{}
for _, entry := range p.entries {
lb := labels.NewBuilder(labels.FromMap(p.stream))
for _, l := range entry.StructuredMetadata {
