
SpanKind support for badger #6376

Open
wants to merge 28 commits into base: main

Conversation

Manik2708
Contributor

Which problem is this PR solving?

Description of the changes

  • Queries with span kind will now be supported for Badger

How was this change tested?

  • Writing unit tests

Checklist

@Manik2708 Manik2708 requested a review from a team as a code owner December 17, 2024 07:43
@Manik2708 Manik2708 requested a review from jkowall December 17, 2024 07:43
@dosubot dosubot bot added enhancement storage/badger Issues related to badger storage labels Dec 17, 2024
@Manik2708
Contributor Author

Manik2708 commented Dec 17, 2024

I have changed the structure of the cache, which raises these concerns:

  1. Will a 3D map be a viable option for production?
  2. The cache will never be able to retrieve operations of old data! When the kind is not sent by the user, all operations related to new data will be returned. I have a probable solution for this: we might have to introduce a boolean which, when true, loads the cache from the old data (old index key) and marks all of its spans as kind UNSPECIFIED.
  3. To maintain consistency we must take the service name from the newly created index, but extracting the service name from serviceName+operationName+kind is the challenge. The solution I have thought of is reserving the last 7 places of the new index for len(serviceName)+len(operationName)+kind. This has the issue that we have to limit the length of serviceName and operationName to 999. This way we can also get rid of the c.services map. Removing this map is optional and a matter of discussion, because it is a trade-off between storage and iteration: removing it leads to extra iterations in GetServices. I also thought of a solution for this:

data = map[string]serviceData
// where serviceData can be defined as
type serviceData struct {
	expiryTime uint64
	operations map[trace.SpanKind]map[string]uint64
}

Once the correct approach is discussed I will handle some more edge cases and make the e2e tests pass (setting GetOperationsMissingSpanKind: false).
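As a rough, self-contained sketch of this alternative cache layout (the names `CacheStore`, `serviceData`, and the integer `SpanKind` stand-in are illustrative assumptions, not the PR's actual types):

```go
package main

import "fmt"

// SpanKind stands in for model.SpanKind / trace.SpanKind in this sketch.
type SpanKind int

// serviceData bundles the service expiry with a per-kind operation map,
// replacing the separate c.services map discussed above.
type serviceData struct {
	expiryTime uint64
	// operations[kind][operationName] = expiry time of that operation
	operations map[SpanKind]map[string]uint64
}

// CacheStore keyed by service name; the per-kind map makes this the
// "3D map" (service -> kind -> operation) under discussion.
type CacheStore struct {
	data map[string]*serviceData
}

func (c *CacheStore) Update(service, operation string, kind SpanKind, expiry uint64) {
	sd, ok := c.data[service]
	if !ok {
		sd = &serviceData{operations: map[SpanKind]map[string]uint64{}}
		c.data[service] = sd
	}
	if expiry > sd.expiryTime {
		sd.expiryTime = expiry
	}
	if sd.operations[kind] == nil {
		sd.operations[kind] = map[string]uint64{}
	}
	sd.operations[kind][operation] = expiry
}

// GetOperations returns operations of one kind, or of all kinds when kind < 0.
func (c *CacheStore) GetOperations(service string, kind SpanKind) []string {
	sd, ok := c.data[service]
	if !ok {
		return nil
	}
	var out []string
	for k, ops := range sd.operations {
		if kind >= 0 && k != kind {
			continue
		}
		for op := range ops {
			out = append(out, op)
		}
	}
	return out
}

func main() {
	c := &CacheStore{data: map[string]*serviceData{}}
	c.Update("svc", "op1", 2, 100) // kind 2, e.g. server
	c.Update("svc", "op2", 3, 100) // kind 3, e.g. client
	fmt.Println(len(c.GetOperations("svc", -1))) // all kinds
	fmt.Println(len(c.GetOperations("svc", 2)))  // one kind only
}
```

The sketch keeps one expiry per service and one per operation, so GetServices can stay a single map scan while kind filtering happens only in GetOperations.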


codecov bot commented Dec 17, 2024

Codecov Report

Attention: Patch coverage is 92.05021% with 19 lines in your changes missing coverage. Please review.

Project coverage is 96.15%. Comparing base (b04e0ba) to head (4747b0a).

Files with missing lines Patch % Lines
plugin/storage/badger/spanstore/writer.go 90.00% 5 Missing and 2 partials ⚠️
plugin/storage/badger/spanstore/reader.go 91.04% 4 Missing and 2 partials ⚠️
plugin/storage/badger/spanstore/kind.go 80.00% 4 Missing and 1 partial ⚠️
plugin/storage/badger/spanstore/cache.go 98.52% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6376      +/-   ##
==========================================
- Coverage   96.24%   96.15%   -0.10%     
==========================================
  Files         373      374       +1     
  Lines       21406    21588     +182     
==========================================
+ Hits        20602    20757     +155     
- Misses        612      632      +20     
- Partials      192      199       +7     
Flag Coverage Δ
badger_v1 11.12% <43.51%> (+0.48%) ⬆️
badger_v2 2.73% <0.00%> (-0.05%) ⬇️
cassandra-4.x-v1-manual 16.31% <0.00%> (-0.28%) ⬇️
cassandra-4.x-v2-auto 2.66% <0.00%> (-0.05%) ⬇️
cassandra-4.x-v2-manual 2.66% <0.00%> (-0.05%) ⬇️
cassandra-5.x-v1-manual 16.31% <0.00%> (-0.28%) ⬇️
cassandra-5.x-v2-auto 2.66% <0.00%> (-0.05%) ⬇️
cassandra-5.x-v2-manual 2.66% <0.00%> (-0.05%) ⬇️
elasticsearch-6.x-v1 20.02% <0.00%> (-0.35%) ⬇️
elasticsearch-7.x-v1 20.09% <0.00%> (-0.36%) ⬇️
elasticsearch-8.x-v1 20.25% <0.00%> (-0.35%) ⬇️
elasticsearch-8.x-v2 2.73% <0.00%> (-0.15%) ⬇️
grpc_v1 11.95% <0.00%> (-0.21%) ⬇️
grpc_v2 8.86% <0.00%> (-0.17%) ⬇️
kafka-3.x-v1 10.15% <0.00%> (-0.18%) ⬇️
kafka-3.x-v2 2.73% <0.00%> (-0.05%) ⬇️
memory_v2 2.73% <0.00%> (-0.04%) ⬇️
opensearch-1.x-v1 20.14% <0.00%> (-0.35%) ⬇️
opensearch-2.x-v1 20.14% <0.00%> (-0.36%) ⬇️
opensearch-2.x-v2 2.72% <0.00%> (-0.06%) ⬇️
tailsampling-processor 0.50% <0.00%> (-0.01%) ⬇️
unittests 95.01% <92.05%> (-0.09%) ⬇️


@Manik2708
Contributor Author

@yurishkuro Please review the approach and problems!

@Manik2708
Contributor Author

@yurishkuro I have added more changes which reduce the iterations in prefill to 1, but they limit the serviceName to a length of 999. Please review!

@Manik2708
Contributor Author

Manik2708 commented Dec 19, 2024

I have an idea for handling old data without using a migration script! We can store the old data in two other data structures in the cache (without kind). The only question that arises then: what should we return when no span kind is given by the user? Operations of new data of all kinds, operations of old data (kind marked as unspecified), or a union of both?

@yurishkuro yurishkuro added the changelog:new-feature Change that should be called out as new feature in CHANGELOG label Dec 20, 2024
model/span.go: 3 outdated review comments (resolved)
@yurishkuro
Member

What to return when no span kind is given by user?

then we should return all operations regardless of the span kind

@Manik2708
Contributor Author

What to return when no span kind is given by user?

then we should return all operations regardless of the span kind

That means including all spans of old data as well (whose kind is not in the cache)?

@Manik2708 Manik2708 marked this pull request as draft December 22, 2024 14:04
@Manik2708 Manik2708 marked this pull request as ready for review December 22, 2024 19:16
@dosubot dosubot bot added the area/storage label Dec 22, 2024
@Manik2708
Contributor Author

My current approach is leading to errors in the unit tests of factory_test.go. Badger throws this error repeatedly:

runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1700, retrying
badger 2024/12/23 01:12:11 ERROR: error flushing memtable to disk: error while creating table err: while creating table: /tmp/badger116881967/000002.sst error: open /tmp/badger116881967/000002.sst: no such file or directory
unable to open: /tmp/badger116881967/000002.sst
github.com/dgraph-io/ristretto/v2/z.OpenMmapFile

This is probably because f.Close is called before the completion of prefill. That implies creation of the new index for old data is slow. Hence I think we have only one way, if we want to skip even auto-migration, and that is using this function:

func getSpanKind(txn *badger.Txn, service string, timestampAndTraceId string) model.SpanKind {
	// Probe the tag index for each possible span kind (0..5); the first hit wins.
	for i := 0; i < 6; i++ {
		value := service + model.SpanKindKey + model.SpanKind(i).String()
		valueBytes := []byte(value)
		operationKey := make([]byte, 1+len(valueBytes)+8+sizeOfTraceID)
		operationKey[0] = tagIndexKey
		copy(operationKey[1:], valueBytes)
		copy(operationKey[1+len(valueBytes):], timestampAndTraceId)
		_, err := txn.Get(operationKey)
		if err == nil {
			return model.SpanKind(i)
		}
	}
	// No tag-index entry found for any kind.
	return model.SpanKindUnspecified
}

The only problem is that during prefilling, 6*NumberOfOperations Get queries will be issued. Please review this approach @yurishkuro. I think we need to discuss auto-creation of the new index, or whether we should skip creating any new index and use the function given above.

@Manik2708 Manik2708 requested a review from yurishkuro December 23, 2024 19:28
@Manik2708 Manik2708 marked this pull request as draft December 26, 2024 02:07
@Manik2708 Manik2708 marked this pull request as ready for review December 26, 2024 05:22
@Manik2708
Contributor Author

@yurishkuro I finally got rid of migration and now I think it's ready for review! Please ignore my previous comments; the current commit has no linkage to them.

Signed-off-by: Manik2708 <[email protected]>
@Manik2708
Contributor Author

can you revisit the tests by using API methods of the cache instead of manually manipulating its internal data structures? Tests should be validating expected behavior that the user of the cache expects. The only time it's acceptable to go into internal details is when some error conditions cannot be tested otherwise purely from external API.

I have fixed all the tests except those for Update and Prefill. Even those are not manipulating the data structures; they are only used to check whether the cache stores entries via update or prefill.

@Manik2708 Manik2708 requested a review from yurishkuro December 30, 2024 08:19
@Manik2708
Contributor Author

@yurishkuro Can you please review?

@Manik2708 Manik2708 marked this pull request as draft January 2, 2025 09:31
Signed-off-by: Manik2708 <[email protected]>
@Manik2708 Manik2708 marked this pull request as ready for review January 2, 2025 16:09
@Manik2708 Manik2708 requested a review from yurishkuro January 2, 2025 16:12
@yurishkuro
Member

Q: do we have to maintain two indices forever, or is this only a side-effect of having to be backwards compatible with the existing data?

For example, one way I could see this working is:

  • we only write the new index with kind
  • when reading, we do a dual lookup, first in the new index then in the old (if the old exists)
  • we have a config option to turn off the dual-reading behavior. The motivation here is that people rarely keep tracing data for very long, so in 4 months (4 releases) the old index is likely going to be TTLed out anyway.
    • In the first release of the feature this option could be defaulted to ON
    • Then a couple releases down the road we can default it to OFF
    • Then 2 more releases down the road we deprecate the option and remove the old index reading code.
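A minimal sketch of this dual-lookup read path (the `findTraceIDs` helper, the `lookupFn` type, and the `dualLookup` gate are hypothetical placeholders, not the PR's actual API):

```go
package main

import "fmt"

// lookupFn stands in for a scan over one index layout.
type lookupFn func(service, operation string) []string

// findTraceIDs writes would only go to the new index; reads consult the
// new kind-aware index first and, while the dualLookup gate is on, merge
// in results from the old index until it is TTLed out.
func findTraceIDs(newIdx, oldIdx lookupFn, dualLookup bool, service, operation string) []string {
	ids := newIdx(service, operation)
	if dualLookup {
		ids = append(ids, oldIdx(service, operation)...)
	}
	return ids
}

func main() {
	newIdx := func(s, o string) []string { return []string{"t-new"} }
	oldIdx := func(s, o string) []string { return []string{"t-old"} }
	fmt.Println(findTraceIDs(newIdx, oldIdx, true, "svc", "op"))  // [t-new t-old]
	fmt.Println(findTraceIDs(newIdx, oldIdx, false, "svc", "op")) // [t-new]
}
```

Defaulting the gate ON and later flipping it OFF maps directly onto the release roadmap above.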

@Manik2708
Contributor Author

Q: do we have to maintain two indices forever, or is this only a side-effect of having to be backwards compatible with the existing data?

For example, one way I could see this working is:

  • we only write the new index with kind

  • when reading, we do a dual lookup, first in the new index then in the old (if the old exists)

  • we have a config option to turn off the dual-reading behavior. The motivation here is that people rarely keep tracing data for very long, so in 4 months (4 releases) the old index is likely going to be TTLed out anyway.

    • In the first release of the feature this option could be defaulted to ON
    • Then a couple releases down the road we can default it to OFF
    • Then 2 more releases down the road we deprecate the option and remove the old index reading code.

The key serviceName+Kind+OperationName+Time+TraceId can't be used in the reader to find trace IDs, because while finding trace IDs we might not know the kind. We can avoid dual lookups while prefilling by following your suggested roadmap. This key schema was also discussed in the issue and was asked about in comment #1922 (comment). If we want to use this key schema permanently, then we would employ a different key, serviceName+OperationName+kind+Time+TraceId, and while scanning the indexes we have to build this key from the service and operation. So when a TraceQueryParameters has only a service name and operation name, while scanning we have to append 6 keys so as to fetch all trace IDs. Please have a look at this:

serviceName = "service"
operationName = "operation"
// So while scanning we have to create the following 6 keys:
key1 = "serviceoperation0"
key2 = "serviceoperation1"
...

Then finding the trace IDs would also work fine. So either we have to create an extra index or do this extra scanning!
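The 6-key expansion described above could be sketched like this (the `kindPrefixes` helper is hypothetical; real Badger keys would also carry an index-prefix byte, timestamp, and trace ID):

```go
package main

import (
	"fmt"
	"strconv"
)

// kindPrefixes builds one scan prefix per possible span kind (0..5),
// matching the serviceName+operationName+kind layout discussed above.
func kindPrefixes(service, operation string) [][]byte {
	prefixes := make([][]byte, 0, 6)
	for k := 0; k < 6; k++ {
		prefixes = append(prefixes, []byte(service+operation+strconv.Itoa(k)))
	}
	return prefixes
}

func main() {
	// A scan for service "service" and operation "operation" must seek
	// each of these prefixes in turn to cover every kind.
	for _, p := range kindPrefixes("service", "operation") {
		fmt.Println(string(p))
	}
}
```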

@yurishkuro
Member

yurishkuro commented Jan 2, 2025

serviceName+OperationName+kind+Time+TraceId

This index doesn't make sense to me. It cannot effectively support a query that only includes service+operation, you must always know the kind to get to the desired time range.

Wouldn't it make more sense to append the kind after the Time? Then we have the following two queries:

  1. user does not specify kind - we scan everything within the given time range
  2. user does specify kind - we still scan everything within the given time range and discard entries with the wrong kind. As you mentioned earlier, the probability of having different kinds for the same service+operation is quite low, so even if it does happen, in the worst case we'd have to scan 5x more entries (kind can have 5 different values), but that worst case will almost never happen because in most cases it will be exactly 1 value.
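A sketch of this kind-after-time layout, assuming a 1-byte kind and a 16-byte trace ID as elsewhere in the discussion (helper names and the letter values for kinds are illustrative):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Hypothetical layout for the kind-after-time index discussed above:
// <prefix byte><service+operation><8-byte startTime><1 kind byte><16-byte traceID>.
func makeKey(prefix byte, value string, startTime uint64, kind byte, traceID [16]byte) []byte {
	key := make([]byte, 0, 1+len(value)+8+1+16)
	key = append(key, prefix)
	key = append(key, value...)
	var ts [8]byte
	binary.BigEndian.PutUint64(ts[:], startTime)
	key = append(key, ts[:]...)
	key = append(key, kind)
	key = append(key, traceID[:]...)
	return key
}

// kindOf reads the kind byte that sits immediately after the timestamp.
func kindOf(key []byte, valueLen int) byte {
	return key[1+valueLen+8]
}

func main() {
	var tid [16]byte
	keys := [][]byte{
		makeKey(0x80, "svcop", 100, 's', tid), // server span
		makeKey(0x80, "svcop", 200, 'c', tid), // client span
	}
	// Query 2 above: scan everything in the time range, discard wrong kinds.
	for _, k := range keys {
		if kindOf(k, len("svcop")) == 's' {
			ts := binary.BigEndian.Uint64(k[1+len("svcop") : 1+len("svcop")+8])
			fmt.Println("match at time", ts)
		}
	}
}
```

Because the kind byte sits after the timestamp, prefix seeks up to the time range still work, and kind filtering happens during the scan.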

@Manik2708
Contributor Author

serviceName+OperationName+kind+Time+TraceId

This index doesn't make sense to me. It cannot effectively support a query that only includes service+operation, you must always know the kind to get to the desired time range.

Wouldn't it make more sense to append the kind after the Time? Then we have the following two queries:

  1. user does not specify kind - we scan everything within the given time range
  2. user does specify kind - we still scan everything within the given time range and discard entries with the wrong kind. As you mentioned earlier, the probability of having different kinds for the same service+operation is quite low, so even if it does happen, in the worst case we'd have to scan 5x more entries (kind can have 5 different values), but that worst case will almost never happen because in most cases it will be exactly 1 value.

We can try this, but then we need to remember that it will break these conventions:

  1. The last 16 bytes of the key are the trace ID
  2. The 8 bytes before them are the timestamp

Only this key would break these conventions. Also, this key need not be present when tags are there, so we would need to prepare two separate scanning and parsing logics.

@yurishkuro
Member

Why is it "breaking" if kind is introduced after Time but not "breaking" when it's before Time?

Whatever we do the changes must be backwards compatible.

@Manik2708
Contributor Author

Manik2708 commented Jan 2, 2025

Why is it "breaking" if kind introduced after Time but not "breaking" when it's before Time?

Whatever we do the changes must be backwards compatible.

Please see this:

func createIndexKey(indexPrefixKey byte, value []byte, startTime uint64, traceID model.TraceID) []byte {
	// KEY: indexKey<indexValue><startTime><traceId> (traceId is last 16 bytes of the key)
	key := make([]byte, 1+len(value)+8+sizeOfTraceID)
	key[0] = (indexPrefixKey & indexKeyRange) | spanKeyPrefix
	pos := len(value) + 1
	copy(key[1:pos], value)
	binary.BigEndian.PutUint64(key[pos:], startTime)
	pos += 8 // sizeOfTraceID / 2
	binary.BigEndian.PutUint64(key[pos:], traceID.High)
	pos += 8 // sizeOfTraceID / 2
	binary.BigEndian.PutUint64(key[pos:], traceID.Low)
	return key
}

This is how we are creating a key: when service+operation+kind is indexed, it is passed as the value here, and appending the kind after the time would break this layout.

@yurishkuro
Member

why does it matter? We're creating an index with a different layout, we don't have to be restricted by how that specific function is implemented, especially since we are introducing a different look up process (it seems all other indices are doing direct lookup by the prefix up to the timestamp and then scan / parse).

@Manik2708
Contributor Author

why does it matter? We're creating an index with a different layout, we don't have to be restricted by how that specific function is implemented, especially since we are introducing a different look up process (it seems all other indices are doing direct lookup by the prefix up to the timestamp and then scan / parse).

Ok, will give it a try and get back to you! Thanks for your time!

@Manik2708
Contributor Author

@yurishkuro I have tried to take care of all the edge cases, please review!

@Manik2708
Contributor Author

@yurishkuro This PR is ready to review, I have added dual lookups and backward compatibility tests in this PR.

@@ -42,6 +42,9 @@ type Config struct {
// ReadOnly opens the data store in read-only mode. Multiple instances can open the same
// store in read-only mode. Values still in the write-ahead-log must be replayed before opening.
ReadOnly bool `mapstructure:"read_only"`
// DualLookUp enables the look-up in old indexes of jaeger which are not deprecated.
// By default it is enabled and it is suggested not to disable it
DualLookUp bool `mapstructure:"dual_look_up"`
Member

let's not expose this in the config. We should use a feature gate instead - see #6568

@@ -72,6 +75,7 @@ func DefaultConfig() *Config {
},
MaintenanceInterval: defaultMaintenanceInterval,
MetricsUpdateInterval: defaultMetricsUpdateInterval,
DualLookUp: true,
Member

read default value from feature gate

}
err := writer.writeSpanWithOldIndex(&oldSpan)
require.NoError(t, err)
traces, err := reader.FindTraces(context.Background(), &spanstore.TraceQueryParameters{
Member

not sure I follow this test. What does FindTraces have to do with span kind in the operations retrieval? Also, backwards compatibility test only makes sense when it is executed against old and new code.

Contributor Author

We have changed the key, but we need to make sure that traces are also fetched from the old key when dual lookup is turned on. Please note that the operation key is used for fetching traces as well as for filling the cache. If you look at this code, we first write a span with the old key and then test whether we are able to fetch the traces associated with that key (please see L42).

}
*/
// The uint64 value is the expiry time of operation
operations map[string]map[model.SpanKind]map[string]uint64
Member

to clarify, CacheStore is used to avoid expensive scans when loading services and operations, correct? In other words, it's all in-memory structure. In this case, why can we not change just the value of the map to be a combo {kind, expiration} instead of changing the structure? When loading, scanning everything for a give service is still going to be negligible amount of data.

Contributor Author

Can't understand this! Are you saying to keep these structures?

services map[string]uint64 // Already in the cache
operations map[string]map[string]kind
type kind struct {
	kind   SpanKind
	expiry uint64
}

If yes, then how do we handle a query to fetch all operations for a service and kind? Should we iterate over all operations and skip those which are not of the required kind? (We are using a similar approach currently, i.e. iterating over all kinds and skipping unrequired kinds, but that was justified because there can be at most 6 kinds, while the number of operations is unbounded. So is this option viable?)

Member

Yes, this structure.

Contributor Author

So iterating over all operations and skipping the kinds that are not required would be right?

Member

Yes
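A minimal sketch of this flatter structure, assuming illustrative names (`kindEntry`, `getOperations`) rather than the PR's actual code:

```go
package main

import "fmt"

type SpanKind string

// kindEntry is the {kind, expiration} map value discussed above.
type kindEntry struct {
	kind   SpanKind
	expiry uint64
}

// operations[service][operationName] = kindEntry; the map shape stays
// two-level, only the value type changes.
type cache struct {
	operations map[string]map[string]kindEntry
}

// getOperations iterates all operations of a service and skips those
// whose kind does not match; an empty kind matches everything.
func (c *cache) getOperations(service string, kind SpanKind) []string {
	var out []string
	for op, e := range c.operations[service] {
		if kind != "" && e.kind != kind {
			continue
		}
		out = append(out, op)
	}
	return out
}

func main() {
	c := &cache{operations: map[string]map[string]kindEntry{
		"svc": {
			"op-a": {kind: "server", expiry: 1},
			"op-b": {kind: "client", expiry: 1},
		},
	}}
	fmt.Println(len(c.getOperations("svc", "")))       // all kinds
	fmt.Println(len(c.getOperations("svc", "server"))) // one kind
}
```

The scan is linear in the number of operations per service, which, as noted above, is a negligible amount of in-memory data.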

@@ -77,6 +93,7 @@ func (c *CacheStore) loadServices() {
func (c *CacheStore) loadOperations(service string) {
c.store.View(func(txn *badger.Txn) error {
Member

just a side note, but I find it very bizarre that Cache struct has the db loading logic, such logic should only be in the reader, and either Cache delegates to the reader or the reader delegates to the cache. Maybe a separate PR to refactor that first?

Contributor Author

Sure!

Contributor Author

But I have a question about this: currently the Reader requires the cache to serve operations and services, and if we want the cache to read from the Reader, wouldn't that create a circular dependency?

Member

The end user is only exposed to the Reader API, using cache internally is reader's business, it can orchestrate interaction with the cache as it wants.

Contributor Author

No, I mean to say that badger is the first object to get instantiated. The cache depends on the store, which is instantiated next, and then the Reader and Writer are instantiated, both of which depend on the store and the cache. So when the cache is as elemental as the badger store, how can it depend on the reader? And shouldn't the cache contact the db directly instead of depending on some other service?

Member

Have you heard of "clean architecture"? I'm not a big fan of it overall but the fundamental principles are sound. So how would you describe the responsibilities of each component here? What is the onion structure? Each component should perform well encapsulated function.

Contributor Author

Sorry for the confusion, I was wondering this without looking at the code! Thanks for your time and reply!

type badgerSpanKind int

const (
badgerSpanKindUnspecified badgerSpanKind = iota
Member

if you are using these as database values, please don't use implicit assignment via iota; give concrete values that will never change. An iota-based enum can change if the declaration order changes.

entriesToStore,
w.createBadgerEntry(
createOperationWithKindIndexKey(
[]byte(span.Process.ServiceName+span.OperationName),
Member

please pass these separately. Is there any other way to call createOperationWithKindIndexKey?

@@ -128,6 +142,22 @@ func createIndexKey(indexPrefixKey byte, value []byte, startTime uint64, traceID
return key
}

func createOperationWithKindIndexKey(value []byte, startTime uint64, traceID model.TraceID, kind model.SpanKind) []byte {
// KEY: indexKey<indexValue><startTime><spanKind><traceId> (traceId is last 16 bytes of the key)
Member

spell out indexValue

copy(key[1:pos], value)
binary.BigEndian.PutUint64(key[pos:], startTime)
pos += 8
key[pos] = getBadgerSpanKind(kind).String()[0]
Member

It looks like you're reading the first character of the string - this is a bad pattern. Please define the enum as alias to byte type and explicitly assign letters.
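A sketch of what this review comment asks for, with illustrative (not official) letter values for each kind:

```go
package main

import "fmt"

// badgerSpanKind as a byte alias with explicit, stable values, instead of
// iota and reading the first character of a String() result. The letters
// chosen here are assumptions for illustration only.
type badgerSpanKind byte

const (
	badgerSpanKindUnspecified badgerSpanKind = 'u'
	badgerSpanKindClient      badgerSpanKind = 'c'
	badgerSpanKindServer      badgerSpanKind = 's'
	badgerSpanKindProducer    badgerSpanKind = 'p'
	badgerSpanKindConsumer    badgerSpanKind = 'r'
	badgerSpanKindInternal    badgerSpanKind = 'i'
)

func main() {
	// The kind byte can now be written into the index key directly,
	// with no dependency on String() formatting.
	key := []byte{0x80}
	key = append(key, byte(badgerSpanKindServer))
	fmt.Printf("%c\n", key[1]) // s
}
```

Because each constant is pinned to a concrete byte, reordering the declarations can never silently change what is stored on disk.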

@@ -179,3 +209,51 @@ func createTraceKV(span *model.Span, encodingType byte, startTime uint64) ([]byt

return key, bb, err
}

// This method exists only for testing purposes, to verify backward compatibility; once dual-lookup is removed, it can be removed
func (w *SpanWriter) writeSpanWithOldIndex(span *model.Span) error {
Member

is this supposed to be the old way of writing? Why create a new method then instead of having a new method only for NEW way. That way the diff will be smaller and more readable.

Labels
  • area/storage
  • changelog:new-feature (Change that should be called out as new feature in CHANGELOG)
  • enhancement
  • storage/badger (Issues related to badger storage)
Development

Successfully merging this pull request may close these issues.

Badger storage plugin: query service to support spanKind when retrieve operations for a given service.