Staleness batch #6355
Conversation
service/labelstore/service_test.go
Outdated
Can we write some benchmarks and share the results?
Sure, added a benchmark. TrackStaleness with concurrent calls averages around 2ms per 100k entries handled.
Backporting the benchmark to main gives 14ms per 100k entries, so roughly 7x better.
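For readers without the diff handy, here is a minimal sketch of what a concurrent benchmark along these lines could look like. The package, type names, and batching method below are stand-ins, not the actual code added in this PR:

```go
package sketch

import (
	"sync"
	"testing"
)

// Stand-in types: the real StalenessTracker and labelstore service live in
// service/labelstore and may differ in detail.
type stalenessTracker struct {
	globalRefID uint64
	value       float64
}

type fakeStore struct {
	mut          sync.Mutex
	staleGlobals map[uint64]struct{}
}

// trackStaleness records a whole batch of entries under a single lock,
// mirroring the batching behaviour this PR introduces.
func (s *fakeStore) trackStaleness(ids []stalenessTracker) {
	s.mut.Lock()
	defer s.mut.Unlock()
	for _, id := range ids {
		s.staleGlobals[id.globalRefID] = struct{}{}
	}
}

// BenchmarkTrackStaleness runs concurrent callers over 100k entries, similar
// in spirit to the numbers quoted above.
func BenchmarkTrackStaleness(b *testing.B) {
	s := &fakeStore{staleGlobals: make(map[uint64]struct{}, 100_000)}
	trackers := make([]stalenessTracker, 100_000)
	for i := range trackers {
		trackers[i] = stalenessTracker{globalRefID: uint64(i), value: float64(i)}
	}
	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			s.trackStaleness(trackers)
		}
	})
}
```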
Neat idea to batch it! Would really like to see some benchmarks so we can know for sure how much better this is :)
	// Tested this to ensure it had no cpu impact, since it is called so often.
	a.ls.RemoveStaleMarker(uint64(ref))
}
a.stalenessTrackers = append(a.stalenessTrackers, labelstore.StalenessTracker{
I may be missing some context here, but don't we need similar staleness tracking for AppendExemplar and AppendHistogram?
We do, but it never had it, so I'm keeping this PR small. I will have 2-3 more PRs incoming.
	// Tested this to ensure it had no cpu impact, since it is called so often.
	a.ls.RemoveStaleMarker(uint64(ref))
}
a.stalenessTrackers = append(a.stalenessTrackers, labelstore.StalenessTracker{
We create a labelstore.StalenessTracker here, but later in func (s *service) TrackStaleness(ids []StalenessTracker) we convert these to &staleMarker{} and calculate the labels hash. Could we instead create the &staleMarker{} right away here and calculate the hash here? That way there would be less work and fewer structs to create, and I think the code would get simpler too.
I feel those are owned by two different things. Mainly, the last-marked-stale state should really only be set by the labelstore itself, and exposing that field feels off. This gets cleaned up slightly in the next PR.
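For context, here is a rough sketch of the two structs and the conversion being discussed. The field names, the hash calculation, and the delete-on-non-stale branch are assumptions pieced together from the snippets in this thread, not the exact Alloy code:

```go
package sketch

import (
	"sync"
	"time"

	"github.com/prometheus/prometheus/model/labels"
	"github.com/prometheus/prometheus/model/value"
)

// StalenessTracker is the batch entry handed to the labelstore (assumed shape).
type StalenessTracker struct {
	GlobalRefID uint64
	Value       float64
	Labels      labels.Labels
}

// staleMarker is internal to the labelstore; callers never set lastMarkedStale.
type staleMarker struct {
	globalID        uint64
	lastMarkedStale time.Time
	labelHash       uint64
}

type service struct {
	mut          sync.Mutex
	staleGlobals map[uint64]*staleMarker
}

// TrackStaleness converts each tracker into a staleMarker and computes the
// label hash, keeping ownership of lastMarkedStale inside the labelstore.
func (s *service) TrackStaleness(ids []StalenessTracker) {
	s.mut.Lock()
	defer s.mut.Unlock()
	for _, id := range ids {
		if value.IsStaleNaN(id.Value) {
			s.staleGlobals[id.GlobalRefID] = &staleMarker{
				globalID:        id.GlobalRefID,
				lastMarkedStale: time.Now(),
				labelHash:       id.Labels.Hash(),
			}
		} else {
			delete(s.staleGlobals, id.GlobalRefID)
		}
	}
}
```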
service/labelstore/service.go
Outdated
globalID: globalRefID,
for _, id := range ids {
	if value.IsStaleNaN(id.Value) {
		s.staleGlobals[id.GlobalRefID] = &staleMarker{
Here we have a map of pointers, but in the fanout we use a slice of structs, stalenessTrackers []labelstore.StalenessTracker. We also use a slice of structs for labels. So it's a bit inconsistent and I'm not sure what the thinking behind it is. Do we have benchmarks for what performs better? Did this come up in allocation profiles somewhere?
You don't have to traverse the map to check whether you need to add or update the value, like you would with an array. Now, it's unlikely you would need to add staleness markers multiple times, but I would rather not assume that.
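As a small generic illustration of that point (not the PR's code): with a map, add-or-update is a single assignment, while a slice needs a linear scan first.

```go
package sketch

type staleMarker struct{ globalID uint64 }

// With a map, add-or-update is one indexed assignment.
func upsertMap(markers map[uint64]*staleMarker, id uint64) {
	markers[id] = &staleMarker{globalID: id}
}

// With a slice, the same upsert needs a linear scan to avoid duplicates.
func upsertSlice(markers []staleMarker, id uint64) []staleMarker {
	for i := range markers {
		if markers[i].globalID == id {
			markers[i] = staleMarker{globalID: id}
			return markers
		}
	}
	return append(markers, staleMarker{globalID: id})
}
```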
service/labelstore/service.go
Outdated
s.mut.Lock()
defer s.mut.Unlock()
Just a thought: could we use a different mutex for s.staleGlobals and thus avoid contention with other methods like GetLocalRefID, GetGlobalRefID, GetOrAddGlobalRefID, etc.? Seems possible at a glance.
I fiddled with that a bit, but it felt more error-prone while developing. I would prefer to split that out into another PR if we wanted to go that route.
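A rough sketch of the variant being suggested, with the caveat above that it is easier to get wrong; all names here are assumptions rather than the real service fields:

```go
package sketch

import "sync"

type staleMarker struct{ globalID uint64 }

type service struct {
	mut      sync.Mutex // guards the ref-ID mappings used by GetGlobalRefID and friends
	mappings map[string]uint64

	staleMut     sync.Mutex              // guards only staleGlobals, so TrackStaleness
	staleGlobals map[uint64]*staleMarker // does not contend with ID lookups
}

// TrackStaleness only touches staleGlobals, so it takes the narrower lock.
func (s *service) TrackStaleness(ids []uint64) {
	s.staleMut.Lock()
	defer s.staleMut.Unlock()
	for _, id := range ids {
		s.staleGlobals[id] = &staleMarker{globalID: id}
	}
}

// ID lookups keep using the original mutex and never block on staleness updates.
func (s *service) getGlobalRefID(name string) uint64 {
	s.mut.Lock()
	defer s.mut.Unlock()
	return s.mappings[name]
}
```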
* Move the staleness tracking to commit and rollback in a batch.
* add more specific comment
* fix linting
* PR feedback
PR Description
This moves the staleness tracker to use a batch operation. Hooking it into both Commit and Rollback on the fanout should ensure that tracking is always called.
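At a high level the batching pattern looks roughly like the sketch below; the appender and labelstore shapes are simplified assumptions for illustration, not the exact code in this PR:

```go
package sketch

// StalenessTracker is the assumed batch entry handed to the labelstore.
type StalenessTracker struct {
	GlobalRefID uint64
	Value       float64
}

// labelStore is a narrowed-down view of the real labelstore service.
type labelStore interface {
	TrackStaleness(trackers []StalenessTracker)
}

// appender accumulates trackers instead of calling the labelstore per sample.
type appender struct {
	ls                labelStore
	stalenessTrackers []StalenessTracker
}

func (a *appender) Append(ref uint64, val float64) {
	a.stalenessTrackers = append(a.stalenessTrackers,
		StalenessTracker{GlobalRefID: ref, Value: val})
}

// Both Commit and Rollback flush the batch, so staleness tracking runs exactly
// once no matter how the write finishes.
func (a *appender) Commit() error {
	a.flushStaleness()
	return nil
}

func (a *appender) Rollback() error {
	a.flushStaleness()
	return nil
}

func (a *appender) flushStaleness() {
	a.ls.TrackStaleness(a.stalenessTrackers)
	a.stalenessTrackers = a.stalenessTrackers[:0]
}
```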
Notes to the Reviewer
PR Checklist