🐛 correct tracking of producer views #3854
base: master
Quick links (staging server):
Login:
chart-diff: ✅ No charts for review.
data-diff: ✅ No differences found
Legend: +New ~Modified -Removed =Identical
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet
Automatically updated datasets matching weekly_wildfires|excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included
Edited: 2025-01-16 17:21:06 UTC
Update: the bug might be coming from
Example:
df = VersionTracker(exclude_steps=[]).steps_df
df.loc[df.step.str.contains("grapher/gcp/2024-11-21/global_carbon_budget"), "all_chart_slugs"].item()
On my computer, using regular queries to the DB improves performance significantly. Old code (with VersionTracker): took ~8 seconds.
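To illustrate what "regular queries to the DB" could look like, here is a minimal sketch using an in-memory SQLite database. The table names (chart_producers, chart_views), the chart slugs, and the view counts are all hypothetical and made up for illustration; the real schema and queries are not shown in this thread. The sketch also assumes, without confirmation from the PR, that overcounting could arise from a chart being linked to the same producer more than once, so it deduplicates those links before summing.

```python
import sqlite3
import time

# Hypothetical, simplified schema with made-up numbers (not the project's real tables).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE chart_producers (chart_slug TEXT, producer TEXT);
    CREATE TABLE chart_views (chart_slug TEXT, views INTEGER);
    INSERT INTO chart_producers VALUES
        ('co2-emissions', 'Global Carbon Project'),
        ('co2-emissions', 'Global Carbon Project'),  -- duplicate link (assumed failure mode)
        ('co2-per-capita', 'Global Carbon Project');
    INSERT INTO chart_views VALUES ('co2-emissions', 100), ('co2-per-capita', 50);
    """
)


def views_by_producer_naive(conn):
    """Sum views over every (chart, producer) row, counting duplicate links twice."""
    query = """
        SELECT p.producer, SUM(v.views)
        FROM chart_producers AS p
        JOIN chart_views AS v ON v.chart_slug = p.chart_slug
        GROUP BY p.producer
    """
    return dict(conn.execute(query).fetchall())


def views_by_producer(conn):
    """Sum views per producer, counting each chart at most once per producer."""
    query = """
        SELECT p.producer, SUM(v.views)
        FROM (SELECT DISTINCT chart_slug, producer FROM chart_producers) AS p
        JOIN chart_views AS v ON v.chart_slug = p.chart_slug
        GROUP BY p.producer
    """
    return dict(conn.execute(query).fetchall())


# Time the query the same way one might compare old and new code paths.
start = time.perf_counter()
deduplicated = views_by_producer(conn)
elapsed = time.perf_counter() - start

print("naive:", views_by_producer_naive(conn))
print("deduplicated:", deduplicated, f"({elapsed:.4f} s)")
```

With this toy data the naive query reports 250 views and the deduplicated one reports 150, which shows how a duplicated link alone can inflate a producer's total.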
Fixes #3855
Producer analytics does not accurately count the views of data from a given producer. It might be overestimating them.
This PR replaces the use of VersionTracker with regular queries to our database. There are two benefits: