PHE1 MDV performance #199

Closed
SamStudio8 opened this issue Mar 6, 2022 · 2 comments

@SamStudio8
Member

Just like everything else in our systems, the v3 API endpoint for sharing linkage with PHE/UKHSA has steadily grown slower.

The v3 API was written to leverage the Django REST Framework and deployed an ingenious method to dynamically select fields for serialisation, which was hopefully going to herald a new era of data management. Unfortunately, as previously lamented, the way Majora links artifacts and processes together leads to poor performance when serialising large numbers of objects.

It would be nice if it wasn't slow.

@SamStudio8 SamStudio8 self-assigned this Mar 6, 2022
@SamStudio8
Member Author

Commit CLIMB-COVID/majora2@61255f2 hard-codes a PHE1-FAST MDV. The definition is strictly followed, and a modest test suite covers the basics of the function that replaces the magic DRF dynamic view.

Merely counting the size of the two query sets proves promising as they are at least the same size!

>>> mdv = models.MajoraDataview.objects.get(code_name="PHE1")
>>> queryset = apps.get_model("majora2", mdv.entry_point).objects.all()
>>> queryset.filter(mdv.get_filters()).count()
2114223
>>> len(mdv_tasks.subtask_get_mdv_v3_phe1_faster())
2114223
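
Matching counts are encouraging but do not by themselves show that both paths return the same records. A slightly stronger check compares identifiers rather than sizes. The following is only a sketch: `compare_results` and the example IDs are hypothetical and not part of Majora, and it assumes each serialised record exposes some unique identifier that can be pulled into a list.

```python
from collections import Counter


def compare_results(reference_ids, candidate_ids):
    """Compare two collections of record identifiers.

    Counter is used rather than set so that accidental duplicates
    in either collection are reported instead of silently hidden.
    """
    ref = Counter(reference_ids)
    cand = Counter(candidate_ids)
    return {
        "same_count": sum(ref.values()) == sum(cand.values()),
        # Identifiers the reference has that the candidate lacks.
        "missing": sorted((ref - cand).elements()),
        # Identifiers the candidate has that the reference lacks.
        "unexpected": sorted((cand - ref).elements()),
    }


# Hypothetical identifiers: the counts match, but the contents do not.
report = compare_results(["a", "b", "c"], ["a", "b", "d"])
print(report)
```

In this toy case the counts agree even though one record differs, which is exactly the situation a bare count comparison cannot detect.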

I'll liaise with FS to test this out this week. The performance delta is going to blow their socks off.

This isn't so troublesome for the DA1 view as it has not grown at the same rate (and linked samples are additionally removed from DA1, so it does not grow constantly). However, this approach would work to speed up the DA1 view in future if required.

@SamStudio8
Member Author

PHE1-FAST appears to complete in around 60 seconds as opposed to 1h+. I will never design something that falls into the trap of that pesky N+1 query problem again.
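
For anyone unfamiliar with the N+1 pattern, here is a minimal sketch using the standard library's `sqlite3`. It is not Majora's schema or the PHE1-FAST implementation: the `artifact` and `process` tables and every function name below are assumptions made up for illustration. The first serialiser issues one query for the parent rows plus one more per row; the second fetches the same data with a single join.

```python
import sqlite3


def build_db(n_artifacts=50):
    """Create an in-memory database with a hypothetical parent/child schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE artifact (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE process (
            id INTEGER PRIMARY KEY,
            artifact_id INTEGER REFERENCES artifact(id),
            kind TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO artifact (id, name) VALUES (?, ?)",
        [(i, f"sample-{i}") for i in range(n_artifacts)],
    )
    conn.executemany(
        "INSERT INTO process (artifact_id, kind) VALUES (?, ?)",
        [(i, "sequencing") for i in range(n_artifacts)],
    )
    return conn


def serialise_n_plus_one(conn):
    """One query for the artifacts, then one extra query per artifact."""
    queries = 1
    rows = conn.execute("SELECT id, name FROM artifact ORDER BY id").fetchall()
    out = []
    for artifact_id, name in rows:
        cursor = conn.execute(
            "SELECT kind FROM process WHERE artifact_id = ? ORDER BY id",
            (artifact_id,),
        )
        queries += 1
        kinds = [kind for (kind,) in cursor]
        out.append({"id": artifact_id, "name": name, "processes": kinds})
    return out, queries


def serialise_joined(conn):
    """Fetch artifacts and their processes together in a single query."""
    rows = conn.execute(
        """
        SELECT a.id, a.name, p.kind
        FROM artifact AS a
        LEFT JOIN process AS p ON p.artifact_id = a.id
        ORDER BY a.id, p.id
        """
    ).fetchall()
    grouped = {}
    for artifact_id, name, kind in rows:
        record = grouped.setdefault(
            artifact_id, {"id": artifact_id, "name": name, "processes": []}
        )
        if kind is not None:  # LEFT JOIN yields NULL for artifacts with no process
            record["processes"].append(kind)
    return list(grouped.values()), 1


conn = build_db(50)
slow, slow_queries = serialise_n_plus_one(conn)
fast, fast_queries = serialise_joined(conn)
print(slow == fast, slow_queries, fast_queries)
```

Both functions produce identical output, but the query count of the first grows linearly with the number of parent rows. In Django, `select_related` and `prefetch_related` are the usual tools for collapsing such per-object lookups, though whether they fit a given view depends on how the relations are modelled.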
