identify and prefetch N+1 queries in search/all for learner pathways #4488

hamza-56 · 2024-11-12T15:04:33Z

PROD-3888

TL;DR

This PR reduces the number of queries issued by the /api/v1/search/all/?include_learner_pathways=true endpoint by applying prefetch_related to LearnerPathway data.

Fixes N+1 issues in LearnerPathwayViewSet
Fixes N+1 issues in LearnerPathwayStepViewSet
Fixes N+1 issues in LearnerPathwayCourseViewSet
Fixes N+1 issues in LearnerPathwayProgramViewSet
Removes unnecessary eager loading of learnerpathwayblock_set

Details

The get_linked_courses_and_course_runs model method is only used in LearnerPathwayProgramSerializer.
The LearnerPathwayProgramSerializer is used by the LearnerPathwayProgramViewSet and the LearnerPathwayStepSerializer, which is then used in the LearnerPathwaySearchDocumentSerializer.
Even if we apply prefetch_related in the LearnerPathwayDocument's get_queryset, the filtering in get_linked_courses_and_course_runs bypasses the prefetched results and runs additional queries.

To solve this issue, I have moved the filtering logic to the get_queryset methods of LearnerPathwayProgramViewSet and LearnerPathwayDocument.

Since we are already overriding the LearnerPathwayProgramViewSet's get_queryset, I have also fixed the existing N+1 issue in the viewset.

Similarly for courses, LearnerPathwayCourseMinimalSerializer's get_course_runs method was causing additional queries because of its filters.
LearnerPathwayCourseMinimalSerializer is used by LearnerPathwayCourseSerializer, which is used by LearnerPathwayCourseViewSet and LearnerPathwayStepSerializer (which is in turn used in the LearnerPathwaySearchDocumentSerializer).

To solve this issue, I have moved the filtering logic to the get_queryset methods of LearnerPathwayCourseViewSet and LearnerPathwayDocument.

Since we are already overriding LearnerPathwayCourseViewSet's get_queryset, I have also fixed the existing N+1 issue in the viewset.
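
The effect of moving the filter into the prefetch can be sketched without Django. The following is a minimal illustration, not the PR's code: it uses the standard-library sqlite3 module with hypothetical course and course_run tables, sample rows, and helper names (run, n_plus_one, prefetched) to compare one filtered query per course against a single batched query that already carries the filter.

```python
import sqlite3

# Hypothetical schema and rows, only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (id INTEGER PRIMARY KEY, key TEXT);
    CREATE TABLE course_run (
        id INTEGER PRIMARY KEY, course_id INTEGER, key TEXT, status TEXT
    );
    INSERT INTO course VALUES (1, 'edX+A'), (2, 'edX+B'), (3, 'edX+C');
    INSERT INTO course_run VALUES
        (1, 1, 'run-a1', 'published'), (2, 1, 'run-a2', 'unpublished'),
        (3, 2, 'run-b1', 'published'), (4, 3, 'run-c1', 'unpublished');
""")

executed = []  # every SELECT issued through run() is recorded here


def run(sql, params=()):
    """Execute one SELECT, record it, and return all rows."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def n_plus_one():
    """One query for the courses, then one filtered query per course."""
    result = {}
    for course_id, key in run("SELECT id, key FROM course ORDER BY id"):
        rows = run(
            "SELECT key FROM course_run "
            "WHERE course_id = ? AND status = 'published' ORDER BY id",
            (course_id,),
        )
        result[key] = [row[0] for row in rows]
    return result


def prefetched():
    """Two queries in total: the status filter is part of the batched query."""
    courses = run("SELECT id, key FROM course ORDER BY id")
    ids = [course_id for course_id, _ in courses]
    placeholders = ",".join("?" * len(ids))
    rows = run(
        "SELECT course_id, key FROM course_run "
        f"WHERE course_id IN ({placeholders}) AND status = 'published' "
        "ORDER BY id",
        ids,
    )
    runs_by_course = {course_id: [] for course_id in ids}
    for course_id, run_key in rows:
        runs_by_course[course_id].append(run_key)
    return {key: runs_by_course[course_id] for course_id, key in courses}


executed.clear()
slow = n_plus_one()
slow_queries = len(executed)  # 4: one for courses plus one per course

executed.clear()
fast = prefetched()
fast_queries = len(executed)  # 2, regardless of how many courses exist

print(slow == fast, slow_queries, fast_queries)
```

Both functions return the same data; only the number of round trips differs, which is the difference the query-count assertions in the tests are meant to capture.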

DawoudSheraz

commit type should be perf, not chore
Add a relevant unit test to showcase the decrease in query count

DawoudSheraz · 2024-11-15T06:28:15Z

course_discovery/apps/learner_pathway/api/serializers.py

@@ -87,8 +82,7 @@ def get_card_image_url(self, step):
        return program.card_image_url

    def get_courses(self, obj):
-        excluded_restriction_types = get_excluded_restriction_types(self.context['request'])
-        return obj.get_linked_courses_and_course_runs(excluded_restriction_types=excluded_restriction_types)


why are these removed from serializer? This should be independent from document changes.

DawoudSheraz · 2024-11-15T06:28:25Z

course_discovery/apps/learner_pathway/api/v1/urls.py

+router.register(r'learner-pathway-course', views.LearnerPathwayCourseViewSet, basename='learner-pathway-course')
+router.register(r'learner-pathway-program', views.LearnerPathwayProgramViewSet, basename='learner-pathway-program')


why is this needed?

The basename is automatically generated from the queryset attribute of the viewset if it exists. However, in our case, we removed the queryset attribute and added a get_queryset method instead. Therefore, we need to explicitly set the basename to avoid any errors.
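
The behaviour described in this reply can be modelled roughly as follows. This is a simplified stand-in for a router's default-basename lookup, not Django REST framework's actual code; FakeQuerySet, default_basename, register, and the viewset class names are all hypothetical, and the real router may report the error differently.

```python
class FakeModelMeta:
    object_name = "LearnerPathwayCourse"


class FakeModel:
    _meta = FakeModelMeta


class FakeQuerySet:
    model = FakeModel


def default_basename(viewset):
    """Derive a basename from the viewset's class-level queryset, if any."""
    queryset = getattr(viewset, "queryset", None)
    if queryset is None:
        raise ValueError(
            "basename not specified and the viewset has no queryset attribute"
        )
    return queryset.model._meta.object_name.lower()


def register(viewset, basename=None):
    """Return the basename that would be used for this registration."""
    return basename if basename is not None else default_basename(viewset)


class WithQuerysetAttribute:
    queryset = FakeQuerySet()  # class attribute: a name can be derived


class WithGetQuerysetOnly:
    def get_queryset(self):  # method only: nothing to derive a name from
        return FakeQuerySet()


print(register(WithQuerysetAttribute))
print(register(WithGetQuerysetOnly, basename="learner-pathway-course"))
try:
    register(WithGetQuerysetOnly)
except ValueError as error:
    print("error:", error)
```

Once the class-level queryset attribute is replaced by a get_queryset method, there is nothing left to inspect at registration time, so the name has to be passed explicitly.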

DawoudSheraz · 2024-11-15T06:29:11Z

course_discovery/apps/learner_pathway/api/v1/views.py

+            Prefetch(
+                'course__course_runs', 
+                queryset=CourseRun.objects.filter(
+                        status=CourseRunStatus.Published


why are we using Published status as the only check here?

This is the same filter that was being used in get_linked_courses_and_course_runs and get_course_runs:

https://github.com/openedx/course-discovery/blob/master/course_discovery/apps/learner_pathway/models.py#L313

https://github.com/openedx/course-discovery/blob/master/course_discovery/apps/learner_pathway/api/serializers.py#L25

DawoudSheraz · 2024-11-15T06:29:57Z

course_discovery/apps/learner_pathway/models.py

@@ -299,23 +299,15 @@ def get_skills(self) -> [str]:

        return program_skills

-    def get_linked_courses_and_course_runs(self, excluded_restriction_types=None) -> [dict]:
+    def get_linked_courses_and_course_runs(self):


same question, why is this being removed from here considering this is a model method?

This model method is used exclusively in the LearnerPathwayProgramSerializer.

The LearnerPathwayProgramSerializer is used by the LearnerPathwayProgramViewSet and the LearnerPathwayStepSerializer, which is then used in the LearnerPathwaySearchDocumentSerializer.

In this model method, the use of .filter and .exclude bypasses the prefetch cache, causing additional database queries and leading to an N+1 query problem. To address this, the filtering logic has been moved to the get_queryset method of both LearnerPathwayProgramViewSet and LearnerPathwayDocument.
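
The cache-bypass behaviour described here can be illustrated with a toy model. This is a sketch under stated assumptions, not Django's implementation: RelatedRuns is a hypothetical stand-in for a related manager on a single parent object, and the log list simply records each simulated database query.

```python
class CourseRun:
    def __init__(self, key, status):
        self.key = key
        self.status = status


class RelatedRuns:
    """Toy stand-in for a related manager with an optional prefetch cache."""

    def __init__(self, table, log):
        self._table = table  # rows "in the database"
        self._log = log      # records every simulated query
        self._cache = None   # filled by prefetch()

    def prefetch(self, predicate=lambda run: True):
        """Simulate a prefetch whose queryset already carries the filter."""
        self._log.append("prefetch")
        self._cache = [run for run in self._table if predicate(run)]

    def all(self):
        """Serve from the cache when it exists; otherwise query."""
        if self._cache is not None:
            return list(self._cache)
        self._log.append("all")
        return list(self._table)

    def filter(self, predicate):
        """Always build a fresh query, so the cache is never consulted."""
        self._log.append("filter")
        return [run for run in self._table if predicate(run)]


def is_published(run):
    return run.status == "published"


log = []
runs = RelatedRuns(
    [CourseRun("run-1", "published"), CourseRun("run-2", "unpublished")],
    log,
)

runs.prefetch(is_published)                          # filter applied up front
cached = [run.key for run in runs.all()]             # served from the cache
fresh = [run.key for run in runs.filter(is_published)]  # queries again

print(cached, fresh, log)
```

Both calls return the same rows, but only all() reuses the prefetched results, which is why the filter has to live in the prefetch queryset rather than in the model method.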

zawan-ila · 2024-11-18T19:59:10Z

The changes look good, but are quite dense. I'll have another look at them before approval. Can you please satisfy codecov in the meanwhile? Also, ref

Excludes Learnerpathway course_runs based on excluded_restriction_types in /api/v1/search/all

Does the current implementation not take care of this? I'd expect the filtering at the serializer level (in get_courses and get_course_runs) to have taken care of this. I believe the search/all endpoint uses these serializers too.

hamza-56 · 2024-11-20T00:52:19Z

@zawan-ila You're correct: the current implementation already handles this because we're using these serializers in /search/all. I've updated the PR description.

course_discovery/apps/learner_pathway/api/v1/tests/test_views.py

zawan-ila

Great work on this 🎉

DawoudSheraz · 2024-11-20T13:38:49Z

course_discovery/apps/api/v1/tests/test_views/test_search.py

+        if include_learner_pathways:
+            expected_result_count = pathways.count()
+            expected_query_count = 8
+        else:
+            expected_result_count = 0
+            expected_query_count = 4
+


nit: instead of if-else, you can move the counts to ddt as well. As for pathways.count(), it would be better to have static explicit values instead of comparing against DB count.

Updated ✅

DawoudSheraz · 2024-11-20T13:40:19Z

course_discovery/apps/learner_pathway/api/v1/tests/test_views.py

@@ -245,3 +266,283 @@ def test_learner_pathway_uuids_endpoint(self, query_params, response):
        learner_pathway_uuids_url = f'/api/v1/learner-pathway/uuids/?{urlencode(query_params)}'
        api_response = self.client.get(learner_pathway_uuids_url)
        assert api_response.json() == response
+
+
+@mark.django_db


nit: wondering if this is still needed as the test suite is using Django's TestCase, not unittest TestCase

Updated ✅

DawoudSheraz · 2024-11-20T13:56:17Z

course_discovery/apps/learner_pathway/models.py

-                ).values('key')
-            )
-            courses.append({"key": course.key, "course_runs": course_runs})
+            course_runs = [{'key': course_run.key} for course_run in course.course_runs.all()]


nit: why can't we use .values() here like before?

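.values() builds a new queryset, so it bypasses the prefetch cache and causes additional queries; iterating over course.course_runs.all() reuses the prefetched results.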

hamza-56 self-assigned this Nov 12, 2024

hamza-56 force-pushed the hamza/PROD-3888 branch from 7e0914f to 9922498 Compare November 12, 2024 15:06

hamza-56 marked this pull request as ready for review November 12, 2024 15:06

hamza-56 requested review from AfaqShuaib09, zawan-ila, DawoudSheraz and Ali-D-Akbar and removed request for AfaqShuaib09 November 12, 2024 15:06

Ali-D-Akbar approved these changes Nov 12, 2024

DawoudSheraz reviewed Nov 13, 2024

hamza-56 force-pushed the hamza/PROD-3888 branch 2 times, most recently from 0fb9ea2 to 16f9714 Compare November 15, 2024 00:56

hamza-56 requested review from DawoudSheraz and Ali-D-Akbar November 15, 2024 01:01

hamza-56 force-pushed the hamza/PROD-3888 branch from 16f9714 to 3a0aab8 Compare November 15, 2024 01:04

DawoudSheraz reviewed Nov 15, 2024

hamza-56 force-pushed the hamza/PROD-3888 branch from 3a0aab8 to 12c1ed1 Compare November 15, 2024 08:44

perf: identify and prefetch N+1 queries in search/all for learner pat…

79af586

…hways

hamza-56 force-pushed the hamza/PROD-3888 branch from 12c1ed1 to 79af586 Compare November 15, 2024 09:22

fix: learner pathway restricted runs test case

e7c6ba5

hamza-56 requested a review from DawoudSheraz November 18, 2024 01:17

chore: add filters and fix N+1 issues in LearnerPathwayStepViewSet

e2c4c6e

hamza-56 force-pushed the hamza/PROD-3888 branch 2 times, most recently from 520cdad to a6b9a63 Compare November 20, 2024 01:39

hamza-56 force-pushed the hamza/PROD-3888 branch from a6b9a63 to 2004662 Compare November 20, 2024 01:45

zawan-ila reviewed Nov 20, 2024

zawan-ila approved these changes Nov 20, 2024

hamza-56 force-pushed the hamza/PROD-3888 branch from 2004662 to 7f67316 Compare November 20, 2024 12:12

DawoudSheraz reviewed Nov 20, 2024

hamza-56 requested a review from DawoudSheraz November 20, 2024 20:04

test: add tests for updated viewsets

e2fcaa7

hamza-56 force-pushed the hamza/PROD-3888 branch from 7f67316 to e2fcaa7 Compare November 20, 2024 20:05

DawoudSheraz approved these changes Nov 21, 2024

Merge branch 'master' into hamza/PROD-3888

ec85695

hamza-56 merged commit 0bf2139 into master Nov 21, 2024
14 checks passed

hamza-56 deleted the hamza/PROD-3888 branch November 21, 2024 12:12
