Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: initial implementation [BB-7595] #3

Open
wants to merge 46 commits into
base: main
Choose a base branch
from

Conversation

Agrendalath
Copy link
Member

@Agrendalath Agrendalath commented Nov 14, 2023

Merge checklist:
Check off if complete or not applicable:

  • Version bumped
  • Changelog record added
  • Documentation updated (not only docstrings)
  • Fixup commits are squashed away
  • Unit tests added/updated
  • Manual testing instructions provided
  • Noted any: Concerns, dependencies, migration issues, deadlines, tickets

Testing instructions

  1. Use the master branch of the edx-platform.
  2. Clone and install this package. Run migrations.
  3. Upload the assets attached to the Jira ticket (private).
  4. Create two certificate types:
    1. Completion:
      1. Retrieval func: retrieve_course_completions
      2. Generation func: generate_pdf_certificate
      3. Custom options:
         {
             "name_y": 290,
             "course_name_y": 210,
             "template": "template_completion",
             "template_two_lines": "template_completion_two_line",
             "required_completion": 0.9
         }
    2. Achievement:
      1. Retrieval func: retrieve_subsection_grades
      2. Generation func: generate_pdf_certificate
      3. Custom options:
         {
             "name_y": 290,
             "course_name_y": 220,
             "template": "template_achievement",
             "required_grades": {
                 "Homework": 0.4,
                 "Exam": 0.9,
                 "Total": 0.8
             }
         }
  5. Create two course certificate configurations:
    1. Completion - you can use course-v1:edX+DemoX+Demo_Course as the course ID and override the required_completion to make the testing easier.
    2. Achievement - you can use course-v1:edX+DemoX+Demo_Course as the course ID and override the required_grades to make the testing easier.
  6. Press the "generate certificates" button for each of the created course configurations.
  7. Check that the certificates were generated correctly.

Notes

  1. The docs are deployed to https://openedx-certificates.readthedocs.io.
  2. The pluggability of processors and generators will be included in the next iteration.

Private-ref: BB-7595

@Agrendalath Agrendalath self-assigned this Nov 14, 2023
@Agrendalath Agrendalath force-pushed the agrendalath/bb-7595-initial_implementation branch 3 times, most recently from 420c6a6 to 8fce528 Compare November 14, 2023 13:56
@Agrendalath Agrendalath force-pushed the agrendalath/bb-7595-initial_implementation branch from b832ed7 to 3b8071d Compare November 14, 2023 14:32
@Agrendalath Agrendalath marked this pull request as ready for review November 14, 2023 14:32
@Agrendalath Agrendalath force-pushed the agrendalath/bb-7595-initial_implementation branch from c9fb3e9 to 68631e1 Compare November 14, 2023 15:55
@Agrendalath Agrendalath force-pushed the agrendalath/bb-7595-initial_implementation branch from 50ed981 to 127df64 Compare November 15, 2023 16:34
Copy link
Member

@bradenmacdonald bradenmacdonald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is a lot of feedback but I do think this looks nice! Feel free to push back on anything; a lot of this is trying to understand and push for clear documentation.


|pypi-badge| |ci-badge| |codecov-badge| |doc-badge| |pyversions-badge|
|license-badge| |status-badge|
|license-badge| |status-badge| |visualization-badge|

Purpose
*******

A pluggable service for preparing Open edX certificates.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we expand the README a bit? To explain what this is and why it's useful. You could put a short summary and link to the ADR if you want. Linking to the docs isn't helpful since they're empty.

If this is focused on the HMS use cases for now, we can say something like that - "This is focused on the use cases of HMS but designed to be a flexible foundation for any open edx certificate needs."

Can we also state how this compares to / differs from the credentials service ? Again you can link to the ADR.

What kind of certificates does this produce? PDF, HTML, badge, or "anything you implement as a plugin" ?

docs/decisions/0001-purpose-of-this-repo.rst Show resolved Hide resolved
docs/decisions/0001-purpose-of-this-repo.rst Show resolved Hide resolved
#. :ref:`Lesson <lesson>`.
#. :ref:`Pathway <pathway>`.

#. This Django app uses `celerybeat`_ to periodically retrieve data from the external API. Retrieving this data is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what "external API"? ADR 0001 says this will use public Python APIs, which I would call "internal" (to Open edX). Is this interacting with SalesForce or something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, can you add an explanation of why we're using cron instead of hooks ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on this. The hooks and filters question has been on my mind as well. As the openedx-filters tutorial, titled “How to use Open edX Filters?” is about using a custom pipeline to generate certificates, it will be one of the first questions from the community.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ADR 001 states,

We do not need to pull data in real time. This service can retrieve data periodically, but the frequency should be configurable per course.

While real time is not needed, what would be the impact of doing it in 'near real-time' using a combination of Hooks and Celery tasks (without schedule), that we are avoiding here?

CertificateType [label="ExternalCertificateType"]
CourseConfiguration [label="ExternalCertificateCourseConfiguration"]
Certificate [label="ExternalCertificate"]
Asset [label="ExternalCertificateAsset"]
Copy link
Member

@bradenmacdonald bradenmacdonald Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these things all called "External"? Since the whole point of this repo seems to be that it plugs in to run "within" Open edX as a django app plugin, and it generates the certs within the LMS app, I would personally not think of them as External. And if they can be like "badges" I wouldn't think of them as "Certificates" either... A name like "FlexCredential" or "CertifiedCredential" or something seems more descriptive. Or you can name this repo something less ambiguous than "openedx-certficicates", for example "Aceso" then it could be "AcesoCredential", etc.

What I would consider an "External" certificate would be one issued by a third party like Accredible or Credly, which would be hosted on an entirely different domain outside of our control (it would be good to make sure our architecture supports those as an option, even if we don't need them now).

"""
This module contains processors for certificate criteria.

The functions prefixed with `retrieve_` are automatically detected by the admin page and are used to retrieve the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by the admin page

I would have thought it's a cron job calling these, not manual trigger from the admin page?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The admin page seems to just list the functions in the choice dropdown, which once saved in the model, is then used elsewhere to actually get the list of user IDs.

The doc string here is a little confusing. It does make the actions in 2 places happen in the same place.



def get_course_grade_factory(): # noqa: ANN201
"""Get the course grade factory from Open edX."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you mention what this factory does, and how it interacts with prefetch_...? I thought maybe it was for mocking grades during testing, but it seems that's not the case.

The functions prefixed with `retrieve_` are automatically detected by the admin page and are used to retrieve the
IDs of the users that meet the criteria for the certificate type.

We will move this module to an external repository (a plugin).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest just putting this into a plugins/completion/processor.py file in the same repo, along with the corresponding generators (for "Certificate of completion" & "Badge of achievement/completion")

@@ -4,9 +4,6 @@

-r test.txt # Core and testing dependencies for this package

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Personally I don't think it's better to have so many different requirements files. Just three - base, dev, and constraints - would cover every use case, even if dev will sometimes install more than is needed for a specific use case like quality or ci. But no need to change anything, just feel like mentioning it.

"""
Package metadata for openedx_certificates.
"""
"""Package metadata for openedx_certificates."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need setup.py? Can't we declare all this using pyproject.toml which is the replacement, and which is more efficient since it doesn't have to be run as code before it can be used?

user = get_user_model().objects.get(id=user_id)
# Use the name from the profile if it is not empty. Otherwise, use the first and last name.
# We check if the profile exists because it is absent in unit tests.
user_full_name = getattr(getattr(user, 'profile', None), 'name', f"{user.first_name} {user.last_name}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: Wouldn't it be better to mock user.profile in the tests than to adapt code for the tests?

try:
module_path, func_name = func_path.rsplit('.', 1)
module = import_module(module_path)
getattr(module, func_name) # Will raise AttributeError if the function does not exist.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

"""
This module contains processors for certificate criteria.

The functions prefixed with `retrieve_` are automatically detected by the admin page and are used to retrieve the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The admin page seems to just list the functions in the choice dropdown, which once saved in the model, is then used elsewhere to actually get the list of user IDs.

The doc string here is a little confusing. It does make the actions in 2 places happen in the same place.

}

It means that the user must receive at least 40% in the Homework category and 90% in the Exam category.
The "Total" key is a special value used to specify the minimum required grade for all categories in the course.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the "Total"

specify the minimum required grade for all categories in the course.

or specifies the minimum required grade of the final grade, calculated using the weighted total of all other grades?


users = get_course_enrollments(course_id)
grades = _get_grades_by_format(course_id, users)
log.debug(grades)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Include a description in the log.debug statements like log.debug("Grades: %s", grades) to get a better idea of what we are seeing in the logs?

return font or 'Helvetica'


def _write_text_on_template(template: any, font: str, username: str, course_name: str, options: dict[str, Any]) -> any:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def _write_text_on_template(template: any, font: str, username: str, course_name: str, options: dict[str, Any]) -> any:
def _write_course_and_user_on_template(template: any, font: str, username: str, course_name: str, options: dict[str, Any]) -> any:

Make the function name more specific and leave the text_on_template free for a more generic func(template, text, location,...) function that may be introduced later?

"""
log.info("Starting certificate generation for user %s", user.id)
# Get template from the ExternalCertificateAsset.
template_file = ExternalCertificateAsset.get_asset_by_slug(options['template'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There appears to be a no. of functions with the last argument options proxy-ing kwargs. Not an issue in itself. But makes it difficult to understand the function's requirements in cases like this where options['template'] seems to be a required parameter that should be an independent argument.

There is also the case of me (and similar devs) being lazy. I tend to think the last param with dicts is probably optional (I know options doesn't mean optional, still you know) and the function most likely has default values defined, and it will just work with an empty dict. _write_text_on_template would work under this assumption, but not this function.

I understand, since this is a first iteration and this function will be pushed into a plugin, it is not a big deal now. Just thought I should mention this here, so it helps when defining this function's signature, that the plugins should adhere to.

#. :ref:`Lesson <lesson>`.
#. :ref:`Pathway <pathway>`.

#. This Django app uses `celerybeat`_ to periodically retrieve data from the external API. Retrieving this data is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on this. The hooks and filters question has been on my mind as well. As the openedx-filters tutorial, titled “How to use Open edX Filters?” is about using a custom pipeline to generate certificates, it will be one of the first questions from the community.

#. :ref:`Lesson <lesson>`.
#. :ref:`Pathway <pathway>`.

#. This Django app uses `celerybeat`_ to periodically retrieve data from the external API. Retrieving this data is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ADR 001 states,

We do not need to pull data in real time. This service can retrieve data periodically, but the frequency should be configurable per course.

While real time is not needed, what would be the impact of doing it in 'near real-time' using a combination of Hooks and Celery tasks (without schedule), that we are avoiding here?

docs/decisions/0002-architecture.rst Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants