Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add freecodecamp scraper #815

Merged
merged 5 commits into from
Aug 31, 2023
Merged

Add freecodecamp scraper #815

merged 5 commits into from
Aug 31, 2023

Conversation

benoit74
Copy link
Collaborator

@benoit74 benoit74 commented Aug 29, 2023

Rationale

Changes

  • dispatcher/backend: freecodecamp has been added to the ScheduleCategory and Offliner enums
  • dispatcher/backend: DockerImageName for freecodecamp has been set to openzim/freecodecamp
  • dispatcher/backend: freecodecamp flags schema has been defined
  • dispatcher/backend: freecodecamp output location has been defined
  • dispatcher/frontend-ui: freecodecamp has been added to categories, warehouse_paths and offliners in src/contstants.js
  • openAPI specification has been adapted to reflect the fact that POST /tasks/{task_id} operation needs a worker_name set as query parameter

Tests

  • a recipe for freecodecamp has been manually created successfully locally, flags have been set ; the recipe has not been requested / ran by any worker locally
  • openAPI spec adaptation has been manually tested locally with success
  • an automated test for freecodecamp schedule creation + flags modifications has been created

@benoit74 benoit74 self-assigned this Aug 29, 2023
@benoit74 benoit74 mentioned this pull request Aug 29, 2023
18 tasks
@benoit74 benoit74 force-pushed the add_fcc branch 2 times, most recently from f904ef4 to 74445e3 Compare August 29, 2023 15:18
@benoit74 benoit74 requested a review from rgaudin August 29, 2023 15:20
Copy link
Member

@rgaudin rgaudin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good ; thanks for the test, it's nice to have.
I am requesting changes mostly because of the category/warehouse_path that @kelson42 must answer first.
If you can make a minor release about output param, that would be nice as this is a new scraper.

dispatcher/backend/src/common/enum.py Show resolved Hide resolved
dispatcher/backend/src/utils/offliners.py Outdated Show resolved Hide resolved
dispatcher/frontend-ui/src/constants.js Show resolved Hide resolved
@kelson42
Copy link
Contributor

kelson42 commented Aug 31, 2023

I am requesting changes mostly because of the category/warehouse_path that @kelson42 must answer first.

I guess I missed the question, sorry. But if the question is: do we need a dedicated (zimfarm) category, IMHO yes.

@benoit74
Copy link
Collaborator Author

@rgaudin Good catch on output folder and languages list!

I'm a bit puzzled about output flag, I see no reason to keep it in the list of scraper flags and force it to a '/output' value since it is then setup properly at

flags[
offliner_def.std_output
if isinstance(offliner_def.std_output, str)
else "output"
] = str(mount_point)
but other scrapers have this flag in their schema. Let discuss it live maybe, it will be easier.

I confirmed via slack with Kelson that we have to create both a new category and a new folder, because he expects us to quickly create many FCC ZIMs.

@benoit74 benoit74 requested a review from rgaudin August 31, 2023 07:39
Copy link
Member

@rgaudin rgaudin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK then ; I'll walk you through creating those folders

@benoit74
Copy link
Collaborator Author

There is a reason why the output flag is kept in each scraper flags ; it is "mandatory" to avoid the UI / API having mixed feeling about this weird flag. I reverted the change.

@benoit74 benoit74 merged commit c4dfd2c into main Aug 31, 2023
5 checks passed
@benoit74 benoit74 deleted the add_fcc branch August 31, 2023 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add freecodecamp support
3 participants