WebScraping_Assignment_2021

Web scraping using Python and Selenium

Moodle_Login

This folder contains a Python script, moodleLogin.py, that takes the username and password as command-line arguments:
python .\moodleLogin.py username password
Running the script logs you into Moodle automatically, skipping those irritating captchas!
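A minimal sketch of the idea, assuming the standard Moodle login form field IDs (username, password, loginbtn) and a placeholder URL; the actual selectors and URL used in moodleLogin.py may differ:

import sys
from selenium import webdriver
from selenium.webdriver.common.by import By

MOODLE_URL = "https://moodle.example.edu/login/index.php"  # placeholder URL, not from the repo

username, password = sys.argv[1], sys.argv[2]

driver = webdriver.Chrome()
driver.get(MOODLE_URL)

# Standard Moodle form field IDs; the real site may use different selectors.
driver.find_element(By.ID, "username").send_keys(username)
driver.find_element(By.ID, "password").send_keys(password)
driver.find_element(By.ID, "loginbtn").click()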

Codeforces

This folder contains 3 files: one for fetching the problems of a particular contest and the other two for the bonus tasks. Each of the Python scripts runs in headless mode.
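A typical headless setup looks roughly like this (Chrome is assumed here; the scripts may use a different browser or options):

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless")               # run the browser without opening a window
options.add_argument("--window-size=1920,1080")  # full-size viewport so screenshots render properly
driver = webdriver.Chrome(options=options)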

fetch_round.py:

This Python script takes the contest number as a command-line argument:
python .\fetch_round.py contest_number
The problems will be downloaded in a hierarchical folder structure, with the contents of each problem kept in the directory ./<contest_number>/<problem_label>/ inside the same directory as the Python file. The contents of a problem include a screenshot of the problem statement along with its sample inputs and outputs (as shown on Codeforces) saved as text files.
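Roughly, the per-problem logic could look like the sketch below. Scraping a single hard-coded problem "A" is only for illustration, and the CSS selectors (.problem-statement, .sample-test .input pre, .sample-test .output pre) follow Codeforces' public markup but are an assumption about what fetch_round.py actually uses:

import os
import sys
from selenium import webdriver
from selenium.webdriver.common.by import By

contest = sys.argv[1]
driver = webdriver.Chrome()  # headless options omitted for brevity
driver.get(f"https://codeforces.com/contest/{contest}/problem/A")

problem_dir = os.path.join(contest, "A")
os.makedirs(problem_dir, exist_ok=True)

# Screenshot of the problem statement block.
statement = driver.find_element(By.CLASS_NAME, "problem-statement")
statement.screenshot(os.path.join(problem_dir, "statement.png"))

# Sample inputs and outputs saved as text files.
for i, block in enumerate(driver.find_elements(By.CSS_SELECTOR, ".sample-test .input pre")):
    with open(os.path.join(problem_dir, f"input{i + 1}.txt"), "w") as f:
        f.write(block.text)
for i, block in enumerate(driver.find_elements(By.CSS_SELECTOR, ".sample-test .output pre")):
    with open(os.path.join(problem_dir, f"output{i + 1}.txt"), "w") as f:
        f.write(block.text)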

pastX.py:

This Python script takes the number of past contests to be scraped as a command-line argument:
python .\pastX.py no_of_past_contests
pastX.py finds the past x contests given as the argument and launches fetch_round.py from the terminal for each contest obtained. The problems are downloaded in a hierarchical folder structure just like fetch_round.py, the difference being that there are now multiple directories, one per contest. If the number of past contests given as the argument exceeds the contests listed on a single page, the script navigates to the next page, making extensive use of page navigation too!
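The core idea of invoking fetch_round.py from the terminal for each contest can be sketched as below; the contest IDs here are placeholders standing in for values scraped from the Codeforces contests page:

import subprocess
import sys

x = int(sys.argv[1])

# In the real script these IDs would be scraped from https://codeforces.com/contests,
# moving to the next page when x exceeds the contests listed on a single page.
contest_ids = ["1601", "1600", "1599"]  # placeholder values for illustration

for contest_id in contest_ids[:x]:
    # Launch fetch_round.py as a separate process for each contest.
    subprocess.run(["python", "fetch_round.py", contest_id], check=True)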

difficulty_range.py

This Python script takes the difficulty range (start and end) and the number of problems to scrape as command-line arguments:
python .\difficulty_range.py starting_difficulty_lvl ending_difficulty_lvl no_of_problems_to_be_scraped
The problems within the specified difficulty range will be downloaded in a hierarchical folder structure, with the contents of each problem kept in the directory ./<Difficulty_Range_start-end>/<contest_name+problem_label>/. The number of problems scraped depends on the argument passed in the terminal. There are generally 100 problems on a page after filtering; the script takes care of this and uses page navigation to collect the required number of problems.
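A sketch of the filtering and page-navigation loop, assuming the difficulty filter is applied via the problemset URL's tags parameter and that results sit in a table with class "problems"; both are assumptions about how difficulty_range.py actually does it:

import sys
from selenium import webdriver
from selenium.webdriver.common.by import By

start, end, count = sys.argv[1], sys.argv[2], int(sys.argv[3])

driver = webdriver.Chrome()  # headless options omitted for brevity
collected, page = 0, 1

while collected < count:
    # Difficulty range passed as a URL filter; the script may instead fill the filter form.
    driver.get(f"https://codeforces.com/problemset/page/{page}?tags={start}-{end}")
    rows = driver.find_elements(By.CSS_SELECTOR, "table.problems tr")[1:]  # skip the header row
    for row in rows:
        if collected >= count:
            break
        # ... scrape the statement screenshot and sample tests as in fetch_round.py ...
        collected += 1
    page += 1  # roughly 100 problems per page, so navigate onward when more are needed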
