Hosted on github.com/Blink29
Why parse a repository?
- Collecting all the files of a repository, flattened and grouped by extension, is too tedious to do manually.
- The grouped files can then be used for further analysis or testing, as required.
How does the parser work?
- The parser is built on the GitHub API.
- The repository is traversed recursively, level by level, to obtain the absolute path of each file.
- At the same time, files of each type are grouped together, and a dictionary of the form { extension: array_of_files } is returned.
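The traversal and grouping described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the function name `group_files_by_extension`, the injectable `list_dir` callable, the `no_extension` key, and the example.com URLs are all assumptions made for the example. It uses a queue for the level-by-level walk instead of recursion. The entries mimic the `type`, `name`, `path`, and `html_url` fields that the GitHub contents API returns for a directory listing, so a real `list_dir` could wrap a call to `GET /repos/{owner}/{repo}/contents/{path}`.

```python
from collections import defaultdict, deque
from typing import Callable, Dict, List

# Hypothetical type: lists one directory and returns entries shaped like
# the GitHub contents API response (type, name, path, html_url).
ListDir = Callable[[str], List[dict]]


def group_files_by_extension(list_dir: ListDir, root: str = "") -> Dict[str, List[str]]:
    """Walk a repository level by level and group file URLs by extension."""
    grouped = defaultdict(list)
    queue = deque([root])  # directories still to be visited
    while queue:
        current = queue.popleft()
        for entry in list_dir(current):
            if entry["type"] == "dir":
                grouped["directories"].append(entry["html_url"])
                queue.append(entry["path"])  # visit on the next level
            else:
                name = entry["name"]
                # ".gitignore" -> "gitignore", "setup.py" -> "py"
                ext = name.rsplit(".", 1)[-1] if "." in name else "no_extension"
                grouped[ext].append(entry["html_url"])
    return dict(grouped)


# Made-up in-memory repository, so the sketch runs without network access.
FAKE_REPO = {
    "": [
        {"type": "file", "name": "README.md", "path": "README.md",
         "html_url": "https://example.com/blob/main/README.md"},
        {"type": "file", "name": ".gitignore", "path": ".gitignore",
         "html_url": "https://example.com/blob/main/.gitignore"},
        {"type": "dir", "name": "pkg", "path": "pkg",
         "html_url": "https://example.com/tree/main/pkg"},
    ],
    "pkg": [
        {"type": "file", "name": "__init__.py", "path": "pkg/__init__.py",
         "html_url": "https://example.com/blob/main/pkg/__init__.py"},
        {"type": "file", "name": "core.py", "path": "pkg/core.py",
         "html_url": "https://example.com/blob/main/pkg/core.py"},
    ],
}

result = group_files_by_extension(lambda path: FAKE_REPO[path])
```

Passing the directory lister in as a parameter keeps the grouping logic independent of how the listing is fetched, which also makes it easy to test offline.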
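Installation
pip install github-repo-files-parser
Usage
from github_repo_files_parser import GitHubRepoFilesParser

# Create a parser and point it at the repository to scan
parser = GitHubRepoFilesParser()
repo_url = "https://github.com/Blink29/github_repo_files_parser"

# Returns a dictionary mapping each extension to a list of file links
parser.get_raw_repo_links(repo_url)
Example output
{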
"py": [
"https://github.com/Blink29/github_repo_files_parser/blob/main/index.py",
"https://github.com/Blink29/github_repo_files_parser/blob/main/setup.py",
"https://github.com/Blink29/github_repo_files_parser/blob/main/github_repo_files_parser/__init__.py",
"https://github.com/Blink29/github_repo_files_parser/blob/main/github_repo_files_parser/github_repo_files_parser.py"
],
"md": [
"https://github.com/Blink29/github_repo_files_parser/blob/main/README.md"
],
"gitignore": [
"https://github.com/Blink29/github_repo_files_parser/blob/main/.gitignore"
],
"directories": [
"https://github.com/Blink29/github_repo_files_parser/tree/main/github_repo_files_parser"
],
"cfg": [
"https://github.com/Blink29/github_repo_files_parser/blob/main/github_repo_files_parser/setup.cfg"
]
}
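As one illustration of further use, the returned dictionary can be summarised or post-processed with plain Python. The sketch below is an assumption-laden example rather than part of the package: `blob_to_raw` is a hypothetical helper that rewrites a GitHub `blob` page link into the corresponding `raw.githubusercontent.com` link, and `grouped` is a shortened copy of the output shown above.

```python
from urllib.parse import urlsplit


def blob_to_raw(blob_url: str) -> str:
    """Convert a github.com/<owner>/<repo>/blob/<ref>/<path> link to a raw link."""
    parts = urlsplit(blob_url).path.strip("/").split("/")
    # Expect at least: owner, repo, "blob", ref, file name
    if len(parts) < 5 or parts[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {blob_url}")
    owner, repo, _, *rest = parts
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])


# Shortened copy of the example output above
grouped = {
    "py": [
        "https://github.com/Blink29/github_repo_files_parser/blob/main/index.py",
        "https://github.com/Blink29/github_repo_files_parser/blob/main/setup.py",
    ],
    "md": [
        "https://github.com/Blink29/github_repo_files_parser/blob/main/README.md",
    ],
    "directories": [
        "https://github.com/Blink29/github_repo_files_parser/tree/main/github_repo_files_parser",
    ],
}

# Number of files per extension, ignoring the directory listing
counts = {ext: len(urls) for ext, urls in grouped.items() if ext != "directories"}

# Raw-content links for every Python file
raw_py = [blob_to_raw(url) for url in grouped["py"]]
```

Directory links use `tree` rather than `blob`, so `blob_to_raw` rejects them with a `ValueError` instead of producing a broken link.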