Archival Status

This repository is archived and will not receive future updates or support.

As of February, 2022 this software is still working assuming you install the correct dependencies.

All code in this repository is made available under the MIT license, so please feel free to fork / copy / modify it for your own use according to the license terms. Thank you to the contributors who have helped improve glossika-to-anki over the years.

The original README follows below.

glossika-to-anki

Generate Anki decks from Glossika PDFs and audio files

glossika-to-anki is a set of Python 3 scripts to generate Anki flashcards using the PDFs and audio from the Glossika language program.

Features

glossika-to-anki provides three main utilities:

glossika_extract_pdf.py - Generate a TSV file of English and target language phrases from Glossika PDFs
glossika_split_audio.py - Split GMS-C audio files into individual mp3s for each phrase
generate_anki.py - Create an Anki deck by combining each phase and its corresponding audio into a separate Anki note / card

Requirements & Setup

Python 3
- On Windows you can run the installer from the Python website.
- MacOS or Linux you can install python with brew install python or apt install python.
pdftotext - Converts the Glossika PDFs to text so that the phrases can be extracted with regex.
- On Windows download Xpdf tools and copy pdftotext.exe to a folder on the path (i.e. the Python folder). If you installed python with the Windows installer, the default path should be C:\Program Files or C:\Users\your_name\AppData\Local. Alternatively, you might also be able to run which python or where python from cmd prompt to figure out where the python executable is located.
- On MacOS brew cask install pdftotext; on Linux apt install poppler-utils
mp3splt - Splits the GMS-C files into individual files on the silence between sentences.
- On Windows download mp3splt from the project homepage and add it to the path
- On MacOS brew install mp3splt; on Linux apt install mp3splt
genanki to generate Anki decks
- pip install genanki or pip3 install genanki
v2 Glossika PDFs and GMS-C mp3 files
- The PDF script only works with the v2 Glossika PDFs. These files are searchable, and have a blue box around each phrase. If your PDFs are more than 20 mb you have the older version that is not supported.
- Here's an example of what the audio file names should look like: ENZS-F1-GMS-C-0001.mp3. Note that the ENZS prefix is for Mandarin and varies by language.
- Note: The Glossika PDFs and audio files are difficult to find since Glossika recently discontinued them and launched a subscription service. I heard some people have had luck contacting Glossika's support team to purchase the older PDF courses, but your milage may vary.

Running

Clone or download the repository.

git clone [email protected]:emesterhazy/glossika-to-anki.git
cd glossika-to-anki/glossika-to-anki

Run each script and follow the prompts to copy the Glossika files into the source folder that is created.

python glossika_extract_pdf.py
python glossika_split_audio.py
python generate_anki.py

Import your new Anki deck!

Limitations

Only supports the v2 Glossika PDFs, not the older non-searchable PDFs. PDFs with copy protection must have it removed before sentences can be extracted.
No support for extracting IPA

Contributing

Pull requests are welcome. If you would like to add support for the v1 Glossika PDFs or make changes that require new dependencies please open an issue first to discuss.

Name		Name	Last commit message	Last commit date
Latest commit History 28 Commits
glossika-to-anki		glossika-to-anki
.gitignore		.gitignore
LICENSE.md		LICENSE.md
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Archival Status

glossika-to-anki

Features

Requirements & Setup

Running

Limitations

Contributing

About

Releases

Packages

Contributors 5

Languages

License

emesterhazy/glossika-to-anki

Folders and files

Latest commit

History

Repository files navigation

Archival Status

glossika-to-anki

Features

Requirements & Setup

Running

Limitations

Contributing

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 5

Languages

Packages