radioship_transcriber

This is a Python package that provides the command-line interface for the radioship_transcriber.

The transcriber creates .txt transcripts of .mp3 files using neural networks trained for this purpose. It takes an input folder path, an output folder path, and an optional model path, which can be a local path or a URL on huggingface.co. If no model path is given, it uses our Hungarian model by default.

Install:

The transcriber depends on the huggingsound package, which requires a specific range of torch versions. Because older versions of torch are not compatible with the latest versions of Python, it is important to use Python 3.10.10. We recommend using pyenv to install it. Here is a great tutorial for that.

You will also need ffmpeg to process .mp3 files. If it is not already installed, on Debian or Ubuntu you can get it with:
sudo apt-get install ffmpeg

Then create a new virtual environment with pipenv:
pipenv install --python 3.10.10

Activate the new env with:
pipenv shell

Make sure the environment really uses Python 3.10.10 (for example, by running python --version). Then install the radioship_transcriber from this repo:
pipenv install git+https://github.com/Koffair/radioship_transcriber.git#egg=radioship_transcriber
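
The two prerequisites above (the Python version and ffmpeg) can also be checked from inside the activated environment. The following is a minimal sketch and not part of the package: the function name check_environment and the exact-match policy on 3.10.10 are assumptions made for illustration.

```python
import shutil
import sys

# Version recommended in the install notes above.
REQUIRED_PYTHON = (3, 10, 10)


def check_environment(version=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK.

    `version` and `which` are injectable so the check can be tested
    without changing the real interpreter or PATH.
    """
    if version is None:
        version = sys.version_info[:3]
    version = tuple(version)

    problems = []
    if version != REQUIRED_PYTHON:
        found = ".".join(str(part) for part in version)
        wanted = ".".join(str(part) for part in REQUIRED_PYTHON)
        problems.append(f"Python {found} is active, expected {wanted}")
    if which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    if issues:
        for issue in issues:
            print("problem:", issue)
    else:
        print("environment looks fine")
```

Running the script inside pipenv shell prints either the detected problems or a short confirmation.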

Usage:

Once your virtual environment is active, you can call the transcriber in the following ways (no need to type python):

To use the default model:
radioship_transcriber -i path/to/input/ -o path/to/output/
To use your own or any other model:
radioship_transcriber -i path/to/input/ -o path/to/output/ -m path/to/other/model/

The input folder should contain .mp3 files. The output folder will contain the .txt files, with the logs in a separate folder. If the specified model is not present locally, the transcriber will download and cache it for later use. If you want to remove a cached model, use:
huggingface-cli delete-cache
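
If you want to start the transcriber from another Python script, the calls shown above can be assembled with the standard subprocess module. This is a minimal sketch under stated assumptions: the helper names build_command and run_transcriber are hypothetical, and the only things taken from this README are the command name and the -i, -o and -m flags.

```python
import shlex
import subprocess


def build_command(in_path, out_path, model_path=None):
    """Build the argument list for the radioship_transcriber CLI."""
    cmd = ["radioship_transcriber", "-i", str(in_path), "-o", str(out_path)]
    if model_path is not None:
        # Optional: local path or huggingface.co URL of another model.
        cmd += ["-m", str(model_path)]
    return cmd


def run_transcriber(in_path, out_path, model_path=None):
    """Run the CLI and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_command(in_path, out_path, model_path), check=True)


if __name__ == "__main__":
    # Only print the command here; call run_transcriber(...) to execute it.
    print(shlex.join(build_command("path/to/input/", "path/to/output/")))
```

Passing the arguments as a list avoids shell quoting problems with folder names that contain spaces.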

man:

Create transcript for mp3 files. [-h] -i -o [-m]

options:
-h, --help show this help message and exit
-i , --in_path Path to input file or directory
-o , --out_path Path to output directory
-m , --model_path Address of the transcriber model
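
For readers who want to see how an interface like the one above can be declared, here is a minimal argparse sketch. It is an illustration only and not the package's actual source: the function name build_parser is hypothetical, and the default of None for the model path is an assumption (the real tool falls back to the Hungarian model).

```python
import argparse


def build_parser():
    """Declare -i, -o and -m the way the help text above describes them."""
    parser = argparse.ArgumentParser(
        prog="radioship_transcriber",
        description="Create transcripts for mp3 files.",
    )
    parser.add_argument(
        "-i", "--in_path", required=True,
        help="Path to input file or directory",
    )
    parser.add_argument(
        "-o", "--out_path", required=True,
        help="Path to output directory",
    )
    parser.add_argument(
        "-m", "--model_path", default=None,
        help="Address of the transcriber model (assumed default: None)",
    )
    return parser


if __name__ == "__main__":
    # Parse a fixed example instead of sys.argv to keep the sketch self-contained.
    example = build_parser().parse_args(["-i", "path/to/input/", "-o", "path/to/output/"])
    print(example.in_path, example.out_path, example.model_path)
```

Because -i and -o are marked as required, argparse exits with a usage error when either one is missing, which matches the [-h] -i -o [-m] synopsis.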
