A speech service built on current AI models such as Whisper (speech recognition) and NLLB (machine translation).
The setup has been tested in a Docker container, which also works under the Windows Subsystem for Linux (WSL). An NVIDIA graphics card with at least 4 GB of VRAM is recommended, depending on the models used. CUDA is part of the Docker image; only the NVIDIA graphics driver needs to be installed on the host.
Docker must have CUDA enabled (e.g. for WSL, see https://docs.nvidia.com/cuda/wsl-user-guide/index.html).
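To verify that Docker can actually see the GPU, a standard check is to run nvidia-smi in a CUDA container (the image tag below is only an example, any available CUDA base image will do):

$ docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi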
- Clone https://github.com/andrePankraz/speech_service
$ export DOCKER_BUILDKIT=1
$ docker compose up
- Will take some time on the first start (images & packages are downloaded, >10 GB)
- Wait and check whether the service is up and running, e.g. with the commands below
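Whether the stack is up can be checked with the usual Docker Compose commands (nothing project-specific assumed here):

$ docker compose ps
$ docker compose logs -f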
- Go to URL: http://localhost:8200/
- Will take some time on the first start (models are downloaded, several GB)
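Instead of a browser, a quick curl against the same URL works as a smoke test:

$ curl -sSf http://localhost:8200/ > /dev/null && echo "speech service is up"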
- For development: clone https://github.com/andrePankraz/speech_service
$ export DOCKER_BUILDKIT=1
$ docker compose --env-file docker/.envs/dev.env up
- Will take some time on the first start (images & packages are downloaded, >10 GB)
- Wait and check whether the dev stack is up and running, e.g. with the commands below
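As above, the dev stack can be checked with Docker Compose; the GPU check inside the container assumes the container name speech_service-python-1 from the attach step below and that GPU passthrough is configured:

$ docker compose --env-file docker/.envs/dev.env ps
$ docker exec speech_service-python-1 nvidia-smi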
- Install VS Code
- Install the following extensions (or use the commands below):
  - Dev Containers
  - Docker
  - Markdown All in One
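The same extensions can also be installed from the command line; the marketplace IDs below are the usual ones for these extensions, double-check them if the install fails:

$ code --install-extension ms-vscode-remote.remote-containers
$ code --install-extension ms-azuretools.vscode-docker
$ code --install-extension yzhang.markdown-all-in-one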
- Attach VS Code to the Docker container
- Attach to Running Container... (lower left corner in VS Code)
  - Select speech_service-python-1
- Explorer: Open Folder -> /opt/speech_service
- Run / Start Debugging
  - The VS Code Python extension is installed automatically on the first run (wait, then start debugging again)
  - Select the Python interpreter
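Once attached, a quick sanity check in the integrated terminal confirms that PyTorch sees the GPU (this assumes the image ships PyTorch, which Whisper and NLLB require):

$ python -c "import torch; print(torch.cuda.is_available())"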
- Go to URL: http://localhost:8200/
- Will take some time on the first start (models are downloaded, several GB); progress can be watched as shown below
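Model downloads happen on first use inside the Python container; progress can be watched in its logs (container name taken from the attach step above):

$ docker logs -f speech_service-python-1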