Server side

Load the model onto the GPU.
Listen on port 8001 for incoming audio file requests.
- endpoint = "http://127.0.0.1:8001/transcribe"
Transcribe and translate the audio into English text.

Server side installation

# install PyTorch with CUDA 11.8
pip3 install 

# install requirements
pip3 install scipy "numpy<2" "fastapi[standard]" transformers

# other requirements are installed through sudo apt-get install
# TBD; install whatever is reported as missing

Then download the whisper-medium model from the Hugging Face model hub and save it in this BBAudioTranscription folder.

Server side running

The server can be deployed on a local machine or on WS.

With a local PC

# just run the python script audio_api.py
python3 audio_api.py

With WS

# On WS: FastAPI listens on port 8001 for whisper-medium; start a worker
rlaunch -L 8001 --cpu=1 --gpu=1 --memory=8192 -- python3 audio_api.py

# On the local PC, start SSH port forwarding
ssh -NL 8001:localhost:8001 <WS-username>@<WS-IP>
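
Once the tunnel is up, one way to confirm that port 8001 is reachable from the local PC is to try opening a TCP connection to it. The sketch below uses only the standard library; the helper name `port_is_open` and the 1 second timeout are illustrative assumptions and are not part of this repository.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8001 is the local end of the SSH forwarding set up above.
    status = "reachable" if port_is_open("127.0.0.1", 8001) else "not reachable"
    print(f"127.0.0.1:8001 is {status}")
```

A successful connection only shows that something is listening on the port; it does not prove that the model has finished loading.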

Client side

Start recording with "v".
Stop recording by pressing "v" again.
Send the audio file to the server.
Wait for the response, which is the transcribed text as a string.
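
The upload step can be sketched with the standard library alone. This is not the code in audio_client.py; it assumes the server accepts a multipart/form-data upload, and the form field name `file`, the helper names and the 120 second timeout are hypothetical. The real field name is whatever the endpoint in audio_api.py declares.

```python
import urllib.request
import uuid

ENDPOINT = "http://127.0.0.1:8001/transcribe"  # endpoint from the server section


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def send_audio(path: str, field: str = "file") -> str:
    """POST the WAV file at `path` and return the response body as text.

    `field` is an assumed name; it must match the server's parameter.
    """
    with open(path, "rb") as fh:
        body, content_type = build_multipart(field, "recording.wav", fh.read())
    request = urllib.request.Request(
        ENDPOINT, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return response.read().decode("utf-8")
```

Calling `send_audio("recording.wav")` would then return the server's reply as a string. If the server answers with HTTP 422, a likely cause is that `field` does not match the parameter name the endpoint expects.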

Client side installation

pip3 install sounddevice pynput pydub scipy "numpy<2"

Client side running

python3 audio_client.py

BB

HTTP 422 UNPROCESSABLE CONTENT
- Key error: the key sent in the request does not match the argument of the FastAPI app.
Noise in the recording
- Apply a high-pass filter with a cut-off frequency of 40 Hz.
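
The notes above do not say which kind of high-pass filter is used. As a minimal sketch, assuming a first-order RC filter is acceptable, the 40 Hz cut-off could be applied to a list of samples as follows. The function name `high_pass` and the list-based interface are illustrative; a steeper filter, for example a Butterworth design from `scipy.signal`, could be swapped in.

```python
import math


def high_pass(samples: list[float], sample_rate: int, cutoff_hz: float = 40.0) -> list[float]:
    """First-order RC high-pass filter (an assumed filter type)."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)  # RC time constant for the cut-off
    dt = 1.0 / sample_rate                  # time between two samples
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        # Changes between samples pass through; constant offsets decay.
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out
```

With this recurrence a constant offset decays towards zero, while content well above 40 Hz, such as speech, passes almost unchanged.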

whisper

The sampling rate needs to be 16000 Hz.
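
If the recording device captures at a different rate, the audio has to be resampled before it reaches the model. The sketch below does this by plain linear interpolation to keep the example dependency-free; it is an assumption, not the method used in this repository, and it applies no anti-aliasing filter. For real recordings a polyphase resampler such as `scipy.signal.resample_poly` is likely the better choice. The name `resample_linear` is hypothetical.

```python
def resample_linear(samples: list[float], src_rate: int, dst_rate: int = 16000) -> list[float]:
    """Resample by linear interpolation (no anti-aliasing filter)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    step = src_rate / dst_rate  # source samples advanced per output sample
    out = []
    for i in range(n_out):
        pos = i * step
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the last sample
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out
```

For example, one second of audio recorded at 48000 Hz (48000 samples) comes out as 16000 samples.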
