Real-Time Speech Recognition

Proofs of concept (PoCs) for real-time speech recognition and speaker diarization.

Working PoCs

rtsr_en.py: PoC using the AssemblyAI WebSocket API (English only)
rtsr_de.py: PoC using OpenAI Whisper (German; probably multilingual)
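
A streaming recognizer of the kind used in rtsr_en.py typically receives audio in small, fixed-duration chunks rather than as one file. The following is a minimal sketch of that chunking step only, not the code from either script. The sample rate, chunk length, helper names, and the JSON message shape (`audio_data` carrying base64 audio) are all assumptions made for illustration; check the service's own documentation for the real message format.

```python
import base64
import json

SAMPLE_RATE = 16_000      # Hz; assumed, not taken from the scripts
BYTES_PER_SAMPLE = 2      # 16-bit mono PCM; assumed
CHUNK_MS = 100            # assumed chunk duration in milliseconds


def pcm_chunks(pcm: bytes, chunk_ms: int = CHUNK_MS):
    """Yield consecutive fixed-duration slices of raw PCM audio."""
    step = SAMPLE_RATE * BYTES_PER_SAMPLE * chunk_ms // 1000
    for start in range(0, len(pcm), step):
        yield pcm[start:start + step]


def to_message(chunk: bytes) -> str:
    """Wrap one chunk in a JSON text frame (hypothetical message shape)."""
    return json.dumps({"audio_data": base64.b64encode(chunk).decode("ascii")})


# One second of silence stands in for microphone input.
silence = bytes(SAMPLE_RATE * BYTES_PER_SAMPLE)
messages = [to_message(chunk) for chunk in pcm_chunks(silence)]
```

In a real client, each message would be sent over an open WebSocket connection as soon as the microphone delivers the chunk, and transcripts would be read back on the same connection.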

Prototypes

Additionally, a handful of prototypes were created using various technologies:

librosa
NVIDIA NeMo
TensorFlow + Keras model
Mel Spectrogram CNN
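
For context on the mel spectrogram prototype: a mel spectrogram spaces its frequency bands evenly on the mel scale rather than in hertz. The sketch below shows only that frequency mapping, using the common HTK-style formula; the prototypes may use a different variant, and the function names and parameter values here are illustrative assumptions.

```python
import math


def hz_to_mel(hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels: int, f_min: float, f_max: float) -> list[float]:
    """Return n_mels band centers in Hz, evenly spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + step * i) for i in range(1, n_mels + 1)]


# Assumed values: 8 bands between 0 Hz and 8 kHz (Nyquist for 16 kHz audio).
centers = mel_center_frequencies(n_mels=8, f_min=0.0, f_max=8000.0)
```

The resulting centers are closely spaced at low frequencies and spread out at high frequencies, which is the property that makes mel features a common input for speech models such as a CNN.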

Credits

davabase