speaker-diarization

Speaker diarization is the process of partitioning an audio stream containing human speech into homogeneous segments according to the identity of each speaker.

This project includes an API, a web UI, and a Telegram bot. The API module provides three endpoints with different response types. All three accept an audio file and process it with a speaker diarization model; the results are returned as RTTM, as a TF plot, or combined with ASR results. The web UI provides an interface for uploading an audio file or recording your voice for speaker diarization. The Telegram bot is a simple way to use speaker diarization: record a voice message, or forward voice messages from your chats, to convert them to text alongside their speaker tags.
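As a rough illustration of how the RTTM output and the ASR-combined output relate, here is a minimal sketch. It assumes the standard RTTM layout (space-separated `SPEAKER` records with onset in field 4, duration in field 5, and speaker name in field 8). The `Segment` class, the helper names, the sample data, and the `(text, start, end)` word format are hypothetical and are not taken from this project's code.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str
    start: float  # seconds
    end: float    # seconds


def parse_rttm(text):
    """Parse SPEAKER records from RTTM text into a list of Segment objects."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and any record that is not a SPEAKER record.
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        start = float(fields[3])
        duration = float(fields[4])
        segments.append(Segment(fields[7], start, start + duration))
    return segments


def speaker_at(segments, start, end):
    """Return the speaker with the largest total overlap with [start, end], or None."""
    totals = {}
    for seg in segments:
        overlap = min(end, seg.end) - max(start, seg.start)
        if overlap > 0:
            totals[seg.speaker] = totals.get(seg.speaker, 0.0) + overlap
    return max(totals, key=totals.get) if totals else None


# Made-up RTTM content for illustration only.
sample = """\
SPEAKER call1 1 0.00 2.50 <NA> <NA> spk0 <NA> <NA>
SPEAKER call1 1 2.50 3.00 <NA> <NA> spk1 <NA> <NA>
"""
segments = parse_rttm(sample)

# Hypothetical ASR output: (text, start, end) in seconds.
words = [("hello", 0.4, 0.9), ("hi", 2.7, 3.0)]
tagged = [(speaker_at(segments, s, e), w) for w, s, e in words]
print(tagged)
```

Assigning each word to the speaker with the largest time overlap is only one simple way to merge the two outputs; the project's own endpoint may combine them differently.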

Installation

To install dependencies:

make install

Use

To run all services:

make run

To run a specific service:

# server
make runserver

# celery
make runcelery

# telegram bot
make runtelbot

# gradio
make rungradio

swagger

To see the available APIs, open the Swagger page.

Sharif-SLPL/speaker-diarization-2024
