A Deepgram client for Dart and Flutter, supporting all Speech-to-Text and Text-to-Speech features on every platform.
Need something else? Feel free to open an issue, contribute to this project, or ask for new features on GitHub!
| Speech-to-Text | Status | Methods |
|---|---|---|
| From File | ✅ | listen.file(), listen.path() |
| From URL | ✅ | listen.url() |
| From Bytes | ✅ | listen.bytes() |
| From Audio Stream | ✅ | listen.live(), listen.liveListener() |

| Text-to-Speech | Status | Methods |
|---|---|---|
| From Text | ✅ | speak.text() |
| From Text Stream | ✅ | speak.live(), speak.liveSpeaker() |

| Agent Interaction | Status | Methods |
|---|---|---|
| Agent Interaction | 🚧 | agent.live() |
PRs are welcome for all work-in-progress 🚧 features.
All you need is a Deepgram API key. You can get a free one by signing up on Deepgram.
First, create the client with optional parameters:
String apiKey = 'your_api_key';
// you can pass params in client's baseQueryParams or in every method's queryParams
Deepgram deepgram = Deepgram(apiKey, baseQueryParams: {
'model': 'nova-2-general',
'detect_language': true,
'filler_words': false,
'punctuate': true,
// more options here : https://developers.deepgram.com/reference/listen-file
});
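To see why booleans and strings can sit side by side in these parameter maps, here is a minimal sketch of how such a map can end up as a URL query string. The `buildListenUri` helper is hypothetical and written for illustration only. It is not the package's internal code, and the endpoint shown is Deepgram's public pre-recorded transcription endpoint.

```dart
// Illustration only: a hypothetical helper, not part of this package.
// It shows how a params map can be turned into a request URL.
Uri buildListenUri(Map<String, dynamic> params) {
  // Uri.https expects string values, so bools and numbers are stringified.
  final query = params.map((key, value) => MapEntry(key, value.toString()));
  return Uri.https('api.deepgram.com', '/v1/listen', query);
}

void main() {
  final uri = buildListenUri({
    'model': 'nova-2-general',
    'detect_language': true,
    'filler_words': false,
  });
  // Every entry of the map becomes one key=value pair in the query string.
  print(uri);
}
```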
Then you can call the methods you need under the proper listen or speak subclass:
// Speech to text
DeepgramListenResult sttResult = await deepgram.listen.file(File('audio.wav'));
// Text to speech
DeepgramSpeakResult ttsResult = await deepgram.speak.text('Hello world');
All STT methods return a DeepgramListenResult object with the following properties:
class DeepgramListenResult {
final String json; // raw json response
final Map<String, dynamic> map; // parsed json response into a map
final String? transcript; // the transcript extracted from the response
final String? type; // the response type (Result, Metadata, ...) non-null for streaming
}
All TTS methods return a DeepgramSpeakResult object with the following properties:
class DeepgramSpeakResult {
final Uint8List? data; // raw audio data
final Map<String, dynamic>? metadata; // the headers, or metadata if streaming
}
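If you need more than the transcript, the parsed map gives access to the whole response. The sketch below assumes the usual shape of a Deepgram pre-recorded response (`results.channels[].alternatives[]`). The `firstTranscript` helper and the sample JSON are made up for illustration and are not how the package itself extracts `transcript`.

```dart
import 'dart:convert';

// Hypothetical helper: reads the first alternative's transcript from a map
// shaped like a pre-recorded response (results.channels[].alternatives[]).
// Returns null when the expected keys are missing.
String? firstTranscript(Map<String, dynamic> map) {
  final channels = map['results']?['channels'];
  if (channels is! List || channels.isEmpty) return null;
  final alternatives = channels.first['alternatives'];
  if (alternatives is! List || alternatives.isEmpty) return null;
  return alternatives.first['transcript'] as String?;
}

void main() {
  // Made-up sample payload, trimmed to the fields used here.
  const json = '{"results":{"channels":[{"alternatives":'
      '[{"transcript":"hello world","confidence":0.98}]}]}}';
  final map = jsonDecode(json) as Map<String, dynamic>;
  print(firstTranscript(map)); // hello world
  print(firstTranscript({'metadata': {}})); // null
}
```

The same pattern works for other fields of an alternative, such as `confidence`.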
For live transcription you need an audio stream, let's say from a microphone:
// https://pub.dev/packages/record (other packages would work too)
Stream<List<int>> micStream = await AudioRecorder().startStream(RecordConfig(
encoder: AudioEncoder.pcm16bits,
sampleRate: 16000,
numChannels: 1,
));
final streamParams = {
'detect_language': false, // not supported by streaming API
'language': 'en',
// must specify encoding and sample_rate according to the audio stream
'encoding': 'linear16',
'sample_rate': 16000,
};
Deepgram deepgram = Deepgram(apiKey, baseQueryParams: streamParams);
You then have two options, depending on how much control you want over the stream:
// 1. you want the stream to manage itself automatically
Stream<DeepgramListenResult> stream = deepgram.listen.live(micStream);
// 2. you want to manage the stream manually
DeepgramLiveListener listener = deepgram.listen.liveListener(micStream);
listener.stream.listen((res) {
print(res.transcript);
});
// connect to the servers and start sending data
listener.start();
// you can pause and resume the transcription (stop sending audio data to the server)
listener.pause();
// ...
listener.resume();
// then close the stream when you're done, you can call start() again if you want to restart a transcription
listener.close();
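When no microphone is available (in a test, for instance), any `Stream<List<int>>` of raw audio can stand in for `micStream`. The sketch below is one way to build such a stream from bytes already in memory. The `chunked` helper is hypothetical. The default chunk size assumes 16 kHz, 16-bit mono audio (3200 bytes is 100 ms), and the sketch does not pace the chunks in real time.

```dart
import 'dart:typed_data';

// Hypothetical helper: emits `bytes` in fixed-size chunks, the way a live
// source would. 3200 bytes = 100 ms of 16 kHz, 16-bit mono PCM.
Stream<List<int>> chunked(Uint8List bytes, {int chunkSize = 3200}) async* {
  for (var offset = 0; offset < bytes.length; offset += chunkSize) {
    final end = (offset + chunkSize < bytes.length)
        ? offset + chunkSize
        : bytes.length;
    // sublistView avoids copying the underlying buffer.
    yield Uint8List.sublistView(bytes, offset, end);
  }
}

Future<void> main() async {
  final fakeAudio = Uint8List(8000); // 250 ms of silence
  final sizes = await chunked(fakeAudio).map((c) => c.length).toList();
  print(sizes); // [3200, 3200, 1600]
}
```

The resulting stream could then be passed wherever `micStream` is used above, as long as `encoding` and `sample_rate` match the audio.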
Deepgram deepgram = Deepgram(apiKey, baseQueryParams: {
'model': 'aura-asteria-en',
'encoding': 'linear16',
'sample_rate': 16000,
// options here: https://developers.deepgram.com/reference/text-to-speech-api
});
Again, you have two options:
final textStream = ...
// 1. you want the stream to manage itself automatically
Stream<DeepgramSpeakResult> stream = deepgram.speak.live(textStream);
// 2. you want to manage the stream manually
DeepgramLiveSpeaker speaker = deepgram.speak.liveSpeaker(textStream);
speaker.stream.listen((res) {
print(res);
// if you want to use the audio, the simplest way is Deepgram.toWav(res.data)
});
// start sending data to the servers
speaker.start();
// https://developers.deepgram.com/docs/tts-ws-flush
speaker.flush();
// https://developers.deepgram.com/docs/tts-ws-clear
speaker.clear();
// then close the stream when you're done, you can call start() again if you want to restart speech synthesis
speaker.close();
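With `linear16` encoding the audio arrives as raw PCM, which most players will not open without a container. The sketch below shows what wrapping raw 16-bit PCM in a standard 44-byte RIFF/WAVE header involves. It is an illustration of the technique, not the implementation of `Deepgram.toWav`, and the `pcm16ToWav` name and its defaults are assumptions chosen to match the parameters above.

```dart
import 'dart:typed_data';

// Sketch (hypothetical helper): prepend a 44-byte RIFF/WAVE header to raw
// 16-bit little-endian PCM so that regular audio players can read it.
Uint8List pcm16ToWav(Uint8List pcm,
    {int sampleRate = 16000, int channels = 1}) {
  const bitsPerSample = 16;
  final blockAlign = channels * bitsPerSample ~/ 8; // bytes per sample frame
  final byteRate = sampleRate * blockAlign; // bytes per second
  final header = ByteData(44);

  // Writes a 4-character ASCII tag at the given offset.
  void tag(int offset, String s) {
    for (var i = 0; i < s.length; i++) {
      header.setUint8(offset + i, s.codeUnitAt(i));
    }
  }

  tag(0, 'RIFF');
  header.setUint32(4, 36 + pcm.length, Endian.little); // remaining file size
  tag(8, 'WAVE');
  tag(12, 'fmt ');
  header.setUint32(16, 16, Endian.little); // fmt chunk size
  header.setUint16(20, 1, Endian.little); // audio format 1 = PCM
  header.setUint16(22, channels, Endian.little);
  header.setUint32(24, sampleRate, Endian.little);
  header.setUint32(28, byteRate, Endian.little);
  header.setUint16(32, blockAlign, Endian.little);
  header.setUint16(34, bitsPerSample, Endian.little);
  tag(36, 'data');
  header.setUint32(40, pcm.length, Endian.little); // PCM payload size

  return Uint8List.fromList([...header.buffer.asUint8List(), ...pcm]);
}

void main() {
  final wav = pcm16ToWav(Uint8List(32000)); // one second of silence
  print(wav.length); // 32044
}
```

The `sampleRate` passed here has to match the `sample_rate` requested from the API, otherwise playback will be too fast or too slow.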
For more detailed usage, check the /example tab.
Tested on Android and iOS, but should work on other platforms too.
- Make sure your API key is valid and has enough credits:
deepgram.isApiKeyValid()
- "Websocket was not promoted ...": you are probably using wrong parameters, for example trying to use a whisper model with live streaming (not supported by Deepgram).
- Empty transcript or only metadata: if streaming, check that you specified encoding and sample_rate and that they match the audio stream.
- Double-check the parameters you are using; some are not supported for streaming or for some models.
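The encoding and sample_rate check above is easy to automate before opening a live connection. The sketch below is a hypothetical pre-flight helper. The list of required keys comes from the troubleshooting note above, not from an official Deepgram list.

```dart
// Hypothetical pre-flight check: returns the streaming parameters that are
// missing from `params`. The required keys are an assumption based on the
// troubleshooting notes, not an official list.
List<String> missingStreamParams(Map<String, dynamic> params) {
  const required = ['encoding', 'sample_rate'];
  return [
    for (final key in required)
      if (params[key] == null) key,
  ];
}

void main() {
  final streamParams = {'language': 'en', 'encoding': 'linear16'};
  final missing = missingStreamParams(streamParams);
  if (missing.isNotEmpty) {
    print('Missing streaming params: $missing'); // Missing streaming params: [sample_rate]
  }
}
```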
I created this package for my own needs since there was no Dart SDK for Deepgram. Happy to share!
Don't hesitate to ask for new features or to contribute on GitHub!
If you'd like to support this project, consider contributing here. Thank you! :)