Start Riva Server

Before doing anything else, download and run the Riva server container from riva_quickstart_arm64 using riva_start.sh.

The server runs locally on your Jetson Xavier or Orin device and is supported on JetPack 5. You can disable NLP/NMT in its config.sh, in which case it will use around 5GB of memory for ASR+TTS. Once the server is up, it's recommended to test the system with the examples under /opt/riva/python-clients.
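As a rough sketch of the config.sh change mentioned above: the helper below flips a service flag to false in place. It assumes the quickstart's config.sh uses variables named `service_enabled_nlp` and `service_enabled_nmt` (check your copy, since names can differ between Riva releases); `disable_service` itself is a hypothetical helper and not part of the quickstart.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with riva_quickstart_arm64).
# Assumes config.sh contains lines such as "service_enabled_nlp=true";
# verify the variable names in your own config.sh before relying on this.
disable_service() {
    local config_file="$1"   # path to the quickstart's config.sh
    local service="$2"       # e.g. nlp or nmt
    # Rewrite the matching assignment in place (GNU sed, as found on JetPack).
    sed -i "s/^service_enabled_${service}=.*/service_enabled_${service}=false/" "$config_file"
}
```

For example, `disable_service config.sh nlp` followed by `disable_service config.sh nmt` would leave ASR and TTS untouched. Editing the file by hand works just as well; run riva_start.sh again afterwards so the change takes effect.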

You can also see this helpful video and guide from JetsonHacks for setting up Riva: Speech AI on Jetson Tutorial

List Audio Devices

This will print out a list of audio input/output devices that are connected to your system:

./run.sh --workdir /opt/riva/python-clients $(./autotag riva-client:python) \
   python3 scripts/list_audio_devices.py

You can refer to these devices in the steps below by either their device number or name. Depending on the sample rates a device supports, you may also need to set --sample-rate-hz to a valid frequency (e.g. 16000, 44100, or 48000).
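Since the same device number and sample rate recur in every command in this guide, one option is to build the flags in one place. The sketch below is a hypothetical convenience function (`riva_audio_flags` is not part of the Riva clients); it only prints the `--input-device`, `--output-device`, and `--sample-rate-hz` flags used here and assumes a numeric device index.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the audio flags used in this guide for one
# device index and sample rate. Intended for numeric device indices; a
# device name containing spaces would need extra quoting.
riva_audio_flags() {
    local direction="$1"      # input, output, or both
    local device="$2"         # index reported by list_audio_devices.py
    local rate="${3:-48000}"  # sample rate in Hz (defaults to 48000)
    case "$direction" in
        input)  printf -- '--input-device=%s ' "$device" ;;
        output) printf -- '--output-device=%s ' "$device" ;;
        both)   printf -- '--input-device=%s --output-device=%s ' "$device" "$device" ;;
        *)      echo "usage: riva_audio_flags input|output|both DEVICE [RATE]" >&2
                return 1 ;;
    esac
    printf -- '--sample-rate-hz=%s\n' "$rate"
}
```

With this in your shell, `$(riva_audio_flags input 24 48000)` expands to `--input-device=24 --sample-rate-hz=48000`, which can be appended to a client script invocation in place of the hand-typed flags.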

Streaming ASR

./run.sh --workdir /opt/riva/python-clients $(./autotag riva-client:python) \
   python3 scripts/asr/transcribe_mic.py --input-device=24 --sample-rate-hz=48000

You can find more ASR examples to run at https://github.com/nvidia-riva/python-clients#asr

Streaming TTS

./run.sh --workdir /opt/riva/python-clients $(./autotag riva-client:python) \
   python3 scripts/tts/talk.py --stream --output-device=24 --sample-rate-hz=48000 \
     --text "Hello, how are you today? My name is Riva." 

You can set the --voice argument to one of the available voices (the default is English-US.Female-1).

Also, you can customize the rate, pitch, and pronunciation of individual words/phrases by including inline SSML in your text.
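To illustrate inline SSML, here is a minimal sketch that wraps plain text in `<speak>` and `<prosody>` tags. The `ssml_prosody` function is hypothetical, and the attribute values shown (named rates such as `slow`, named pitches such as `high`) are assumptions based on common SSML usage; consult the Riva TTS documentation for the values your server version accepts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wrap text in SSML that adjusts speaking rate and pitch.
# The accepted rate/pitch values are an assumption; check the Riva SSML docs.
ssml_prosody() {
    local text="$1"
    local rate="${2:-default}"    # e.g. slow, fast
    local pitch="${3:-default}"   # e.g. low, high
    printf '<speak><prosody rate="%s" pitch="%s">%s</prosody></speak>\n' \
        "$rate" "$pitch" "$text"
}

# Example: build the string to pass to talk.py via --text.
TTS_TEXT="$(ssml_prosody "Hello, my name is Riva." slow high)"
echo "$TTS_TEXT"
```

The resulting string can be passed as the --text argument in the Streaming TTS command above, quoted as `--text "$TTS_TEXT"` so the shell keeps it as a single argument.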

Loopback

To feed the live ASR transcript into the TTS and have it speak your words back to you:

./run.sh --workdir /opt/riva/python-clients $(./autotag riva-client:python) \
   python3 scripts/loopback.py --input-device=24 --output-device=24 --sample-rate-hz=48000