- Node.js
- FFmpeg
- SoX (Sound eXchange)
- Google Cloud SDK (for authentication and accessing the STT and TTS APIs)
- Deepgram
- Puppeteer-extra
- Lots and lots of patience and determination
- Clone the repository.
- Navigate to the cloned directory.
- Install the dependencies.
- Set up environment variables in a `.env` file.
Run the script with the command `node mainBot.js`.
The script performs the following tasks:
- Launches a headless browser using Puppeteer and navigates to a meeting URL.
- Joins the meeting by clicking the appropriate button.
- Prompts for the user's name and clicks the join button.
- Records audio from the browser using SoX.
- Processes the recorded audio: converts it to mono, increases the volume, and saves it.
- Transcribes the processed audio using Deepgram and generates a reply with OpenAI GPT.
- Synthesizes a response using the Google Cloud Text-to-Speech API.
- Writes the synthesized response to an audio file.
- Converts the synthesized audio to a microphone stream and plays it in the browser.
- Closes the browser and ends the script.
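
The recording and processing steps in the list above can be sketched as two external commands driven from Node.js. This is a minimal sketch, not the code in mainBot.js: it assumes SoX records from the default input device for a fixed duration and that FFmpeg performs the mono conversion and volume boost. The helper names (`soxRecordArgs`, `ffmpegProcessArgs`, `run`, `recordAndProcess`), the 10-second duration, and the gain of 2.0 are illustrative assumptions.

```javascript
const { spawn } = require("node:child_process");

// Build SoX arguments (assumed invocation): "-d" reads from the default
// audio device, and "trim 0 <seconds>" stops recording after a fixed duration.
function soxRecordArgs(outputPath, seconds) {
  return ["-d", outputPath, "trim", "0", String(seconds)];
}

// Build FFmpeg arguments (assumed invocation): "-ac 1" downmixes to mono,
// the "volume" audio filter scales the level, and "-y" overwrites the output.
function ffmpegProcessArgs(inputPath, outputPath, gain) {
  return ["-y", "-i", inputPath, "-ac", "1", "-filter:a", `volume=${gain}`, outputPath];
}

// Run a command and resolve once it exits with code 0.
function run(command, args) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { stdio: "inherit" });
    child.on("error", reject);
    child.on("close", (code) =>
      code === 0 ? resolve() : reject(new Error(`${command} exited with code ${code}`))
    );
  });
}

// Hypothetical driver: record, then write a louder mono copy.
async function recordAndProcess(seconds = 10, gain = 2.0) {
  const recorded = "./audios/fromBrowser.wav";
  const processed = "./audios/output.wav";
  await run("sox", soxRecordArgs(recorded, seconds));
  await run("ffmpeg", ffmpegProcessArgs(recorded, processed, gain));
}
```

Calling `recordAndProcess()` requires both `sox` and `ffmpeg` on the PATH. Whether the default input device actually carries the browser's audio depends on how system audio is routed, which this sketch does not configure.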
Modify the script as needed and explore additional functionalities.
Create a `.env` file in the project directory and set the following environment variables:
- `GPT_KEY`: Your OpenAI GPT API key.
- `DEEPGRAM_KEY`: Your Deepgram API key.
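
A startup check can catch a missing key before the browser launches. The sketch below is an assumption about how one might validate the two variables. It does not load the `.env` file itself (a loader such as the `dotenv` package is one common choice, but the source does not say which is used), and `missingEnvVars` and `assertEnv` are hypothetical helper names.

```javascript
// Example .env contents (placeholder values, not real keys):
//   GPT_KEY=your-openai-api-key
//   DEEPGRAM_KEY=your-deepgram-api-key

// Variable names taken from the list above.
const REQUIRED_KEYS = ["GPT_KEY", "DEEPGRAM_KEY"];

// Return the names of required variables that are unset or blank.
function missingEnvVars(env, required = REQUIRED_KEYS) {
  return required.filter((name) => !env[name] || env[name].trim() === "");
}

// Throw a descriptive error if any required variable is missing.
function assertEnv(env = process.env) {
  const missing = missingEnvVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

Calling `assertEnv()` near the top of the script, after the `.env` file has been loaded, fails fast with the names of the missing variables.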
The `./audios` directory contains the following audio files:
- `fromBrowser.wav`: The recorded audio from the browser.
- `output.wav`: The processed audio (converted to mono and increased volume).
- `final.wav`: The synthesized response audio.
You can use these files for further analysis or customization.
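
As one example of such analysis, the sketch below reads the format chunk of a WAV file to report its channel count, sample rate, and bit depth, which is a quick way to confirm that the processed file is mono. It assumes the files are standard RIFF/WAVE containers. `readWavFormat` and `describeWavFile` are hypothetical helpers, not part of the project.

```javascript
const fs = require("node:fs");

// Locate the "fmt " chunk in a RIFF/WAVE buffer and decode its basic fields.
function readWavFormat(buffer) {
  if (
    buffer.length < 12 ||
    buffer.toString("ascii", 0, 4) !== "RIFF" ||
    buffer.toString("ascii", 8, 12) !== "WAVE"
  ) {
    throw new Error("Not a RIFF/WAVE file");
  }
  let offset = 12; // first chunk starts right after the 12-byte RIFF header
  while (offset + 8 <= buffer.length) {
    const id = buffer.toString("ascii", offset, offset + 4);
    const size = buffer.readUInt32LE(offset + 4);
    if (id === "fmt ") {
      const body = offset + 8;
      if (body + 16 > buffer.length) throw new Error("Truncated fmt chunk");
      return {
        channels: buffer.readUInt16LE(body + 2),
        sampleRate: buffer.readUInt32LE(body + 4),
        bitsPerSample: buffer.readUInt16LE(body + 14),
      };
    }
    offset += 8 + size + (size % 2); // chunks are padded to an even length
  }
  throw new Error("No fmt chunk found");
}

// Convenience wrapper that reads the whole file from disk.
function describeWavFile(path) {
  return readWavFormat(fs.readFileSync(path));
}
```

For instance, `describeWavFile("./audios/output.wav").channels` should be 1 if the mono conversion worked as described.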
Remember to respect the terms of service and privacy of the services used in this project.
For any questions or issues, please contact [email protected].