Put VAD back as the IVR platform needs it to stop audio when input is detected #62

Open
MayamaTakeshi opened this issue Apr 17, 2024 · 0 comments

@MayamaTakeshi (Owner)

VAD was removed here due to problems with node-vad:
e6c98b1

Before trying again with node-vad, let's try this one:
https://github.com/OzymandiasTheGreat/libfvad-wasm

I did some experiments and it looks good:

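import fs from 'fs'
import wav from 'wav'
import Speaker from 'speaker'
import VADBuilder, { VADMode } from "@ozymandiasthegreat/vad";

if (process.argv.length != 4) {
  console.log("Expected: file_path output_file")
  process.exit(1)
}
const file_path = process.argv[2]
const output_file = process.argv[3]

const reader = new wav.Reader()
const file = fs.createReadStream(file_path)

file.pipe(reader)

reader.on('format', format => {
  console.log('format', format)

  VADBuilder().then((VAD) => {
    console.log('VAD ready')
    const vad = new VAD(VADMode.VERY_AGGRESSIVE, format.sampleRate);

    // One 20 ms frame of 16-bit mono audio: 640 bytes at 16 kHz.
    // (libfvad only accepts 10, 20 or 30 ms frames.)
    const size = (format.sampleRate / 1000) * 20 * 2
    // Bytes carried over between 'data' events, since stream chunks
    // are not guaranteed to be a multiple of the frame size.
    var pending = Buffer.alloc(0)

    reader.on('data', data => {
      pending = Buffer.concat([pending, data])
      var i = 0
      while (pending.length - i >= size) {
        // Copy the frame into its own ArrayBuffer. A Node Buffer can be a view
        // into a larger shared ArrayBuffer, so slicing data.buffer from offset 0
        // may return bytes that do not belong to this chunk.
        var start = pending.byteOffset + i
        var samples = new Int16Array(pending.buffer.slice(start, start + size))
        var res = vad.processBuffer(samples)
        // One CSV line per sample; every sample in the frame gets the frame's result
        var output = Array.from(samples).map(sample => `${sample},${res}`);
        console.log(output)
        fs.appendFileSync(output_file, output.join('\n') + '\n');

        i += size
      }
      // Keep the incomplete tail (if any) for the next 'data' event
      pending = pending.subarray(i)
    })

    const speaker = new Speaker(format)
    reader.pipe(speaker)
  });
})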

The script above generates a CSV file with one line per audio sample in the form sample,vad_result, where every sample in a frame carries that frame's VAD result.
This can be visualized using matplotlib:

import numpy as np
import matplotlib.pyplot as plt

# Read the output file written by the script above (here assumed to be res.csv)
data = np.genfromtxt('res.csv', delimiter=',')

# Separate audio samples and VAD results
audio_samples = data[:, 0]
vad_results = data[:, 1]

# Plot audio signal
plt.figure(figsize=(10, 5))
plt.plot(audio_samples, label='Audio Signal')
plt.xlabel('Sample')
plt.ylabel('Amplitude')

# Overlay VAD results
vad_indices = np.where(vad_results == 1)[0]
plt.scatter(vad_indices, audio_samples[vad_indices], color='red', label='Voice Activity')

plt.legend()
plt.title('Audio Signal with VAD')
plt.show()
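
For the IVR use case in the title (stop audio when input is detected), the per-frame VAD result would presumably need some debouncing so that one noisy frame does not cut off the prompt. The following is only a minimal sketch of that idea: `createBargeInDetector`, `minVoiceFrames` and `onBargeIn` are hypothetical names, not part of the VAD package, and how playback is actually stopped is left to the callback.

```javascript
// Hypothetical helper (not part of @ozymandiasthegreat/vad): fires onBargeIn
// once, after minVoiceFrames consecutive frames were classified as voice.
function createBargeInDetector(minVoiceFrames, onBargeIn) {
  let consecutive = 0
  let fired = false
  return function onFrame(isVoice) {
    // A non-voice frame resets the run of voiced frames
    consecutive = isVoice ? consecutive + 1 : 0
    if (!fired && consecutive >= minVoiceFrames) {
      fired = true
      onBargeIn()
    }
    return fired
  }
}

// Usage with made-up frame results (1 = voice, 0 = no voice).
// In the experiment above, each value would be the result for one frame.
let stopped = false
const onFrame = createBargeInDetector(3, () => { stopped = true })
const frames = [0, 1, 0, 1, 1, 1, 0]
const firedAt = frames.findIndex(res => onFrame(res === 1))
console.log(stopped, firedAt) // prints: true 5
```

With 20 ms frames, a threshold of 3 frames means roughly 60 ms of continuous voice before the prompt is interrupted; the right value would have to be tuned against real calls.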
@MayamaTakeshi changed the title from "Put VAD back as the IVR platform needs this stop audio when input is detected" to "Put VAD back as the IVR platform needs it to stop audio when input is detected" on Apr 17, 2024