
invalid str2bool value #1

Open · ItsMe-TJ opened this issue Nov 14, 2022 · 13 comments


@ItsMe-TJ commented Nov 14, 2022

I'm getting "transcribe.py: error: argument --diarization: invalid str2bool value: 'true'".

How do I fix this?

Oh, and I have a question: how would I go about splitting the audio into individual files by speaker? Maybe that's a feature you could add?

Thanks!

@yinruiqing (Owner) commented Nov 15, 2022

> I'm getting "transcribe.py: error: argument --diarization: invalid str2bool value: 'true'".
>
> How do I fix this?

true -> True. I've fixed the README.
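For reference, the error comes from argparse rejecting the lowercase string; a case-insensitive str2bool along these lines would accept both spellings (a sketch, not necessarily how transcribe.py defines it):

import argparse

def str2bool(v):
    # Accept common spellings case-insensitively, so both "true" and
    # "True" parse; raise the usual argparse error otherwise.
    if isinstance(v, bool):
        return v
    if v.lower() in ("yes", "true", "t", "1"):
        return True
    if v.lower() in ("no", "false", "f", "0"):
        return False
    raise argparse.ArgumentTypeError(f"invalid str2bool value: {v!r}")

parser = argparse.ArgumentParser()
parser.add_argument("--diarization", type=str2bool, default=False)
print(parser.parse_args(["--diarization", "true"]))  # Namespace(diarization=True)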

> Oh, and I have a question: how would I go about splitting the audio into individual files by speaker? Maybe that's a feature you could add?

See the README.

> Thanks!

@ItsMe-TJ (Author) commented Nov 15, 2022

Thank you so much! It would be cool if you could add a function to remove overlapped speech. The reason I'm asking is that I'm making datasets for training text-to-speech models, using audio from podcasts and the like where two people are talking. I know pyannote has this function, but I'm honestly not savvy enough to implement it myself. Removing people talking over each other before diarization could help in making really clean datasets. Either way, thank you so much.

@yinruiqing (Owner)

> Thank you so much! It would be cool if you could add a function to remove overlapped speech.

I'll do it this weekend.

@ItsMe-TJ (Author)

> > Thank you so much! It would be cool if you could add a function to remove overlapped speech.
>
> I'll do it this weekend.

Thank you!

@yinruiqing (Owner) commented Nov 19, 2022

@ItsMe-TJ I think what you want is the following function:

from pyannote.core import Segment

def remove_overlap_part(ann):
    # `ann` is a pyannote.core.Annotation. `to_overlap` is assumed to be
    # available (a helper that returns an annotation of the
    # overlapped-speech regions); it is not defined in this snippet.
    overlap = to_overlap(ann).get_timeline()
    if len(overlap) == 0:
        return ann
    overlap_start = overlap[0].start
    overlap_end = overlap[-1].end
    ann_start = ann.get_timeline()[0].start
    ann_end = ann.get_timeline()[-1].end
    # Keep the gaps between overlapped regions, plus anything before the
    # first overlap and after the last one.
    non_overlap = overlap.gaps()
    if overlap_start > ann_start:
        non_overlap.add(Segment(ann_start, overlap_start))
    if ann_end > overlap_end:
        non_overlap.add(Segment(overlap_end, ann_end))
    return ann.crop(non_overlap)
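To try it end to end, here is a minimal sketch; the to_overlap below is a naive stand-in I wrote for illustration (any region covered by two or more tracks counts as overlap), so the real helper may differ:

from itertools import combinations
from pyannote.core import Annotation, Segment, Timeline

def to_overlap(ann):
    # Naive stand-in: collect every intersection between two tracks,
    # then merge the pieces into one timeline of overlapped regions.
    overlap = Timeline(uri=ann.uri)
    for (s1, _), (s2, _) in combinations(ann.itertracks(), 2):
        inter = s1 & s2
        if inter:
            overlap.add(inter)
    return overlap.support().to_annotation()

# Hypothetical two-speaker annotation overlapping on [2.0, 3.0].
ann = Annotation()
ann[Segment(0.0, 3.0)] = "SPEAKER_00"
ann[Segment(2.0, 5.0)] = "SPEAKER_01"

clean = remove_overlap_part(ann)
for segment, _, speaker in clean.itertracks(yield_label=True):
    print(segment, speaker)  # SPEAKER_00 keeps [0, 2], SPEAKER_01 keeps [3, 5]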

By the way, I also have an open-source TTS project, deepaudio-tts. I'm looking for someone to work on it with me. Are you interested?

@ItsMe-TJ (Author)

I'm always interested in TTS stuff, so absolutely! Though when you say "work together with me", I don't know how to code or anything, lol, but I'm happy to help in any way I can!

@ItsMe-TJ (Author)

Okay, so I have an idea, and since you're doing the whole TTS thing, you might benefit from it as well!

Let's say I have a podcast episode where two people are talking. I want to specify the number of speakers, remove the overlapped speech, and output the non-overlapped speech to a new audio file. Then take that file, run diarization, split the audio by speaker, and output the resulting files into folders: Speaker 1, Speaker 2, etc. In my mind, this would make clean datasets for training TTS.

So that's my idea; hopefully you understand now why I asked about removing overlapped speech and splitting the audio.

Idk if you're interested in taking a crack at it! As I said, I don't code, though I really, REALLY wish I knew how; I've tried learning many times, but I think I'm just too dumb, lol.

@yinruiqing (Owner)

Good idea. I'll implement it soon and add it to deepaudio-tts.

@ItsMe-TJ (Author)

> Good idea. I'll implement it soon and add it to deepaudio-tts.

Great! It will be really helpful, and I can't wait to try it!

@ItsMe-TJ (Author) commented Dec 1, 2022

How's it going? I know it's probably not easy; just curious about the progress!

@yinruiqing (Owner)

> How's it going? I know it's probably not easy; just curious about the progress!

I should finish within a week. I need someone with frontend skills to help me with the interaction.

@yinruiqing (Owner) commented Dec 3, 2022

@ItsMe-TJ You can use the following code.

import numpy as np
import whisper
from scipy.io.wavfile import write
from pyannote.audio import Pipeline, Audio

def save_wave_from_numpy(data, f, rate=16000):
    # Peak-normalize the float waveform and convert to 16-bit PCM.
    scaled = np.int16(data / np.max(np.abs(data)) * 32767)
    write(f, rate, scaled)

pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization",
                                    use_auth_token="your/token")
model = whisper.load_model("tiny.en")

audio_file = "data/afjiv.wav"
diarization_result = pipeline(audio_file)
# Drop overlapped speech first (remove_overlap_part is defined above).
result_without_overlap = remove_overlap_part(diarization_result)

audio = Audio(sample_rate=16000, mono=True)
for segment, _, speaker in result_without_overlap.itertracks(yield_label=True):
    waveform, sample_rate = audio.crop(audio_file, segment)
    filename = f"{segment.start:.2f}s_{segment.end:.2f}s_{speaker}.wav"
    save_wave_from_numpy(waveform.squeeze().numpy(), filename)
    text = model.transcribe(waveform.squeeze().numpy())["text"]
    print(f"{segment.start:.2f}s {segment.end:.2f}s {speaker}: {text}")

@yinruiqing (Owner)

@ItsMe-TJ I am working on audio-annotation. It will provide an easy way to export audio segments for a single speaker.
