spec-to-wav synthesis produces some errors from existing mels #513

Open
roedoejet opened this issue Jul 18, 2024 · 2 comments
@roedoejet
Member

For some reason I had to change write(f"{data_path}.wav", sr, wav) to write(f"{data_path}.wav", sr, wav[0]). I should investigate this in hfgl/cli.py.
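
One possible explanation, not confirmed in this thread: synthesize_data returns wavs.squeeze(1), which drops the channel dimension but keeps the batch dimension, so wav would have shape (batch, samples). If write here is scipy.io.wavfile.write (an assumption), a 2-D array is interpreted as (Nsamples, Nchannels), so a (1, N) array would not be written as a mono clip of N samples. Indexing with wav[0] selects the first item in the batch. A minimal sketch of that shape handling, using a hypothetical helper first_waveform that is not part of EveryVoice:

```python
import numpy as np


def first_waveform(wavs: np.ndarray) -> np.ndarray:
    """Return a 1-D waveform from a (samples,) or (batch, samples) array.

    Hypothetical helper for illustration only; not part of EveryVoice.
    """
    if wavs.ndim == 1:
        return wavs
    if wavs.ndim == 2:
        # Keep only the first item of the batch.
        return wavs[0]
    raise ValueError(f"expected 1-D or 2-D audio, got shape {wavs.shape}")


# Stand-in for a batched vocoder output: one utterance of 22050 samples.
batched = np.zeros((1, 22050), dtype=np.float32)
mono = first_waveform(batched)
print(batched.shape, "->", mono.shape)
```

With a helper like this, the call site would pass a 1-D array to write whether or not the vocoder output carries a batch dimension.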

@roedoejet roedoejet added the bug Something isn't working label Jul 18, 2024
@roedoejet roedoejet self-assigned this Jul 18, 2024
@roedoejet
Member Author

related to #507

@roedoejet roedoejet added this to the beta milestone Sep 9, 2024
roedoejet added a commit that referenced this issue Oct 30, 2024
roedoejet added a commit that referenced this issue Oct 30, 2024
roedoejet added a commit that referenced this issue Oct 31, 2024
@SamuelLarkin
Collaborator

I also stumbled on this using

everyvoice synthesize from-spec \
  --input preprocessed/spec/LJ033-0048--speaker_0--eng--spec-22050-mel-librosa.pt \
  --model logs_and_checkpoints/VocoderExperiment/base/checkpoints/last.ckpt
╭───────────────────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────────────────╮
│ /fs/hestia_Hnrc/ict/sam037/git/EveryVoice/everyvoice/model/vocoder/HiFiGAN_iSTFT_lightning/hfgl/cli.py:154 in synthesize                     │
│                                                                                                                                              │
│   151 │   except (TypeError, ValidationError) as e:                                                                                          │
│   152 │   │   logger.error(f"Unable to load {generator_path}: {e}")                                                                          │
│   153 │   │   sys.exit(1)                                                                                                                    │
│ ❱ 154 │   wav, sr = synthesize_data(data, vocoder_model, vocoder_config)                                                                     │
│   155 │   logger.info(f"Writing file {data_path}.wav")                                                                                       │
│   156 │   write(f"{data_path}.wav", sr, wav[0])                                                                                              │
│   157                                                                                                                                        │
│                                                                                                                                              │
│ /fs/hestia_Hnrc/ict/sam037/git/EveryVoice/everyvoice/model/vocoder/HiFiGAN_iSTFT_lightning/hfgl/utils.py:85 in synthesize_data               │
│                                                                                                                                              │
│    82 │   │   wavs = inverse_spectral_transform(mag * torch.exp(phase * 1j)).unsqueeze(-2)                                                   │
│    83 │   else:                                                                                                                              │
│    84 │   │   with torch.no_grad():                                                                                                          │
│ ❱  85 │   │   │   wavs = model.generator(data.transpose(1, 2))                                                                               │
│    86 │   # squeeze to remove the channel dimension                                                                                          │
│    87 │   return (                                                                                                                           │
│    88 │   │   wavs.squeeze(1).cpu().numpy(),                                                                                                 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 2)
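
The message is consistent with data being a 2-D tensor at utils.py:85, since transpose(1, 2) needs at least three dimensions. Below is a minimal sketch that reproduces the error, assuming the saved spec has no batch dimension. The (frames, n_mels) layout and the sizes are made up for illustration, and whether adding a batch dimension is the right fix in hfgl/utils.py is not confirmed here.

```python
import torch

# Stand-in for a mel spectrogram loaded without a batch dimension.
# The (frames, n_mels) layout and the sizes are assumptions.
spec = torch.zeros(100, 80)

try:
    spec.transpose(1, 2)  # a 2-D tensor only has dims 0 and 1
except IndexError as err:
    message = str(err)
    print(message)

# Adding a leading batch dimension makes the same call valid.
batched = spec.unsqueeze(0)        # (1, frames, n_mels)
swapped = batched.transpose(1, 2)  # (1, n_mels, frames)
print(tuple(swapped.shape))
```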
