Clarifications on annotation #211

MikeMpapa · 2024-10-14T22:53:38Z

Hi there,
I am working on building a new dataset in Spanish (a polysyllabic language). I have gone through MakeDiffSinger but I still have some gaps. I would be grateful if you could sanity-check my understanding and share any thoughts you might have.

Questions for clarification:

ph_seq: Are these sequences of phonemes or of syllables?
Currently I am using phonemes and their timestamps as produced by MFA, with a pre-trained Spanish model that MFA provides. Would you recommend training a new one on my specific data?
note_dur: Should the MIDI notes be estimated over phonemes, syllables, or words?
For now I have estimated one note per phoneme and assumed ph_dur == note_dur.
ph_num: Is this the number of phonemes in each word or in each syllable?
For now I have assumed the number of phonemes in each word.
note_seq: Do you think SOME would suffice for a first pass at this? My guess is yes.
is_slur: How would you define a slur in this context? I have not found many resources on this topic.
For now I have assumed no slurs at all.
SPs and APs: Would you recommend annotating these manually, or would the enhance script be OK for a first pass?
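
To make my current assumptions concrete, here is a minimal Python sketch of how I am building one row. It is only an illustration of the choices listed above (one note per phoneme, note_dur copied from ph_dur, ph_num counted per word, no slurs), not MakeDiffSinger's actual code or the recommended convention. The `Word` class, `build_row`, `check_row`, and the example phonemes, durations, and note names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Word:
    """One aligned word (hypothetical container, not an MFA or DiffSinger type)."""
    phonemes: list[str]     # phoneme labels from the aligner
    durations: list[float]  # duration of each phoneme in seconds
    notes: list[str]        # assumed: one estimated note name per phoneme


def build_row(words: list[Word]) -> dict[str, list]:
    """Flatten aligned words into the fields discussed in this issue."""
    ph_seq = [p for w in words for p in w.phonemes]
    ph_dur = [d for w in words for d in w.durations]
    return {
        "ph_seq": ph_seq,
        "ph_dur": ph_dur,
        "ph_num": [len(w.phonemes) for w in words],  # assumed: phonemes per word
        "note_seq": [n for w in words for n in w.notes],
        "note_dur": list(ph_dur),                    # assumed: ph_dur == note_dur
        "is_slur": [0] * len(ph_seq),                # assumed: no slurs at all
    }


def check_row(row: dict[str, list]) -> None:
    """Sanity checks that follow directly from the assumptions above."""
    assert len(row["ph_seq"]) == len(row["ph_dur"])
    assert sum(row["ph_num"]) == len(row["ph_seq"])
    assert len(row["note_seq"]) == len(row["note_dur"]) == len(row["is_slur"])
    assert abs(sum(row["ph_dur"]) - sum(row["note_dur"])) < 1e-6


# Made-up example data: two words with invented timings and notes.
words = [
    Word(["k", "a", "n", "t", "a"], [0.08, 0.30, 0.07, 0.06, 0.35],
         ["C4", "C4", "C4", "D4", "D4"]),
    Word(["s", "o", "l"], [0.10, 0.40, 0.12], ["E4", "E4", "E4"]),
]
row = build_row(words)
check_row(row)
```

If any of these assumptions is wrong (for example, if ph_num should group phonemes differently, or notes should span more than one phoneme), only the corresponding line in `build_row` would need to change.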

Thanks!

MikeMpapa changed the title ~~Clarification on annotation~~ Clarifications on annotation Oct 14, 2024
