
Improving the representation inverter by exploring more advanced models and/or reaching out to some of the authors #2

Open
bs opened this issue Oct 23, 2020 · 0 comments
bs commented Oct 23, 2020

It seems very strange to me that the methods work very well for conventional STFT spectrograms but not so well for other representations. One reason representations might matter for bioacoustic applications is the high sampling rates required: many dolphin recordings, for instance, are sampled at 96 kHz. While it might be possible to downsample somewhat and still avoid aliasing, this would still leave an audio input with ~300k elements. However, we could also try slicing audio inputs into more manageable frame lengths and then concatenating the results during inference. This is, I suppose, related to the “Variable Time Scales in Vocal Behavior” problem to some degree.
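The slice-and-concatenate idea could be sketched roughly as below. This is only an illustration, not code from this repo: `invert_fn` stands in for whatever representation-inversion model we end up using, and the frame length is an arbitrary placeholder.

```python
import numpy as np

def invert_in_frames(audio, invert_fn, frame_len=32768):
    """Slice a long input into fixed-length frames, run the (expensive)
    inversion model on each frame independently, and concatenate the
    per-frame outputs along the time axis."""
    out = [invert_fn(audio[s:s + frame_len])
           for s in range(0, len(audio), frame_len)]
    return np.concatenate(out)

# ~3 s of audio at 96 kHz is on the order of 300k samples, which is
# what makes per-frame processing attractive in the first place.
x = np.random.randn(300_000).astype(np.float32)
y = invert_in_frames(x, lambda f: f)  # identity stands in for a real model
assert y.shape == x.shape
```

Hard frame boundaries like this can leave audible seams, so in practice we would probably want overlapping frames with a crossfade (overlap-add) rather than plain concatenation.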

@bs bs added the Epic label Oct 23, 2020