
[SUGGESTION] NN & Video synch #24

Open
MarcoRavich opened this issue Sep 16, 2021 · 10 comments

@MarcoRavich

Hi there, audalign is very cool!

We suggest taking some interesting fingerprinting projects into consideration to evolve it even further:

Please also check out AudioAlign, also by @protyposis: a tool written for research purposes to automatically synchronize audio and video recordings that have either been recorded in parallel at the same event or contain the same aural information. It has a very cool, advanced GUI.

Hope that inspires!

@Johndirr

I can only second AudioAlign; the tool's Alignment Graph is such a cool feature.

@benfmiller
Owner

Thanks for the heads up!
I will definitely plan on incorporating this with AudioAlign. That does seem nifty!

That looks like a useful technique! Hopefully, NN would be able to handle non-tonal audio better, too.

@MarcoRavich
Author

Hi there, we're glad our suggestions inspired you!

Anyway, @protyposis's AudioAlign is based on his own .NET Aurio library, so we believe that, first of all, it would be great to "put together" the (best?) audio fingerprinting techniques in a platform-independent library...

...we also decided to collect (open) audio fingerprinting resources on a dedicated page under our HyMPS project: https://github.com/forart/HyMPS/blob/main/Fingerprinting.md

@benfmiller
Owner

What do you mean by platform-independent library? Like separate the techniques into a separate python project? Rewrite them in a different language? The non-fingerprinting techniques can be highly effective as well.

That seems like a neat project!

@MarcoRavich
Author

MarcoRavich commented Sep 18, 2021

Well, AFAIK, accurate fingerprinting seems essential to get a precise alignment (even if your software uses a different approach).

The Aurio library, for example, implements various audio fingerprinting methods that allow AudioAlign to be more accurate.
BTW, since it has a .NET backend, it is NOT platform-independent, so it's basically not usable on every OS.

Looking at the main open source AV libraries, they are mainly developed in C++ precisely to make them independent of (and so usable on) any platform.

However, we understand that asking to "put together" a C++ audio fingerprinting library "from scratch" is perhaps a bit too ambitious, even if there are some ready-made libs such as:

We're trying to stimulate a collaboration between those lib projects, but a multiplatform "GUI" counterpart (the same as AudioAlign or your audalign) is needed to exploit its potential.

Hope that inspires.

@benfmiller
Owner

The biggest drawback to fingerprinting is that it is based on finding peaks in the spectrogram, so non-tonal sound events (footsteps, cars, crashes, doors closing) don't show up well. The fingerprints of those events are fairly random. Various correlation techniques can help a lot with those. Plus, fingerprints are only as accurate time-wise as the FFT window, which is usually around 0.04 seconds, compared to correlation's 0.001 seconds. I've found that a combination of correlation and fingerprints usually works best with those types of audio events.
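
For illustration, here is a minimal pure-Python sketch of the correlation idea described above. It is not audalign's implementation: `best_lag`, the 1000 Hz sample rate, and the 40-sample window are assumptions, chosen only so that the numbers line up with the 0.001 s and 0.04 s figures mentioned here. The test signal is random noise, standing in for a non-tonal event.

```python
import random


def best_lag(reference, target, max_lag):
    """Return the lag (in samples) where `target` best matches `reference`.

    A positive lag means the shared content appears `lag` samples later
    in `target` than in `reference`. Brute-force cross-correlation.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, value in enumerate(reference):
            j = i + lag
            if 0 <= j < len(target):
                score += value * target[j]
        if score > best_score:
            best, best_score = lag, score
    return best


# Hypothetical test data: broadband noise, delayed by a known amount.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(400)]
true_delay = 37
target = [0.0] * true_delay + reference

SAMPLE_RATE = 1000  # assumed: one sample per 0.001 s
FFT_WINDOW = 40     # assumed: 40 samples, i.e. a 0.04 s window

lag = best_lag(reference, target, max_lag=100)

# Correlation resolves the offset to a single sample.
correlation_offset = lag / SAMPLE_RATE

# An estimate that can only land on window boundaries gets quantized.
window_offset = round(lag / FFT_WINDOW) * FFT_WINDOW / SAMPLE_RATE

print(f"lag in samples:        {lag}")
print(f"correlation offset:    {correlation_offset} s")
print(f"window-grid offset:    {window_offset} s")
```

With these assumed numbers, correlation recovers the 37-sample delay exactly, while an estimate restricted to the window grid can only say "about one window". A real implementation would normally use an FFT-based correlation rather than this quadratic loop.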

I've considered a Rust port of audalign, if that would be of interest?

I've also been looking into Flutter, which is very cross-platform. Would you have any interest in a Flutter application that could call those C++ libraries? Kind of in the style of AudioAlign?

@MarcoRavich
Author

> The biggest drawback to fingerprinting is that it is based on finding peaks in the spectrogram, so non-tonal sound events (footsteps, cars, crashes, doors closing) don't show up well. The fingerprints of those events are fairly random. Various correlation techniques can help a lot with those. Plus, fingerprints are only as accurate time-wise as the FFT window, which is usually around 0.04 seconds, compared to correlation's 0.001 seconds. I've found that a combination of correlation and fingerprints usually works best with those types of audio events.

Well, multiple kinds of alignment approaches (not only strictly audio fingerprinting) can achieve optimal results, of course.
A lib, something similar to Aurio, that collects them (all?) would be great.

> I've considered a Rust port of audalign, if that would be of interest?

It could be very interesting to let third-party software (DAWs, NLEs, etc.) exploit it, so yes, it would be cool.

> I've also been looking into Flutter, which is very cross-platform. Would you have any interest in a Flutter application that could call those C++ libraries? Kind of in the style of AudioAlign?

As said, alignment algorithms and GUIs to exploit them should be independent of each other ideally.

This approach could also allow devs to collaborate, and evolve, better, in our opinion.
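
To make the separation concrete, here is a small sketch of what "algorithms independent of the front end" could look like. All names (`Aligner`, `PeakAligner`, `report`) are hypothetical and belong to no existing library; the toy aligner is deliberately naive and only stands in for a real fingerprinting or correlation method.

```python
from typing import Protocol, Sequence


class Aligner(Protocol):
    """Hypothetical interface: any alignment algorithm returns a lag in samples."""

    def offset(self, reference: Sequence[float], target: Sequence[float]) -> int:
        ...


class PeakAligner:
    """Toy algorithm: lines up the loudest sample of each signal."""

    def offset(self, reference: Sequence[float], target: Sequence[float]) -> int:
        ref_peak = max(range(len(reference)), key=lambda i: abs(reference[i]))
        tgt_peak = max(range(len(target)), key=lambda i: abs(target[i]))
        return tgt_peak - ref_peak


def report(aligner: Aligner, reference, target, sample_rate: int) -> str:
    """Stand-in for a GUI or CLI: it only knows the Aligner interface."""
    lag = aligner.offset(reference, target)
    return f"target is {lag / sample_rate:.3f} s behind the reference"


print(report(PeakAligner(), [0, 0, 1, 0], [0, 0, 0, 0, 1, 0], sample_rate=2))
```

Because `report` depends only on the interface, a different algorithm (or a binding to a C++ or Rust library) could be swapped in without touching the front end.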

Thanks for your interest in this discussion!

@MarcoRavich
Author

News: it seems that @lutzray's (hardware) sync solution is working...
[asciicast recording]

@MarcoRavich
Author

MarcoRavich commented Dec 10, 2023

Bump.

After 5 years, @protyposis released new versions of both AudioAlign and Aurio (with prebuilt Windows binaries):

https://github.com/protyposis/AudioAlign/releases
https://github.com/protyposis/Aurio/releases

Enjoy!

@MarcoRavich
Author

Bump 2024.

There are many interesting improvements in the recent Aurio and AudioAlign releases.

Tests by some end users (like us) have found malfunctions, omissions and possible improvements to fix/add, but it would be even more interesting to also have feedback/opinions from "alignment experts" like you:

Thanks in advance!
