Factor out amplitude in the Correlation (via phase) filter reporting. #10
Comments
Didn't this happen in 4bc596d? |
No. That solved the problem of DC offset in the signal skewing the phase filter data. This issue is that amplitude offsets between channels skew the phase filter data. |
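To make the distinction concrete, here is a minimal Python sketch of why a level offset alone can pull a phase-style reading below 1 while a normalized cross-correlation stays at 1. It assumes a per-sample phase measure of the form 2·l·r / (l² + r²); that formula is an assumption for illustration and is not confirmed anywhere in this thread as what ffmpeg's phase filter computes. The 1 kHz tone, the 48 kHz rate, and the offsets of 0.6 dB and 5 dB are example values only.

```python
import math

def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (db / 20)

def phase_style_mean(left, right):
    """Mean of 2*l*r / (l^2 + r^2) per sample (assumed form, illustration only)."""
    values = []
    for l, r in zip(left, right):
        energy = l * l + r * r
        values.append(2 * l * r / energy if energy else 1.0)
    return sum(values) / len(values)

def normalized_correlation(left, right):
    """Zero-lag cross-correlation divided by the geometric mean of the energies."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den if den else 0.0

rate = 48000  # example sample rate
left = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(rate // 10)]

results = {}
for offset_db in (0.6, 5.0):
    # Same program in both channels; the right channel is only quieter.
    right = [s * db_to_gain(-offset_db) for s in left]
    results[offset_db] = (
        phase_style_mean(left, right),
        normalized_correlation(left, right),
    )
    print(offset_db, results[offset_db])
```

Under that assumed formula the phase-style value depends on the gain ratio between the channels, whereas the normalized measure cancels the gain in numerator and denominator.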
In the folder of test files (linked above) there are two PNGs of two 2-channel mono recordings that illustrate the level/correlation issue. They are the same program, but one version was made with a channel offset of 0.6 dB and the other has a channel offset of 5 dB: |
Here is an additional file which may illustrate the issue better: https://drive.google.com/drive/folders/1j0sdy_byuBzaHmrtsrKV7XdUZkm1TY0k?usp=sharing |
Hi @Soundmatters, I've tested this a few ways by having loudnorm and ebur128 put a rolling normalization before the phasemeter test, but the results are messy compared to the axcorrelation graph which we had removed. The sample you shared seems well correlated but has amplitude differences, so I took one channel and offset some of the samples for a few minutes so I could force a loss of correlation.

Here's the graph just before axcorrelate was removed in eaf656a. This is on your sample with a section of audio in a single channel offset to force a correlation issue. Perhaps this was due to the work of the preceding commits, but I see the axcorrelation graph in this commit was problematic. It shows the correlation but the x-axis is halved. In my work in progress it looks like:

So here the axcorrelation aligns well with the phase graph. They both show the issue, but the phase graph factors in amplitude whereas the axcorrelation one doesn't. This all has me trying to remember why the axcorrelation graph was dropped, as I haven't found anything better than it for plotting phase correlation without factoring in amplitude. Beyond being more accurate, it is also much faster than adding a rolling normalization step before the phasemeter analysis. |
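The test described above (one channel quieter throughout, plus a stretch where that channel is shifted in time) can be imitated on synthetic data. This is only a sketch: it uses seeded noise instead of the shared sample, a plain windowed normalized cross-correlation rather than ffmpeg's axcorrelate, and the window size, delay, and gain are hypothetical values chosen for the example.

```python
import math
import random

def normalized_correlation(left, right):
    """Zero-lag normalized cross-correlation of two equal-length lists."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den if den else 0.0

def windowed_correlation(left, right, size):
    """One correlation value per consecutive, non-overlapping window."""
    return [
        normalized_correlation(left[s:s + size], right[s:s + size])
        for s in range(0, len(left) - size + 1, size)
    ]

rng = random.Random(0)   # seeded so the run is repeatable
size = 1024              # hypothetical window size
left = [rng.gauss(0.0, 0.25) for _ in range(size * 6)]
right = [0.5 * s for s in left]  # same program, about 6 dB quieter

# Force a loss of correlation in windows 2 and 3 by delaying the right channel.
delay = 37               # hypothetical delay in samples
for i in range(2 * size, 4 * size):
    right[i] = 0.5 * left[i - delay]

curve = windowed_correlation(left, right, size)
for index, value in enumerate(curve):
    print(index, round(value, 3))
```

The quieter channel does not lower the curve where the channels match; only the delayed stretch drops toward 0.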
Hi @Soundmatters, I think you replied via email rather than at #10, so the image attachments didn't come through. |
So the way I remember it, we dropped axcorrelation in the reporting because
And here is another example of very different reporting. The aphasemeter |
Please note that the first image that I posted yesterday was not correct; it has been replaced with the correct image. |
Hey @Soundmatters, with my latest branch here is a graph of the output, including a revised contextualization of axcorrelate so you can compare the before and after.
In the new one there are no values over 1. From a skim, I see 0.999989 but no 1's.
I reran this process with a In the new version the aphasemeter values and axcorrelate values are roughly similar. Let me know what you think, if it seems okay, I can merge into a new release for your testing. |
Fantastic! If you don’t think that it would slow down astataudit’s processing time too much, could we leave both the “Correlation” graphic and this updated “Normalized Cross Correlation” in for the time being? It would give us the chance to compare their analysis over large sets of data. And if you could add the same color scale to the “Normalized Cross Correlation” graphic, that might help with the comparison, though not at all essential if you are out of time. Thanks. |
Hi @Soundmatters, here's the 2 minute sample with the color patterns matched. But you're right, there is a notable speed difference. I ran this on I should note that between the last release and the current draft, the axcorrelation isn't the only addition but there's the spectrum as well. |
The default/slow setting certainly seems more accurate. I’m not sure how helpful this would be, but if dropping some of the other filter graphics would lighten the processing load, I’d say: 1) keep axcorrelate in the default/slow mode, 2) drop zero crossings for now, 3) drop the spectral analysis for now. The correlation reporting is so important and useful that its accuracy is a primary concern for us. Maybe, down the road, zero crossings and spectral analysis could be added as options. |
Hey @Soundmatters, this is a bit of an investment for future work, but I refactored the way the graph is constructed to separate each analyzer (they were all tangled together before). This should make it a lot easier for me to scale it to add future analyzers. With this I can turn them on and off and benchmark. So if I only use one at a time:
- astats (without reset), which is for dcoffset
- astats (with a reset every frame)
- aphasemeter
- all (with fast axcorrelation)
- all (with slow axcorrelation)

These were just quick single-run tests, but obviously something was off, as the solo run of slow axcorrelation was slower than the run with all the others. |
I'm wondering about the last graphic in the previous post, "all (with slow axcorrelation) 9:07". Is that correct? The Normalized Cross Correlation data looks inaccurate (see the uncorrelated Dolby Tone in the first 30 seconds displaying as almost +1; other data also looks like it is displayed at half its value). |
I'm having trouble understanding the comment. You suggest the last graphic is incorrect, but is there one here that is correct? |
The Cross Correlation analysis in the two-minute examples that you posted seems accurate. I’m looking at the first 30 seconds of Dolby tone (which should be uncorrelated) followed by 30 seconds of a 1 kHz sine wave (which should be correlated). In the latest graphic that you posted, "all (with slow axcorrelation) 9:07", that Dolby tone looks almost perfectly correlated in the Cross Correlation graphic; we’d expect it to be almost 0, not almost +1. |
Note that I added a 'best' algorithm for the axcorrelate filter. It should be more correct than the 'fast' algorithm at a similar speed, but it may give some wrong results compared to the 'slow' one, especially when used with the float sample format. In any case, you should always use the double floating point sample format with this filter, because of the limited precision of floats, by placing aformat=dblp before axcorrelate. In that case it will almost always be more correct for 16-32 bit inputs, even with bigger window sizes. |
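The precision point can be illustrated outside ffmpeg. The sketch below is not axcorrelate's code; it only shows how an energy sum accumulated in 32-bit floats drifts away from the same sum in doubles. Single precision is emulated with `struct`, and the sample count and noise signal are arbitrary choices for the example.

```python
import math
import random
import struct

def to_float32(x):
    """Round a Python float (a double) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

rng = random.Random(1)  # seeded so the run is repeatable
samples = [rng.uniform(-1.0, 1.0) for _ in range(20000)]

sum32 = 0.0  # emulated single-precision accumulator
sum64 = 0.0  # ordinary double-precision accumulator
for s in samples:
    sum32 = to_float32(sum32 + to_float32(s * s))
    sum64 += s * s

# math.fsum gives a correctly rounded reference for the double-precision terms.
exact = math.fsum(s * s for s in samples)
err32 = abs(sum32 - exact)
err64 = abs(sum64 - exact)
print("float32 error:", err32)
print("float64 error:", err64)
```

Longer windows mean more terms per sum, which is one reason a wider sample format matters more as the window size grows.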
The speed has improved a great deal, but I wasn't able to get any improved aphasemeter or axcorrelate filter reporting with the recent development build. I don’t have the skills to do this but, if someone is willing to experiment, it might be worth pushing axcorrelate’s “size” parameter to see if we can get more accurate reporting that way. The range it allows is 2 to 131072. If I’m reading the script correctly, the filter size is set at 1024 now. I think that it would be worth experimenting with the upper end of the range (like 32768 or larger). If we can improve the accuracy that way, it might be worth sacrificing some of astataudit’s improved processing speed. |
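As a rough guide to what a larger window could change, the sketch below compares two window sizes on channels that are uncorrelated by construction. It is a statistical illustration in plain Python, not a measurement of axcorrelate: the sizes 1024 and 32768 are taken from the comment above, while the noise signals and total length are assumptions made for the example.

```python
import math
import random

def normalized_correlation(left, right):
    """Zero-lag normalized cross-correlation of two equal-length lists."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den if den else 0.0

def spread(left, right, size):
    """Largest |correlation| over consecutive, non-overlapping windows."""
    return max(
        abs(normalized_correlation(left[s:s + size], right[s:s + size]))
        for s in range(0, len(left) - size + 1, size)
    )

rng = random.Random(2)  # seeded so the run is repeatable
n = 65536
left = [rng.gauss(0.0, 1.0) for _ in range(n)]
right = [rng.gauss(0.0, 1.0) for _ in range(n)]  # independent of left

spread_1024 = spread(left, right, 1024)
spread_32768 = spread(left, right, 32768)
print("size 1024 :", spread_1024)
print("size 32768:", spread_32768)
```

The larger window gives readings that sit closer to the expected 0, at the cost of time resolution: a short dropout in correlation is averaged over many more samples.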
I created a new test file, found here, which I think illustrates the aphasemeter (called Correlation on the png) and axcorrelate (called Normalized Cross Correlation on the png) reporting issues better than other files that I made in the past. Here is the layout of the test file’s audio data: Section 1 Section 2 Section 3 I've included the astataudit reports for the full file, with the audio file itself, in the link above. I’ve attached a detail of the png report here; it seems like the clearest illustration so far of the problem.
|
Just run the axcorrelate filter on that .wav file and view its output directly through the showwaves filter: in the Dolby A tone section its amplitude goes up and down (because it is measured per sample). Perhaps this graph picks max values instead of mean ones within a certain timeline window; that would be the main reason why the results are incorrect. |
Hi @Soundmatters, yes @richardpl's clue helped me here. I had been plotting the max level value of the axcorrelated output. Here is the current state of the output of the aphasemeter filter alongside the current axcorrelate output which relies on the max value: And here is that same data but plotting the DC Offset of the axcorrelate output, rather than the Max level. And, as I was curious, here's the min level. And what it looks like when all 3 are plotted together: Does switching the plot of the axcorrelated data from the max value to the dc offset resolve the issue for you @Soundmatters? |
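The effect of summarising by max instead of mean can be reproduced with synthetic data. In this sketch the fluctuating correlation values come from short windows over independent noise, standing in for the per-sample axcorrelate output; the window length and the number of values per plotted point are hypothetical, and the mean plays the role of the DC offset statistic.

```python
import math
import random

def normalized_correlation(left, right):
    """Zero-lag normalized cross-correlation of two equal-length lists."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den if den else 0.0

rng = random.Random(3)  # seeded so the run is repeatable
short = 16      # samples per correlation value (hypothetical)
per_bin = 200   # correlation values summarised into one plotted point (hypothetical)
n = short * per_bin
left = [rng.gauss(0.0, 1.0) for _ in range(n)]
right = [rng.gauss(0.0, 1.0) for _ in range(n)]  # uncorrelated with left

# A fluctuating series of correlation values, all belonging to one plotted point.
values = [
    normalized_correlation(left[s:s + short], right[s:s + short])
    for s in range(0, n, short)
]

bin_max = max(values)
bin_min = min(values)
bin_mean = sum(values) / len(values)
print("max :", bin_max)
print("mean:", bin_mean)
print("min :", bin_min)
```

For uncorrelated material the max and min sit well away from 0 while the mean stays near it, which matches the difference between the max-based and DC-offset-based plots described above.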
It certainly seems resolved from this example. Thank you @dericed, and thanks to @richardpl for your insight. Very interesting and helpful to see the max, mean, and min values plotted together. Thanks for adding that example. |
Note that I have made big changes in the Librempeg version of the axcorrelate filter. The math behind the slow and best modes should give more correct results than before, and also run faster, because unnecessary float divisions (which slowed calculations and hurt precision) have been removed. |
Factor out amplitude in the Correlation (via phase) filter reporting. As it stands now, level offsets between channels impact the reporting of ffmpeg’s phase filter. A test file is available here: https://drive.google.com/drive/folders/1PieIpN5w_IvTzfaRYoiPmJJCXlHyZXx8?usp=share_link