Fix sampling rate for reading opus files #158

Open
wants to merge 5 commits into
base: main
from

Conversation

hagenw
Member

@hagenw hagenw commented Dec 17, 2024

Closes #157

To add a failing test for #157, we first convert the gs-16b-1c-44100hz.opus test file (which actually had a sampling rate of 48000 Hz, despite its name) to gs-16b-1c-16000hz.opus with a sampling rate of 16000 Hz. On the current main branch the test then fails, because the sampling_rate returned by

signal, sampling_rate = af.read(path)

does not match the sampling rate it is compared against in the test:
assert af.sampling_rate(path) == sampling_rate
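
The consistency check behind this failing test can be sketched as follows. The helper name `assert_consistent_sampling_rate` is hypothetical and not part of audiofile. The two reader functions are passed in as arguments so that the sketch runs without audio files; in the real test they would be `af.read` and `af.sampling_rate`.

```python
from typing import Callable, Tuple


def assert_consistent_sampling_rate(
    path: str,
    read: Callable[[str], Tuple[object, int]],
    sampling_rate: Callable[[str], int],
) -> int:
    """Check that decoding and metadata agree on the sampling rate.

    ``read`` stands in for ``af.read`` and ``sampling_rate`` for
    ``af.sampling_rate``. Injecting them is an assumption of this sketch,
    made so the check can be exercised without real audio files.
    """
    _, rate_from_read = read(path)
    rate_from_metadata = sampling_rate(path)
    assert rate_from_read == rate_from_metadata, (
        f"read() returned {rate_from_read} Hz, "
        f"but the metadata reports {rate_from_metadata} Hz for {path}"
    )
    return rate_from_read
```

With a reader that always decodes to 48000 Hz and metadata that reports 16000 Hz, the assertion fails, which is the situation described in #157.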

To fix the issue, the conversion code is updated so that sox and ffmpeg both receive a sampling_rate argument. The value is extracted with mediainfo inside audiofile.read() for non-SND files (everything besides wav, flac, mp3, ogg).
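
As a rough illustration of passing the sampling rate to both tools, the sketch below builds the command lines as argument lists. The function names `build_ffmpeg_command` and `build_sox_command` are hypothetical and do not mirror the actual `run_ffmpeg()` and `run_sox()` code in this PR. The sketch relies only on ffmpeg's `-ss`, `-t`, and `-ar` options and on sox's `trim` and `rate` effects.

```python
from typing import List, Optional


def build_ffmpeg_command(
    infile: str,
    outfile: str,
    offset: float = 0,
    duration: Optional[float] = None,
    sampling_rate: Optional[int] = None,
) -> List[str]:
    """Build an ffmpeg call that optionally enforces the output rate."""
    cmd = ["ffmpeg", "-i", infile]
    if offset:
        cmd += ["-ss", str(offset)]  # start position in seconds
    if duration:
        cmd += ["-t", str(duration)]  # length in seconds
    if sampling_rate is not None:
        cmd += ["-ar", str(sampling_rate)]  # output sampling rate in Hz
    cmd.append(outfile)
    return cmd


def build_sox_command(
    infile: str,
    outfile: str,
    offset: float = 0,
    duration: Optional[float] = None,
    sampling_rate: Optional[int] = None,
) -> List[str]:
    """Build a sox call that optionally enforces the output rate."""
    cmd = ["sox", infile, outfile]
    if offset or duration:
        cmd += ["trim", str(offset)]  # trim effect: start position
        if duration:
            cmd.append(str(duration))  # trim effect: length
    if sampling_rate is not None:
        cmd += ["rate", str(sampling_rate)]  # rate effect: resample to Hz
    return cmd
```

Returning argument lists rather than shell strings keeps file names with spaces safe when the list is handed to `subprocess.run`.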

As a side effect, we can no longer easily test the error message for a missing ffmpeg binary. In the tests we can only hide the whole PATH variable, so the ffmpeg and mediainfo binaries go missing at the same time. Because mediainfo is now called first to get the sampling rate, before ffmpeg is called, the mediainfo error is always raised first.
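
The ordering effect can be reproduced with a small sketch that looks up binaries in an empty search path. The helper `first_missing_binary` is hypothetical; it only mimics the order in which the binaries are needed and is not code from this PR.

```python
import shutil
import tempfile
from typing import Optional, Sequence


def first_missing_binary(
    binaries: Sequence[str],
    search_path: str,
) -> Optional[str]:
    """Return the first binary not found on ``search_path``, else None."""
    for name in binaries:
        if shutil.which(name, path=search_path) is None:
            return name
    return None


# An empty directory plays the role of a hidden PATH:
# no binary can be found, so the first one that is needed is reported.
with tempfile.TemporaryDirectory() as empty_dir:
    missing = first_missing_binary(["mediainfo", "ffmpeg"], empty_dir)
```

Since mediainfo is looked up before ffmpeg, hiding the whole search path can only ever surface the mediainfo error.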

I also checked that the proposal introduced here does not slow down reading of MP4 files by using our benchmark.

Summary by Sourcery

Fix the sampling rate handling for non-SND files by introducing a sampling rate parameter in the audio conversion functions and updating the tests accordingly.

Bug Fixes:

  • Fix the issue with incorrect sampling rate handling for non-SND files by adding a sampling rate parameter to the conversion functions.

Enhancements:

  • Enhance the audio conversion utility functions to accept an optional sampling rate parameter, allowing for more flexible audio processing.

Tests:

  • Update tests to reflect changes in error handling and to include the new sampling rate parameter in conversion functions.


sourcery-ai bot commented Dec 17, 2024

Reviewer's Guide by Sourcery

This PR fixes sampling rate handling for non-SND files by introducing sampling rate parameters to audio conversion functions. The implementation extracts the sampling rate using mediainfo before conversion and passes it to both sox and ffmpeg converters. The changes ensure correct sampling rate preservation during audio file conversion.

Sequence diagram for audio file conversion with sampling rate

sequenceDiagram
    participant User
    participant AudioFile
    participant MediaInfo
    participant Sox
    participant FFmpeg
    participant SoundFile

    User->>AudioFile: Call read(file)
    AudioFile->>MediaInfo: Get sampling rate
    MediaInfo-->>AudioFile: Return sampling rate
    AudioFile->>Sox: Try convert(file, tmpfile, offset, duration, sampling_rate)
    alt Sox conversion fails
        AudioFile->>FFmpeg: Try convert(file, tmpfile, offset, duration, sampling_rate)
        alt FFmpeg conversion fails
            AudioFile->>User: Raise binary_missing_error("ffmpeg")
        else FFmpeg conversion succeeds
            FFmpeg-->>AudioFile: Conversion complete
        end
    else Sox conversion succeeds
        Sox-->>AudioFile: Conversion complete
    end
    AudioFile->>SoundFile: Read tmpfile
    SoundFile-->>AudioFile: Return signal, sampling_rate
    AudioFile-->>User: Return signal, sampling_rate
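
The fallback order shown in the diagram can be sketched in Python as follows. This is a simplified illustration under assumptions: `convert_with_fallback` and `BinaryMissingError` are hypothetical names, the two runner functions are injected so the sketch needs no external binaries, and the exception types that trigger the fallback are a guess rather than the behavior of the actual `convert()` function.

```python
import subprocess
from typing import Callable, Optional

# Signature shared by the injected sox and ffmpeg runners (hypothetical).
Runner = Callable[[str, str, float, Optional[float], Optional[int]], None]


class BinaryMissingError(FileNotFoundError):
    """Hypothetical error raised when no converter binary is usable."""


def convert_with_fallback(
    infile: str,
    outfile: str,
    offset: float,
    duration: Optional[float],
    sampling_rate: Optional[int],
    run_sox: Runner,
    run_ffmpeg: Runner,
) -> str:
    """Try sox first, then ffmpeg; return the name of the tool that worked."""
    try:
        run_sox(infile, outfile, offset, duration, sampling_rate)
        return "sox"
    except (FileNotFoundError, subprocess.CalledProcessError):
        pass  # sox is missing or cannot handle the file: fall back to ffmpeg
    try:
        run_ffmpeg(infile, outfile, offset, duration, sampling_rate)
        return "ffmpeg"
    except FileNotFoundError as error:
        raise BinaryMissingError("ffmpeg") from error
```

Both runners receive the same `sampling_rate`, so the converted file has the same rate regardless of which tool ends up doing the conversion.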

Updated class diagram for audio conversion functions

classDiagram
    class AudioFile {
        +read(file)
    }
    class MediaInfo {
        +get_sampling_rate(file)
    }
    class Sox {
        +run_sox(infile, outfile, offset, duration, sampling_rate)
    }
    class FFmpeg {
        +run_ffmpeg(infile, outfile, offset, duration, sampling_rate)
    }
    class Convert {
        +convert(infile, outfile, offset, duration, sampling_rate)
    }
    AudioFile --> MediaInfo : uses
    AudioFile --> Sox : uses
    AudioFile --> FFmpeg : uses
    AudioFile --> Convert : uses
    Convert --> Sox : calls
    Convert --> FFmpeg : calls

File-Level Changes

Change: Added sampling rate parameter to audio conversion utilities
Details:
  • Added sampling_rate parameter to run_ffmpeg() function with -ar flag support
  • Added sampling_rate parameter to run_sox() function with rate command support
  • Refactored command building logic to be more flexible
  • Updated convert() function to accept and pass through sampling_rate parameter
Files:
  • audiofile/core/utils.py
  • audiofile/core/convert.py

Change: Modified audio file reading logic to preserve sampling rate
Details:
  • Added sampling rate extraction before conversion for non-SND files
  • Updated read() function to pass sampling rate to conversion utilities
  • Added sampling rate initialization and handling logic
Files:
  • audiofile/core/io.py

Change: Updated tests to accommodate new sampling rate handling
Details:
  • Changed test file from 44100Hz to 16000Hz opus file
  • Updated error message expectations for missing binaries
  • Modified test assertions to include sampling rate parameter
Files:
  • tests/test_audiofile.py
  • tests/assets/README.md

Assessment against linked issues

Issue Objective Addressed Explanation
#157 Fix incorrect sampling rate handling when reading OPUS files
#157 Make audiofile.read() and related functions return correct sampling rate matching audiofile.sampling_rate()


codecov bot commented Dec 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.0%. Comparing base (69b0c97) to head (ffe0389).

Additional details and impacted files
Files with missing lines Coverage Δ
audiofile/core/convert.py 100.0% <100.0%> (ø)
audiofile/core/io.py 100.0% <100.0%> (ø)
audiofile/core/utils.py 100.0% <100.0%> (ø)

@hagenw hagenw marked this pull request as ready for review December 17, 2024 14:14

@sourcery-ai sourcery-ai bot left a comment

Hey @hagenw - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good


audiofile/core/convert.py (outdated, resolved)
@hagenw hagenw changed the title Fix sampling rate for non SND files Fix sampling rate for opus files Dec 17, 2024
@hagenw hagenw changed the title Fix sampling rate for opus files Fix sampling rate for reading opus files Dec 17, 2024
@hagenw hagenw requested a review from ChristianGeng December 19, 2024 13:15
@@ -15,6 +15,12 @@ Kevin MacLeod (incompetech.com),
licensed under Creative Commons:
[CC-BY-3.0](http://creativecommons.org/licenses/by/3.0/).

We converted the file `gs-16b-1c-44100hz.opus`
Member

Is this a typo? Did you mean the .aac or the .m4a one?
My understanding is that you are forcing mono and 16 kHz.

total 1020
-rw-rw-r-- 1 cgeng cgeng 137654 Dez 19 15:37 gs-16b-1c-16000hz.opus
-rw-r--r-- 1 cgeng root  137099 Aug  2 03:09 gs-16b-1c-44100hz.aac
-rw-r--r-- 1 cgeng root  649912 Aug  2 03:09 gs-16b-1c-44100hz.m4a
-rw-r--r-- 1 cgeng root   25350 Aug  2 03:09 gs-16b-1c-8000hz.amr
-rw-rw-r-- 1 cgeng cgeng    916 Dez 19 15:37 README.md
-rw-rw-r-- 1 cgeng cgeng  81873 Dez 19 15:36 video.mp4

Member

I see now that there is a deletion

deleted tests/assets/gs-16b-1c-44100hz.opus (binary)

Is this because opus uses 16 kHz without telling the user?

Member Author

To be more precise I now write:

We converted the file gs-16b-1c-44100hz.opus
(which was stored wrongly with 48000 Hz)

The problem with the gs-16b-1c-44100hz.opus file in the tests was that it was stored with 48000 Hz, not 44100 Hz as its name claimed. Since ffmpeg converts all opus files to a sampling rate of 48000 Hz when reading them, the sampling rates happened to match in the tests, and we were not able to spot #157 before.
I fixed this by enforcing the correct sampling rate. I also decided to go with 16000 Hz instead of 44100 Hz, to add a little variation compared to the other files and to highlight that we changed the original gs-16b-1c-44100hz.opus file.

@@ -386,7 +387,11 @@ def read(
offset /= sampling_rate
if duration is not None and duration != 0:
duration /= sampling_rate
convert(file, tmpfile, offset, duration)
if sampling_rate is None:
Member

This is the essential bit then:

For opus conversion, you are forcing convert to take the sampling rate parameter.
Should one add a comment explaining why this is needed?

Member Author

Good point, I added the following comment:

# Infer sampling rate using mediainfo before conversion,
# as ffmpeg does ignore the original sampling rate for opus files,
# see:
# * https://trac.ffmpeg.org/ticket/5240
# * https://github.com/audeering/audiofile/issues/157
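
A minimal sketch of querying the sampling rate with the mediainfo command line tool before conversion is given below. The function names are hypothetical, and the `--Inform=Audio;%SamplingRate%` template is an assumption about the mediainfo CLI that should be checked against its documentation; it is not necessarily the call that audiofile uses internally. The sketch also assumes a file with a single audio stream.

```python
import subprocess
from typing import List, Optional


def mediainfo_sampling_rate_command(path: str) -> List[str]:
    """Build the mediainfo call (template string is an assumption)."""
    return ["mediainfo", "--Inform=Audio;%SamplingRate%", path]


def parse_sampling_rate(output: str) -> Optional[int]:
    """Turn the tool output into an integer rate in Hz, or None."""
    text = output.strip()
    return int(text) if text.isdigit() else None


def probe_sampling_rate(path: str) -> Optional[int]:
    """Run mediainfo and return the reported sampling rate in Hz."""
    result = subprocess.run(
        mediainfo_sampling_rate_command(path),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_sampling_rate(result.stdout)
```

The value returned by `probe_sampling_rate` is what would be handed to the converter, so that ffmpeg's fixed 48000 Hz output for opus files no longer determines the result.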

Member

@ChristianGeng ChristianGeng left a comment

I would be curious to understand what I could do as a review.

Code-wise there is not much to say. One could possibly describe the unusual behavior of opus in one sentence somewhere (I don't even know where).

The unit tests are all passing. My question is whether it should entail testing against further external audio data. If not, I would continue with approval.

@hagenw
Member Author

hagenw commented Jan 2, 2025

> I would be curious to understand what I could do as a review.
>
> Code-wise there is not much to say. One could possibly describe the unusual behavior of opus in one sentence somewhere (I don't even know where).
>
> The unit tests are all passing. My question is whether it should entail testing against further external audio data. If not, I would continue with approval.

I think you addressed the important points in your review, and I tried to improve on them.

There is not really a change in how we test the opus file: the test file was broken before, which is why the test passed on the main branch. At the moment we test only a few non-standard audio formats:

  • mp4 (which is also not really a format, but a container)
  • opus
  • aac
  • m4a
  • amr

If you think we should expand on this, please open an issue.

@hagenw hagenw force-pushed the fix-opus-sampling-rate branch from 5edab36 to ffe0389 Compare January 3, 2025 08:57
Development

Successfully merging this pull request may close these issues.

OPUS files and ground truth for sampling rate
2 participants