Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add audinterface.Segment.process_table() #172

Merged
merged 34 commits into from
Jun 4, 2024
Merged

Add audinterface.Segment.process_table() #172

merged 34 commits into from
Jun 4, 2024

Conversation

maxschmitt
Copy link
Contributor

@maxschmitt maxschmitt commented Apr 29, 2024

New method Segment.process_table().
Added Usage instructions in Usage.rst and tests.

#167 (comment)

image

From usage documentation:

image

@maxschmitt
Copy link
Contributor Author

I don't get why we run into a

E           soundfile.LibsndfileError: Error opening '/home/runner/work/audinterface/audinterface/file.wav': System error.

for most of the test when running process_table(). The method works when testing locally.

Concerning Linter, this is also not clear to me:

ruff-format..............................................................Failed
- hook id: ruff-format
- files were modified by this hook

I checked all files with ruff and there were no issues.

@hagenw
Copy link
Member

hagenw commented Apr 30, 2024

Regarding ruff, we use two command (linting with automatic fix, and automatic code formatting, compare https://github.com/audeering/audinterface/blob/main/CONTRIBUTING.rst#coding-convention).

The easiest solution locally would be to run pre-commit as well, e.g. (after installing it)

$ pre-commit install
$ pre-commit run --all-files

When running it the first time it will fail as it had to make changes, but when running it again you will see that it passes.
You can then inspect (and commit) the changes it did.

@hagenw
Copy link
Member

hagenw commented Apr 30, 2024

The test fails with the same error locally for me. /home/hwierstorf/git/audeering/audinterface/file.wav does also not look like a correct path, as usually we store the files for testing inside a tempdir, so maybe you are handing the wrong path to the processing function?

tests/test_segment.py Outdated Show resolved Hide resolved
Copy link

codecov bot commented May 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.0%. Comparing base (bdb078c) to head (b9b6ee2).

Additional details and impacted files
Files Coverage Δ
audinterface/core/segment.py 100.0% <100.0%> (ø)

@maxschmitt
Copy link
Contributor Author

$ pre-commit install
$ pre-commit run --all-files

thanks

maybe you are handing the wrong path to the processing function?

yes, I forgot root=root in the test for process_table()

This seems not to test if the labels are correctly assigned if the number of segments increases.

I added a test here ll. 333-.

Generally, handling all edge cases and ensuring that the dtype of each column is correctly transferred is a bit cumbersome. I'm not sure if there is a better solution than the one implemented.
process_table() ll. 577-

Otherwise, all tests are running without errors, now. @hagenw

@maxschmitt
Copy link
Contributor Author

8bd24aa resolved a bug for CategoricalDtype columns.

  File "xxx/audinterface/core/segment.py", line 596, in <dictcomp>
    col: labels[:, icol].astype(dtypes[icol])
TypeError: Cannot interpret 'CategoricalDtype(categories=['anger', 'boredom', 'disgust', 'fear', 'happiness',
                  'sadness', 'neutral'],
, ordered=False, categories_dtype=object)' as a data type

Copy link
Member

@hagenw hagenw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, thanks for adding this.

Besides the comments I made, it would be nice to have a test that shows the new method behaves well if the segment function returns overlapping segments, e.g.

start,end
0,2
1,3
2,4
...

audinterface/core/segment.py Outdated Show resolved Hide resolved
audinterface/core/segment.py Outdated Show resolved Hide resolved
audinterface/core/segment.py Outdated Show resolved Hide resolved
audinterface/core/segment.py Show resolved Hide resolved
audinterface/core/segment.py Outdated Show resolved Hide resolved
audinterface/core/segment.py Outdated Show resolved Hide resolved
audinterface/core/segment.py Show resolved Hide resolved
docs/usage.rst Outdated Show resolved Hide resolved
docs/usage.rst Outdated Show resolved Hide resolved
tests/test_segment.py Outdated Show resolved Hide resolved
@maxschmitt
Copy link
Contributor Author

Besides the comments I made, it would be nice to have a test that shows the new method behaves well if the segment function returns overlapping segments, e.g.

This was now added by: e393dc8

All other suggestions were adopted.

Some tests are failing now (especially python 3.8, everything seems to be working for newer versions), but the errors are likely to originate from code that has not been touched.
@hagenw Do you know why it fails?

@hagenw
Copy link
Member

hagenw commented May 16, 2024

It's indeed interesting that the tests for Python 3.8 fail due to audb. As in https://github.com/audeering/audb the tests for Python 3.8 are not failing.

@hagenw
Copy link
Member

hagenw commented May 16, 2024

I can also not reproduce the failing test locally with Python 3.8.
My guess is that it has something to do with the audb cache we use during the tests.
The code for which audb fails is:

        try:
            deps = Dependencies()
            deps.load(cached_deps_file)
        except (AttributeError, FileNotFoundError, ValueError, EOFError):
            # If loading cached file fails, load again from backend
            backend_interface = utils.lookup_backend(name, version)
            deps = download_dependencies(backend_interface, name, version, verbose)
            # Store as pickle in cache
            deps.save(cached_deps_file)

So it seems we need to catch KeyError there as well.

@hagenw
Copy link
Member

hagenw commented May 16, 2024

I proposed a fix for audb in audeering/audb#411.

@hagenw
Copy link
Member

hagenw commented May 16, 2024

The test for Python 3.8 is now passing, but the test under Windows fails for the same reason as before the test under Python 3.8. We will discuss in audb how to better handle this.

@maxschmitt
Copy link
Contributor Author

Now, only the tests on Windows are failing due to pyarrow:
https://github.com/audeering/audinterface/actions/runs/9094046905/job/25054683122?pr=172

@hagenw
Copy link
Member

hagenw commented May 16, 2024

Yes, but the reason is basically the same as before under Python 3.8, see audeering/audb#411 (comment).

So we need again to fix it in audb.

Copy link
Member

@hagenw hagenw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the audb error is related to the cache, I managed to get the tests pass by deleting the existing cache and re-running the tests. The underlying problem, and how to solve it in audb is discussed in audeering/audb#413.

I have updated the description of the pull request by adding two screenshots of the new documentation, as it is always helpful to have the description as documentation on what was added by this pull request. Otherwise, I have just one other suggestion, and we should be fine to go here.

docs/usage.rst Outdated Show resolved Hide resolved
Co-authored-by: Hagen Wierstorf <[email protected]>
@maxschmitt
Copy link
Contributor Author

The Windows test (at least for Python 3.9) is still failing. Can we delete the cache also for this one?

@hagenw
Copy link
Member

hagenw commented May 17, 2024

As I understood it, it should have started with creating a new cache already. Which means for some reason the new audb can no longer be shared between Windows and other platforms.

I could delete the cache and start the pipeline again, but then we would most likely need to do that again after merging.

If you don't need this feature next week, I would propose to postpone until we have found a better solution in audb.

@hagenw
Copy link
Member

hagenw commented May 17, 2024

Just for the record:

@maxschmitt
Copy link
Contributor Author

If you don't need this feature next week, I would propose to postpone until we have found a better solution in audb.

Makes sense, no rush!

@hagenw
Copy link
Member

hagenw commented Jun 4, 2024

Good news, with audb 1.7.3 all tests are passing.

@hagenw hagenw changed the title Process table Add audinterface.Segment.process_table() Jun 4, 2024
Copy link
Member

@hagenw hagenw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All fine for merging here.

@maxschmitt maxschmitt merged commit f094aaf into main Jun 4, 2024
16 checks passed
@maxschmitt maxschmitt deleted the process_table branch June 4, 2024 07:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants