-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure correct boolean dtype in misc table index #431
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review
The MR improves typing consistency using nullable dtypes not only for int, but consistently now also for bool.
I have run the tests, they all work fine for me. I will approve this, but I have a more general questions, that we had in different form about schemes in audb.Dependencies
quite recently:
Often the code currently uses if clauses. e.g. in common.to_pandas_dtype
or class attributes as in define.DataType
to handle typing information. From what I see more often in other environments (like e.g. sqlalchemy) is that they use dict like structures to do lookup of typing info.
While this post shows that the disassembly would favor lookup for performance reaseons, this would worry me not so much. However a dict-like implementation appears to be more idiomatic and more readable to me. In addition, lookup apparoaches allow for easier testing.
A minor aspect for me is that there are several implementations with of test_to_pandas_dtype
. They seem all to be using the underlying audformat.core.common.to_pandas_dtype
but this has confused me a little.
However these are things that should take too much space here. I will therefore approve without further ado. I can confirm that all the tests pass.
I created audeering/audb#420 as a suggestion how to improve the storage of definitions in |
Are you confused, because they are only testing |
Closes #430
This pull request adds extra tests to cover #430 and fixes it by ensuring
bool
dtype is always converted to"boolean"
dtype.The fix is done by incorporating the needed conversion into
audformat.core.utils._maybe_convert_int_dtype()
and renaming the function toaudformat.core.utils._maybe_convert_pandas_dtype()
The second example from #430 now produces the desired result:
Besides the actual changes, I also added or expanded docstrings.
Note: I also added tests for columns of a normal table, even though the issue was not occurring there before.