Commit
Store tables as PARQUET files (#419)
* Ensure correct boolean dtype in misc table index
* Remove unneeded code
* Use pyarrow to read CSV files
* Start debugging
* Continue debugging
* Fix tests
* Remove unneeded code
* Improve code
* Fix test for older pandas versions
* Exclude benchmark folder from tests
* Test other implementation
* Remove support for Python 3.8
* Store tables as PARQUET
* Cleanup code + Table.levels
* Use dict for CSV dtype mappings
* Rename helper function
* Simplify code
* Add helper function for CSV schema
* Fix typo in docstring
* Remove levels attribute
* Merge stash
* Remove levels from doctest output
* Convert method to property
* Add comment
* Simplify code
* Simplify code
* Add test for md5sum of parquet file
* Switch back to snappy compression
* Fix linter
* Store hash inside parquet file
* Fix code coverage
* Stay with CSV as default table format
* Test pyarrow==15.0.2
* Test pyarrow==14.0.2
* Test pyarrow==13.0
* Test pyarrow==12.0
* Test pyarrow==11.0
* Test pyarrow==10.0
* Test pyarrow==10.0.1
* Require pyarrow>=10.0.1
* Test pandas<2.1.0
* Add explanations for requirements
* Add test using minimum pip requirements
* Fix alphabetical order of requirements
* Enhance test matrix definition
* Debug failing test
* Test different hash method
* Use different hashing approach
* Require pandas>=2.2.0 and fix hashes
* CI: re-enable all minimal requirements
* Hashing algorithm to respect row order
* Clean up tests
* Fix minimum install of audiofile
* Fix docstring of Table.load()
* Fix docstring of Database.load()
* Ensure correct order in time when storing tables
* Simplify comment
* Add docstring to _load_pickle()
* Fix _save_parquet() docstring
* Improve comment in _dataframe_hash()
* Document arguments of test_table_update...
* Relax test for table saving order
* Update audformat/core/table.py
  Co-authored-by: ChristianGeng <[email protected]>
* Revert "Update audformat/core/table.py"
  This reverts commit 3f21e3c.
* Use numpy representation for hashing (#436)
* Use numpy representation for hashing
* Enable tests and require pandas>=1.4.1
* Use numpy<2.0 in minimum test
* Skip doctests in minimum
* Require pandas>=2.1.0
* Require numpy<=2.0.0 in minimum test
* Remove print statements
* Fix numpy<2.0.0 for minimum test
* Remove max_rows argument
* Simplify code
* Use test class
* CI: remove pyarrow from branch to start test

---------

Co-authored-by: ChristianGeng <[email protected]>
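
Several entries above concern a table hash that must "respect row order". As a rough illustration of that property only, here is a minimal standard-library sketch. It is not audformat's `_dataframe_hash()`, which according to the message works on a numpy representation of the data; the function name `table_hash` and the sample columns and rows are hypothetical.

```python
import hashlib


def table_hash(columns, rows):
    """Return an MD5 hex digest that depends on column names and row order.

    Hypothetical helper for illustration; not part of audformat.
    """
    md5 = hashlib.md5()
    # Feed the column names first, so renaming a column changes the digest.
    md5.update(repr(tuple(columns)).encode("utf-8"))
    for row in rows:
        # Rows are fed in sequence, so swapping two rows changes the digest.
        md5.update(repr(tuple(row)).encode("utf-8"))
    return md5.hexdigest()


columns = ["file", "speaker"]
rows = [("a.wav", 1), ("b.wav", 2)]

same = table_hash(columns, rows) == table_hash(columns, list(rows))
reordered = table_hash(columns, rows) == table_hash(columns, rows[::-1])
print(same, reordered)  # True False
```

A digest of this kind can serve as a cheap change check: identical content in identical order yields the same value, while reordering rows yields a different one.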