All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Added: a sampling rate of 24000 Hz as flavor
- Added: a sampling rate of 22050 Hz as flavor
- Added: support for Python 3.13 (without Artifactory backend)
- Added: support for Python 3.12 (without Artifactory backend)
- Changed: switch default repository to audb-public, hosted on S3
- Changed: audb.Repository.create_backend_interface() and audb.publish() now raise a ValueError for a repository with non-registered backends, or an Artifactory backend under Python>=3.12
- Changed: skip non-registered backends without raising an error in all functions with read-only access to repositories
- Changed: simplify quickstart section of the documentation
- Changed: depend on audbackend>=2.2.1
- Changed: depend on audeer>=2.2.0
- Deprecated: a sampling rate of 22500 Hz as flavor
- Removed: file-system repository from default configuration
- Fixed: handle an empty configuration file
- Fixed: remove extra "/" at end of dataset names in audb.available() for S3 and Minio backends
- Added: "s3" as a registered backend name
- Changed: depend on audbackend>=2.2.0
- Changed: make Artifactory backend optional, to allow importing audb in Python>=3.12
- Fixed: speed up audb.available() for S3 and Minio backends
- Added: support for repositories on S3 and MinIO servers, using the minio backend of audbackend
- Changed: depend on audbackend>=2.1.0
- Added: pseudo-streaming support with audb.stream(), which returns the new audb.DatabaseIterator object. In each iteration it loads a few rows from a requested table and downloads the corresponding media files
- Added: map argument to audb.load_table(), which behaves identically to the map argument of audformat.Database.get()
- Added: pickle_tables argument to audb.load(), audb.load_to() and audb.load_table() with default value of True. It can be used to disable storing tables as pickle files in cache/root folder
- Fixed: audb.load_table() now only loads additional misc tables that are used as scheme labels inside the requested table, and not in the whole database
- Added: support for publishing tables as parquet files
- Changed: depend on audeer >=2.1.0
- Changed: depend on audformat >=1.2.0
- Changed: depend on pandas >=2.1.0
- Fixed: update progress bar at least every second in audb.load(), audb.load_attachment(), audb.load_media(), audb.load_tables(), audb.load_to(), audb.publish()
- Removed: support for Python 3.8
- Fixed: ensure correct data types in dependency table when loaded from a version in cache, stored by audb<=1.6.3
- Fixed: ensure correct data types in dependency table when loaded from cache
- Fixed: publishing an update of a database when the previous version was stored in cache by an older version of audb
- Fixed: loading of database attachments when audb.config.CACHE_ROOT and audb.config.SHARED_CACHE_ROOT point to the same folder
- Fixed: ensure audb.versions() does not fail when database is not available in a repository
- Fixed: loading of dependency table from cache when the previous version was stored in cache by a different pandas version
- Fixed: loading of dependency table from cache under Python 3.8, when stored by an older version of audb
- Fixed: require pandas>=2.0.1 for pyarrow based data types
- Added: experimental support for text files as media files
- Added: dependency on pyarrow
- Added: audb.Repository.backend_registry that maps repository names like artifactory to corresponding backend classes, e.g. audbackend.backend.Artifactory
- Added: audb.Repository.register() to add an entry to audb.Repository.backend_registry
- Added: audb.Repository.create_backend_interface() returns a backend interface to access files in the repository
- Changed: improve speed of loading dependency table to the cache. E.g. for version 1.0.0 of the database musan loading time is reduced by 35%
- Changed: improve speed of downloading a database to the cache. E.g. for version 1.0.0 of the database musan loading time is reduced by 40% when using 8 threads
- Changed: depend on audbackend>=2.0.0
- Changed: dependency table dataframe returned by audb.Dependencies.__call__() now uses pyarrow based data types
- Changed: dependency table is now stored as a PARQUET file on the backend, instead of as a CSV file
- Fixed: audb.versions() for non-existing repositories
- Fixed: documentation of audb.Repository.__eq__()
- Added: audb.Dependencies.__eq__() to compare two dependency tables
- Fixed: let audb.available() skip incomplete datasets instead of raising an error
- Fixed: in audb.publish() updating of multi-file archives that have been published before the version given by the previous_version argument
- Fixed: speed up most methods of audb.Dependencies
- Fixed: dtype of the index of the data frame returned by audb.Dependencies.__call__() is now string instead of object
- Fixed: audb.versions() when audb.config.REPOSITORIES includes non-existing Artifactory repositories or Artifactory repositories without read access
- Changed: depend on audeer>=2.0.0
- Changed: speed up audb.versions()
- Fixed: pandas deprecation warnings
- Fixed: make documentation independent of the number of public datasets
- Fixed: accessing a database in any repository listed after a repository with access restrictions or a non-existing repository in audb.config.REPOSITORIES
- Added: support for new backend API
- Changed: depend on audbackend>=1.0.0
- Added: BibTeX reference to README
- Fixed: link to Artifactory anonymous access in the documentation
- Fixed: enforce reproducible order of media files entries in dependency table during publication
- Changed: require audeer>=1.20.0
- Fixed: audb.load(), audb.load_to(), audb.load_media(), and audb.remove_media() were failing with audeer==1.20.0 under Windows
- Added: support loading and publishing of database attachments (audformat.Attachment)
- Added: audb.load_attachment() to load a single attachment of a database
- Added: audb.info.attachments() to return the attachments entry of a database header
- Added: attachments argument to audb.load() to load only specific attachments of a database
- Changed: raise RuntimeError in audb.publish() if the file extension of a media file contains uppercase letters
- Changed: raise RuntimeError in audb.publish() if a table ID or attachment ID contains a character not in [A-Za-z0-9._-]
- Changed: raise ValueError in audb.publish() if version or previous_version do not conform to audeer.StrictVersion
- Changed: use emodb v1.4.1 for documentation examples
- Changed: require audbackend<1.0.0 as audbackend will introduce breaking changes
- Fixed: speed up audb.load_to() when loading databases with large tables using only_metadata=True
- Added: support for Python 3.10
- Added: document optional needed overwrite permissions for audb.publish() when continuing a canceled publishing command
- Changed: require audbackend>=0.3.17
- Changed: split API documentation into sub-pages for each function
- Changed: audb.load() and audb.load_to() extract archives in the corresponding database folder inside the audb cache instead of the system-wide temporary folder
- Added: support for audformat's newly introduced misc tables
- Added: audb.info.misc_tables()
- Added: load_tables=True argument to audb.info.header() and audb.info.schemes() specifying if misc tables used as labels in a scheme should be downloaded
- Changed: require audformat >=0.15.2
- Changed: use version 1.3.0 of emodb in the documentation examples
- Removed: support for Python 3.7
- Added: lock cache folder with a lock file when modifying it
- Added: verbose argument to audb.dependencies()
- Added: audb.info.files()
- Added: media and tables arguments to appropriate functions in audb.info sub-module
- Added: only_metadata argument to audb.load_to()
- Added: audb.publish() raises ValueError if previous_version is not smaller than version
- Changed: audb.publish() does not require unchanged media files to exist in database folder
- Changed: audb.load() raises ValueError if a table or media file is requested that is not part of the database
- Fixed: add missing exceptions to docstrings
- Changed: use emodb v1.2.0 for examples and tests
- Changed: depend on audobject>=0.5.0
- Changed: depend on audformat>=0.14.0
- Changed: depend on audeer>=1.18.0
- Fixed: depend on audbackend>=0.3.15 to avoid the possibility of an error when requesting versions of a database
- Fixed: add full Windows support and tests
- Fixed: only create tmp folder when needed in audb.load()
- Removed: include/exclude keyword arguments
- Removed: audb.get_default_cache_root()
- Fixed: make moving of local files Windows compatible
- Fixed: create folder tree more efficiently when loading to cache
- Changed: depend on audformat>=0.13.3
- Fixed: conversion of pickle protocol 5 files to pickle protocol 4 in cache
- Added: more examples to the API docstrings
- Changed: depend on audformat>=0.13.2
- Changed: use pickle protocol-4 for caching dependencies
- Fixed: small improvements to API documentation
- Fixed: speed up audb.load_to() storing of CSV files
- Fixed: build documentation inside the release process with Python 3.8
- Added: support for Python 3.9
- Added: store file duration of the database in the duration cache of audformat.Database
- Changed: audb.publish() now raises an error if a table contains duplicated index entries
- Fixed: several speed ups when loading or publishing a database
- Fixed: the root attribute of the database object returned by audb.load_to() now points to the correct folder and not the temporary folder
- Removed: support for Python 3.6
- Added: name argument to audb.cached() to limit search to given database name
- Changed: speed up audb.available() by 100%
- Changed: use audiofile.duration(..., sloppy=True) for estimating durations for dependency files
- Fixed: audb.cached() for empty or missing shared cache
- Fixed: set bit_depth to 0 instead of None for non SND formats in the dependency table
- Fixed: store metadata in dependency table for non SND formats like MP3 and MP4 files
- Added: documentation sub-section on database duration info
- Fixed: made compatible with future versions of pandas
- Fixed: missing audb.Repository documentation
- Fixed: audb.load() now raises an error for a wrong keyword argument
- Fixed: look also in shared cache for partially loaded databases
- Fixed: version number shown in the documentation table of content
- Added: discussion of needed system packages for handling audio files in the documentation
- Changed: allow only to publish portable databases
- Fixed: macOS support by relying on new audresample version
- Added: audb.load_media()
- Added: audb.load_table()
- Added: documentation on how to configure access rights for shared cache folder
- Changed: speed up audb.Dependencies methods
- Changed: speed up audb.info functions
- Changed: audb.info uses cache as well
- Changed: use emodb 1.1.1 in documentation
- Changed: depend on audformat>=0.11.0
- Fixed: allow audb.load() to work offline if database is cached
- Fixed: update removal version of deprecated stuff to 1.2.0
- Added: audb.Dependencies._remove()
- Changed: audb.Dependencies internally uses pd.DataFrame instead of dict
- Changed: store dependencies with pickle to speed up loading
- Changed: versions of the same flavor share dependency file
- Changed: if possible audb.load() copies tables and media files from other versions in the cache
- Changed: audb.Dependencies._add_media() is now private
- Changed: audb.Dependencies._add_meta() is now private
- Changed: audb.Dependencies.is_removed renamed to audb.Dependencies.removed
- Fixed: audb.load() considers format when searching the cache
- Fixed: audb.load() considers format when resolving missing media
- Fixed: audb.available() correctly returns versions of the same database from multiple repositories
- Fixed: add missing link to emodb example repository
- Removed: audb.Dependencies.data
- Changed: audb.Dependencies.bit_depth() now always returns an integer
- Changed: audb.Dependencies.channels() now always returns an integer
- Changed: audb.Dependencies.duration() now always returns a float
- Changed: audb.Dependencies.sampling_rate() now always returns an integer
- Fixed: audb.info.duration() for databases that contain files with a duration of 0s
- Fixed: remove dependency to fire package
- Fixed: docstring of audb.exists() falsely claimed that it was not returning a boolean
- Fixed: several typos in documentation
- Fixed: renamed latest_only argument of audb.available() to only_latest as it was before
- Fixed: appearance of documentation TOC by requiring docutils<0.17
- Added: first public release
- Added: audb.info.author()
- Added: audb.info.license()
- Added: audb.info.license_url()
- Added: audb.info.organization()
- Added: audb.Dependencies.archives property
- Added: section on publication in the documentation
- Added: introduction texts to documentation
- Changed: raise error for conversion of non-supported format
- Changed: audb.exists() to return bool
- Changed: rename audb.lookup_repository() to audb.repository()
- Changed: one combined section on load in the documentation
- Fixed: data types in dataframe returned by audb.cached()
- Fixed: support files stored in archives with nested folders
- Fixed: listing of cache entries
- Removed: command line interface
- Removed: audb.cached_databases()
- Removed: audb.define module
- Added: complete column in audb.cached()
- Added: previous_version argument to audb.publish()
- Added: backward compatibility with audb <0.90
- Changed: cache flavor path to name/version/flavor_id
- Changed: use open source releases of audbackend, audobject, and audresample
- Changed: require audformat>=0.10.0
- Changed: rename audb.load_original_to() to audb.load_to()
- Changed: shorten flavor ID in cache
- Changed: filter operations and only_metadata no longer part of audb.Flavor
- Deprecated: include and exclude arguments
- Fixed: looking for latest version across repositories
- Fixed: Flavor.destination for nested paths
- Fixed: allow for cross-backend dependencies for audb.publish()
- Fixed: audb.remove_media() can now be called several times
- Changed: enforce mixdown=False for mono file flavors
- Fixed: global config file was missing in PyPI package
- Added: configuration file
- Changed: use external package for backend implementations
- Added: audb.Backend.latest_version()
- Added: audb.Backend.create()
- Added: audb.Backend.register()
- Added: audb.lookup_repository()
- Added: config.REPOSITORY_PUBLISH
- Fixed: update fire dependency
- Fixed: remove config.GROUP_ID
- Fixed: use sphinx>=3.5.1 to fix inherited attributes in documentation
- Changed: define data types when reading dependency file
- Added: data-provate-local to the default repositories
- Fixed: CHANGELOG
- Added: initial release