Releases · huggingface/huggingface_hub

15 Jun 15:53

v0.8.1

1da0161

v0.8.1: lazy loading, git-aware cache file layout, new create_commit

Git-aware cache file layout

v0.8.1 introduces a new way of caching files from the Hugging Face Hub. It applies to two methods: snapshot_download and hf_hub_download.
The new approach is extensively documented in the Documenting files guide and we recommend checking it out to get a better understanding of how caching works.
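As a rough illustration of what a git-aware layout can look like, the sketch below stores each file's content once under blobs/, exposes it through a per-commit snapshots/ directory via symlinks, and records the branch-to-commit mapping under refs/. The helper name cache_file, the repo id, the commit strings, and the use of a SHA-256 content hash as the blob name are all assumptions made for this example; it is not the library's implementation, and the guide mentioned above remains the reference.

```python
import hashlib
import os
import tempfile


def cache_file(cache_root, repo_id, commit, filename, content, ref="main"):
    """Store `content` once under blobs/ and link it from snapshots/<commit>/."""
    # Hypothetical folder naming for this sketch: models--<org>--<name>
    repo_dir = os.path.join(cache_root, "models--" + repo_id.replace("/", "--"))

    # The blob is named after its content hash, so identical content is stored once.
    blob_path = os.path.join(repo_dir, "blobs", hashlib.sha256(content).hexdigest())
    os.makedirs(os.path.dirname(blob_path), exist_ok=True)
    if not os.path.exists(blob_path):
        with open(blob_path, "wb") as f:
            f.write(content)

    # The snapshot entry is a relative symlink pointing at the blob.
    pointer = os.path.join(repo_dir, "snapshots", commit, filename)
    os.makedirs(os.path.dirname(pointer), exist_ok=True)
    if not os.path.lexists(pointer):
        os.symlink(os.path.relpath(blob_path, os.path.dirname(pointer)), pointer)

    # refs/<ref> records which commit the branch name currently resolves to.
    ref_path = os.path.join(repo_dir, "refs", ref)
    os.makedirs(os.path.dirname(ref_path), exist_ok=True)
    with open(ref_path, "w") as f:
        f.write(commit)
    return pointer


with tempfile.TemporaryDirectory() as root:
    first = cache_file(root, "org/model", "aaaa111", "config.json", b"{}")
    second = cache_file(root, "org/model", "bbbb222", "config.json", b"{}")
    # Two commits with an unchanged file resolve to the same blob on disk.
    same_blob = os.path.realpath(first) == os.path.realpath(second)
    blob_count = len(os.listdir(os.path.join(root, "models--org--model", "blobs")))

print(same_blob, blob_count)  # True 1
```

The point of the design is that an unchanged file shared by several revisions takes up disk space only once, while each revision still appears as a complete directory tree. The sketch relies on symlinks, so it assumes a platform that supports them.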

New git-aware cache file layout by @julien-c in #801

New `create_commit` API

A new create_commit API allows users to upload and delete several files at once using HTTP-based methods. You can read more about it in this guide. The following convenience methods were also introduced:

upload_folder: allows uploading a local directory to a repo.
delete_file: allows deleting a single file from a repo.

upload_file now uses create_commit under the hood.

create_commit also allows creating pull requests with a create_pr=True flag.

None of the methods rely on Git locally.
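A minimal sketch of a multi-file commit could look like the following. The operation classes CommitOperationAdd and CommitOperationDelete are not named in these notes; they are taken from the library's documentation and should be checked against the installed version. The function name commit_changes and the file paths are hypothetical, and the function needs a valid repo id and token to do anything.

```python
def commit_changes(repo_id, token):
    """Add one file, delete another, and open a pull request in a single commit."""
    # Imported here so the sketch only needs huggingface_hub when it is called.
    from huggingface_hub import CommitOperationAdd, CommitOperationDelete, HfApi

    operations = [
        # Upload in-memory bytes as README.md in the repo.
        CommitOperationAdd(path_in_repo="README.md", path_or_fileobj=b"# Demo\n"),
        # Remove a file that already exists in the repo (hypothetical path).
        CommitOperationDelete(path_in_repo="old_weights.bin"),
    ]
    return HfApi().create_commit(
        repo_id=repo_id,
        operations=operations,
        commit_message="Update README and drop old weights",
        token=token,
        create_pr=True,  # open a pull request instead of committing directly
    )
```

Everything goes over HTTP, so no local clone or Git installation is involved.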

New create_commit API by @SBrandeis in #888

Lazy loading

All modules will now be lazy-loaded. This should drastically reduce the time it takes to import huggingface_hub as it will no longer load all soft dependencies.
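One common way to get this behavior in Python is a module-level `__getattr__` (PEP 562), which defers an import until an attribute is first accessed. The sketch below shows the idea with standard-library modules only; the helper make_lazy_module and the module name lazy_demo are made up for this example, and the mechanism huggingface_hub actually uses may differ in detail.

```python
import importlib
import types


def make_lazy_module(name, attr_to_module):
    """Return a module whose attributes are imported on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Called only when normal attribute lookup on the module fails.
        if attr not in attr_to_module:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}")
        target = importlib.import_module(attr_to_module[attr])
        value = getattr(target, attr)
        setattr(module, attr, value)  # cache, so later lookups skip __getattr__
        return value

    module.__getattr__ = __getattr__
    return module


# Nothing is imported yet: the mapping only says where each name lives.
lazy = make_lazy_module("lazy_demo", {"dumps": "json", "sqrt": "math"})
print(lazy.sqrt(16.0))  # 4.0
```

Only the attributes that are actually used trigger an import, which is what keeps the initial import cheap.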

ENH lazy load modules in the root init by @adrinjalali in #874

Improvements and bugfixes

Add request ID to all requests by @LysandreJik in #909
Remove deprecations by @LysandreJik in #910
FIX Avoid creating repository when it exists on remote by @merveenoyan in #900
🏗 Use hub-ci for tests by @SBrandeis in #898
Refine 404 errors by @LysandreJik in #878
Fix typo by @lsb in #902
FIX metadata_update: work on a copy of the upstream file, to not mess up the cache by @julien-c in #891
ENH Removed history writing in Keras model card by @merveenoyan in #876
CI enable codecov by @adrinjalali in #893
MNT deprecate imports from snapshot_download by @adrinjalali in #880
Pushback deprecation for v0.7 release by @LysandreJik in #882
FIX make import machinary private by @adrinjalali in #879
ENH Keras Use table instead of dictionary for hyperparameters in model card by @merveenoyan in #877
Invert deprecation for create_repo in #912
Constant was accidentally removed during deprecation transition in #913

Contributors

lsb, julien-c, and 4 other contributors


30 May 12:18

LysandreJik

v0.7.0

3491bf4

v0.7.0: Repocard metadata

Repocard metadata

This PR adds a metadata_update function that allows the user to update the metadata in a repository on the hub. The function accepts a dict with metadata (following the same pattern as the YAML in the README) and behaves as follows for all top level fields except model-index.

Examples:

Starting from

from copy import deepcopy

existing_results = [{
    'dataset': {'name': 'IMDb', 'type': 'imdb'},
    'metrics': [{'name': 'Accuracy', 'type': 'accuracy', 'value': 0.995}],
    'task': {'name': 'Text Classification', 'type': 'text-classification'}
}]

1. Overwrite existing metric value in existing result

new_results = deepcopy(existing_results)
new_results[0]["metrics"][0]["value"] = 0.999
_update_metadata_model_index(existing_results, new_results, overwrite=True)

[{'dataset': {'name': 'IMDb', 'type': 'imdb'},
  'metrics': [{'name': 'Accuracy', 'type': 'accuracy', 'value': 0.999}],
  'task': {'name': 'Text Classification', 'type': 'text-classification'}}]

2. Add new metric to existing result

new_results = deepcopy(existing_results)
new_results[0]["metrics"][0]["name"] = "Recall"
new_results[0]["metrics"][0]["type"] = "recall"
_update_metadata_model_index(existing_results, new_results, overwrite=True)

[{'dataset': {'name': 'IMDb', 'type': 'imdb'},
  'metrics': [{'name': 'Accuracy', 'type': 'accuracy', 'value': 0.995},
              {'name': 'Recall', 'type': 'recall', 'value': 0.995}],
  'task': {'name': 'Text Classification', 'type': 'text-classification'}}]

3. Add new result

new_results = deepcopy(existing_results)
new_results[0]["dataset"] = {'name': 'IMDb-2', 'type': 'imdb_2'}
_update_metadata_model_index(existing_results, new_results, overwrite=True)

[{'dataset': {'name': 'IMDb', 'type': 'imdb'},
  'metrics': [{'name': 'Accuracy', 'type': 'accuracy', 'value': 0.995}],
  'task': {'name': 'Text Classification', 'type': 'text-classification'}},
 {'dataset': {'name': 'IMDb-2', 'type': 'imdb_2'},
  'metrics': [{'name': 'Accuracy', 'type': 'accuracy', 'value': 0.995}],
  'task': {'name': 'Text Classification', 'type': 'text-classification'}}]
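The three cases above can be reproduced with a small stand-alone merge function. This is a sketch inferred from the examples, not the library's code: it assumes results are matched on their dataset and task, metrics are matched on their name and type, and a conflicting value raises an error unless overwrite is set. The name merge_results is hypothetical.

```python
from copy import deepcopy


def merge_results(existing, new, overwrite=False):
    """Merge `new` model-index results into a copy of `existing`."""
    merged = deepcopy(existing)
    for new_result in new:
        # Assumption: a result is identified by its dataset and task.
        match = next(
            (r for r in merged
             if r["dataset"] == new_result["dataset"] and r["task"] == new_result["task"]),
            None,
        )
        if match is None:
            merged.append(deepcopy(new_result))  # case 3: new result
            continue
        for new_metric in new_result["metrics"]:
            # Assumption: a metric is identified by its name and type.
            same = next(
                (m for m in match["metrics"]
                 if m["name"] == new_metric["name"] and m["type"] == new_metric["type"]),
                None,
            )
            if same is None:
                match["metrics"].append(deepcopy(new_metric))  # case 2: new metric
            elif same["value"] != new_metric["value"]:
                if not overwrite:
                    raise ValueError(f"metric {same['name']!r} already has a different value")
                same["value"] = new_metric["value"]  # case 1: overwrite value
    return merged


base = [{
    "dataset": {"name": "IMDb", "type": "imdb"},
    "metrics": [{"name": "Accuracy", "type": "accuracy", "value": 0.995}],
    "task": {"name": "Text Classification", "type": "text-classification"},
}]
update = deepcopy(base)
update[0]["metrics"][0]["value"] = 0.999
print(merge_results(base, update, overwrite=True)[0]["metrics"][0]["value"])  # 0.999
```

Returning a merged copy rather than mutating the input keeps the original list available for comparison.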

ENH Add update metadata to repocard by @lvwerra in #844

Improvements and bug fixes

Keras: Saving history in a JSON file by @merveenoyan in #861
space after uri by @leondz in #866

Contributors

leondz, lvwerra, and merveenoyan


09 May 20:11

LysandreJik

v0.6.0

a8b6f14

v0.6.0: fastai support, binary file support, skip LFS files when pushing to the hub

Disclaimer: This release was initially advertised as including #844. That change is not part of this release and will ship in v0.7.

fastai support

v0.6.0 introduces downstream (download) and upstream (upload) support for the fastai library. It supports fastai versions 2.4 and above.
The integration is detailed in the following blog.

Add fastai upstream and downstream capacities for fastai>=2.4 and fastcore>=1.3.27 versions by @omarespejel in #678

Automatic binary file tracking in `Repository`

Binary files are now rejected by default by the Hub. v0.6.0 introduces automatic binary file tracking through the auto_lfs_track argument of the Repository.git_add method. It also introduces the Repository.auto_track_binary_files method which can be used independently of other methods.
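A hedged sketch of how the argument might be used in a clone, add, commit, push cycle follows. The function name push_with_binary_tracking and the commit message are made up for the example; running it requires git, git-lfs, and write access to the repository.

```python
def push_with_binary_tracking(local_dir, repo_id):
    """Clone a repo, stage everything with automatic LFS tracking, and push."""
    # Imported here so the sketch only needs huggingface_hub when it is called.
    from huggingface_hub import Repository

    repo = Repository(local_dir=local_dir, clone_from=repo_id)
    # ... write or copy files into local_dir here ...
    # With auto_lfs_track=True, binary files are tracked with git-lfs before staging.
    repo.git_add(auto_lfs_track=True)
    repo.git_commit("Add model files")
    repo.git_push()
    return repo
```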

ENH Auto track binary files in Repository by @LysandreJik in #828

`skip_lfs_files` is now added to mixins

The skip_lfs_files parameter is now available on the different mixins. It enables pushing files to the Hub without first downloading files above 10MB. This should dramatically reduce the time needed to update a model card, a configuration file, and other small files.

✨ add skip_lfs_files to mixins' push_to_hub by @nateraw in #858

Keras support improvement

Support for Keras models is greatly improved through several additions:

The save_pretrained_keras method now accepts a list of tags that will automatically be added to the repository.
Download statistics are now available on Keras models.

Introducing list of tags to Keras model card by @merveenoyan in #806
Enable keras download stats by @merveenoyan in #860

Bugfixes and improvements

FIX don't raise if name/organizaiton are passed postionally by @adrinjalali in #822
ENH Use provided token from HUGGING_FACE_HUB_TOKEN env variable if available by @FrancescoSaverioZuppichini in #794
tests(hf_api): remove infectionTypes field by @McPatate in #834
Remove docs, tasks and inference API from huggingface_hub by @osanseviero in #833
FEAT Uniformize hf_api a bit and add support for Spaces by @julien-c in #792
Add a bug report template by @osanseviero in #832
clean up formatting by @stevhliu in #839
Release guide by @LysandreJik in #820
Fix keras test by @osanseviero in #855
DOC Add quick start guide by @stevhliu in #850
MNT refactor: subprocess.run -> run_subprocess by @LysandreJik in #352
MNT enable preview on black by @adrinjalali in #849
Update how to guides by @stevhliu in #840
Update contribution guide for merging PRs by @stevhliu in #856
DOC Update landing page by @stevhliu in #854
space after uri by @leondz in #866

Contributors

leondz, julien-c, and 9 other contributors


07 Apr 19:10

LysandreJik

v0.5.1

f81a0e9

v0.5.1: Patch release

This is a patch release fixing a break in backward compatibility.

Linked PR: #822


07 Apr 19:09

LysandreJik

v0.5.0

2f65b70

v0.5.0: Reference documentation, Keras improvements, stabilizing the API

Documentation

Version v0.5.0 is the first version which features an API reference. It is still a work in progress with features lacking, some images not rendering, and a documentation reorg coming up, but should already provide significantly simpler access to the huggingface_hub API.

The documentation is visible here.

API reference documentation by @LysandreJik in #782
[API Reference docs] Remove git references from GitHub Action templates by @LysandreJik in #813
DOC API docstring improvements by @adrinjalali in #731

Model & datasets list improvements

The list_models and list_datasets methods have been improved in several ways.

List private models

These two methods now accept the token keyword to specify your token. Specifying the token will include your private models and datasets in the returned list.

Support list_models and list_datasets with token arg by @muellerzr in #638

Modelcard metadata

These two methods now accept the cardData boolean argument. If set to True, the modelcard metadata will also be returned when using these two methods.

Include cardData in list_models and list_datasets by @muellerzr in #639

Filtering by carbon emissions

The list_models method now also accepts an emissions_thresholds parameter to filter by carbon emissions.
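A short sketch of such a query is given below. It assumes that the thresholds are passed as a (minimum, maximum) tuple expressed in grams of CO2, which is not stated in these notes and should be checked against the API reference. The function name low_emission_models is hypothetical, and the call needs network access.

```python
def low_emission_models(max_grams, limit=10):
    """List models whose reported carbon emissions fall within a threshold."""
    # Imported here so the sketch only needs huggingface_hub when it is called.
    from huggingface_hub import HfApi

    return HfApi().list_models(
        emissions_thresholds=(0, max_grams),  # assumed (minimum, maximum) in grams
        cardData=True,  # also return the model card metadata
        limit=limit,
    )
```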

Enable filtering by carbon emission by @muellerzr in #668

Keras improvements

The Keras serialization and upload methods have been worked on to provide better support for models:

All parameters are now included in the saved model when using push_to_hub_keras
log_dir parameter for TensorBoard logs, which will automatically spawn a TensorBoard instance on the Hub.
Automatic model card

Introduce include_optimizer parameter to push_to_hub_keras() by @merveenoyan in #616
Add TensorBoard for Keras models by @merveenoyan in #651
Create Automatic Keras model card by @merveenoyan in #679
Allow TensorBoard Override for same Repository by @merveenoyan in #709
Add tempfile for tensorboard logs in tensorboard tests in test_keras_integration.py by @merveenoyan in #761

Contributing guide

A contributing guide is now available for the huggingface_hub repository. For any and all information related to contributing to the repository, please check it out!

Read more about it here: CONTRIBUTING.md.

Pre-commit hooks

The huggingface_hub GitHub repository has several checks to ensure that the code respects code quality standards. Opt-in pre-commit hooks have been added in order to make it simpler for contributors to leverage them.

Read more about it in the aforementioned CONTRIBUTING guide.

MNT Add pre-commit hooks by @adrinjalali in #807

Renaming and transferring repositories

Repositories can now be renamed and transferred programmatically using move_repo.
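A minimal sketch, assuming move_repo takes the current and the new repo id as from_id and to_id (these parameter names are not given in these notes). The wrapper rename_repo is made up for the example, and the call needs a token with write access.

```python
def rename_repo(old_id, new_id, token):
    """Rename or transfer a repo, e.g. "user/old-name" to "org/new-name"."""
    # Imported here so the sketch only needs huggingface_hub when it is called.
    from huggingface_hub import HfApi

    HfApi().move_repo(from_id=old_id, to_id=new_id, token=token)
```

Changing only the name part of the id renames the repository; changing the namespace part transfers it to another user or organization.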

Allow renaming and transferring repos programmatically by @osanseviero in #704

Breaking changes & deprecation

⛔ The following methods have now been removed following a deprecation cycle

`list_repos_objs`

The list_repos_objs method and the accompanying CLI utility huggingface-cli repo ls-files have been removed.
The same can be done using the model_info and dataset_info methods.

Remove deprecated list_repos_objs and huggingface-cli repo ls-files by @julien-c in #702

Python 3.6

Python 3.6 support is dropped now that it has reached end of life. Installing huggingface_hub on Python 3.6 will result in version v0.4.0 being installed.

CI support python 3.7-3.10 - remove 3.6 support by @adrinjalali in #790

⚠️ Items below are now deprecated and will be removed in a future version

API deprecate positional args in file_download and hf_api by @adrinjalali in #745
MNT deprecate name and organization in favor of repo_id by @adrinjalali in #733

What's Changed

Include "model" in repo_type to keep consistency by @muellerzr in #620
Hotfix for repo_type by @muellerzr in #623
fix: typo in docstring by @ariG23498 in #647
{upload|delete}_file: Remove client-side filename validation by @SBrandeis in #669
Ensure post_method is only executed once by @sgugger in #676
Remove paying subscription mention from docstring by @cakiki in #653
Improve tests and logging by @muellerzr in #682
docs(links): Update settings/token to settings/tokens by @ronvoluted in #699
Add support for private hub by @juliensimon in #703
Add retry_endpoint for test stability by @osanseviero in #719
FIX fix a bug in _filter_emissions to accept numbers w/o decimal and dict emissions by @adrinjalali in #753
Logging fix for hf_api, logging documentation by @LysandreJik in #748
Contributing guide & code of conduct by @LysandreJik in #692
Fix pytorch and tensorflow python matrix by @osanseviero in #760
MNT add links to related projects and the forum on issue template by @adrinjalali in #773
Note on the README by @LysandreJik in #772
Remove autoreviewers by @muellerzr in #793
CI Error on FutureWarning by @adrinjalali in #787
MNT more informative message on error in Hf.Api.delete_repo by @adrinjalali in #783
Add security status by @McPatate in #654
Remove redundant part of security test by @osanseviero in #802
Changed test repository names to fix tests by @merveenoyan in #803
TST calling delete_repo under tempfile for fixing the test by @merveenoyan in #804
Disable logging in with organization token by @merveenoyan in #780
MNT change dev version to 0.5, 0.4 is already released by @adrinjalali in #810
👨‍💻 Configure HF Hub URL with environment variable by @SBrandeis in #815
MNT support oder requests versions by @adrinjalali in #817
Rename the env variable HF_ENDPOINT. by @Narsil in #819

New Contributors

@McPatate made their first contribution in #583
@FremyCompany made their first contribution in #606
@simoninithomas made their first contribution in #633
@mlonaws made their first contribution in #630
@ariG23498 made their first contribution in #647
@J-Petiot made their first contribution in #660
@ronvoluted made their first contribution in #699
@juliensimon made their first contribution in #703
@allendorf made their first contribution in #742
@frgfm made their first contribution in #747
@hbredin made their first contribution in #688

Full Changelog: v0.4.0...v0.5.0

Contributors

Narsil, julien-c, and 19 other contributors


26 Jan 18:30

LysandreJik

v0.4.0

735b82e

v0.4.0: Tag listing, Namespace Objects, Model Filter

Tag listing

Introduce Tag Listing by @muellerzr in #537

This PR introduces the ability to fetch all available tags for models or datasets and returns them as a nested namespace object, for example:

>>> from huggingface_hub import HfApi

>>> api = HfApi() 
>>> tags = api.get_model_tags()
>>> print(tags)
Available Attributes:
 * benchmark
 * language_creators
 * languages
 * licenses
 * multilinguality
 * size_categories
 * task_categories
 * task_ids

>>> print(tags.benchmark)
Available Attributes:
 * raft
 * superb
 * test

Namespace objects

Namespace Objects for Search Parameters by @muellerzr in #556

With a goal of adding more tab-completion to the library, this PR introduces two objects:

DatasetSearchArguments
ModelSearchArguments

These two AttributeDictionary objects expose all the valid information we can extract from a model as tab-completable parameters. Through careful string splitting, they also include the author_or_organization and the dataset (or model) _name.

Model Filter

Implement a Model Filter class by @muellerzr in #553

This PR introduces a new way to search the hub: the ModelFilter class.

To the user, it first appears as a simple Enum that specifies what to search for, such as:

f = ModelFilter(author="microsoft", model_name="wavlm-base-sd", framework="pytorch")

From there, they can pass this filter to list_models in HfApi to search the Hub:

models = api.list_models(filter=f)

The API may then be used for complex queries:

args = ModelSearchArguments()
f = ModelFilter(framework=[args.library.pytorch, args.library.TensorFlow], model_name="bert", tasks=[args.pipeline_tag.Summarization, args.pipeline_tag.TokenClassification])

api.list_models(filter=f)

Ignoring filenames in snapshot_download

This PR introduces a way to limit the files fetched by snapshot_download. This is useful when you want to download and cache an entire repository without using git, while skipping some files according to their filenames.
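The idea of skipping files by name can be sketched with the standard library alone. The helper below uses shell-style patterns via fnmatch; the parameter name and pattern syntax that snapshot_download actually accepts are not given in these notes, so files_to_download and the patterns shown are illustrative assumptions only.

```python
from fnmatch import fnmatch


def files_to_download(filenames, ignore_patterns):
    """Return the filenames that match none of the ignore patterns."""
    return [
        name for name in filenames
        if not any(fnmatch(name, pattern) for pattern in ignore_patterns)
    ]


repo_files = ["config.json", "pytorch_model.bin", "tf_model.h5", "README.md"]
print(files_to_download(repo_files, ["*.h5", "*.bin"]))
# ['config.json', 'README.md']
```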

[Snapshot download] allow some filenames to be ignored by @patrickvonplaten in #566

What's Changed

[Hotfix][API] card_data => cardData on /api/datasets by @julien-c in #530
Fix the progress bars when cloning a repository by @LysandreJik in #517
Update Hugging Face Hub documentation README and Endpoints by @muellerzr in #527
Convert string functions to f-string by @muellerzr in #536
Fixing FS for espnet. by @Narsil in #542
[snapshot_download] upgrade to canonical separator by @julien-c in #545
Add test directions by @muellerzr in #547
[HOTFIX] Change test for missing_input to reflect back-end redirect changes by @muellerzr in #552
Bring consistency to download and upload APIs by @muellerzr in #574
Search by authors and string by @FrancescoSaverioZuppichini in #531
Quick typo by @muellerzr in #575

New Contributors

@kahne made their first contribution in #569
@FrancescoSaverioZuppichini made their first contribution in #531

Full Changelog: v0.2.1...v0.4.0

Contributors

Narsil, julien-c, and 5 other contributors


26 Jan 18:18

LysandreJik

v0.2.1

d4b2da8

v0.2.1: Patch release

This is a patch release fixing an issue with the notebook login.

5e2da9b#diff-fb1696cbcf008dd89dde5e8c1da9d4be5a8f7d809bc32f07d4453caba40df15f


26 Jan 18:17

LysandreJik

v0.2.0

c1ccbee

v0.2.0: Access tokens, skip large files, local files only

Access tokens

Version v0.2.0 introduces compatibility with Hub access tokens. Access tokens are now the main login method; logging in with a username and password is still possible by pressing [Ctrl/CMD]+C at the login prompt.

The notebook login is adapted to work with the access tokens.

Skipping large files

The Repository class now has an additional parameter, skip_lfs_files, which allows cloning the repository while skipping the large file download.
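For instance, a clone that leaves large files out might be written as follows. The wrapper name clone_without_large_files is hypothetical, and the call requires git and network access.

```python
def clone_without_large_files(local_dir, repo_id):
    """Clone a repo into local_dir without downloading its large (LFS) files."""
    # Imported here so the sketch only needs huggingface_hub when it is called.
    from huggingface_hub import Repository

    return Repository(local_dir=local_dir, clone_from=repo_id, skip_lfs_files=True)
```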

#472

Local files only for `snapshot_download`

The snapshot_download method can now take local_files_only as a parameter to enable leveraging previously downloaded files.
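A minimal sketch of reusing an earlier download is shown below. The wrapper name load_from_cache is made up for the example, and it assumes the repository was already downloaded on this machine by a previous call.

```python
def load_from_cache(repo_id):
    """Return the local path of a previously downloaded snapshot of repo_id."""
    # Imported here so the sketch only needs huggingface_hub when it is called.
    from huggingface_hub import snapshot_download

    # local_files_only=True reuses files already on disk instead of fetching them.
    return snapshot_download(repo_id, local_files_only=True)
```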

#505


09 Nov 17:46

LysandreJik

v0.1.2

f31030e

v0.1.2: Patch release

What's Changed

clean_ok should be True by default by @LysandreJik in #462

Full Changelog: v0.1.1...v0.1.2

Contributors

LysandreJik


05 Nov 18:39

LysandreJik

v0.1.1

5eb5bfc

v0.1.1: Patch release

What's Changed

Fix typing-extensions minimum version by @lhoestq in #453
Fix argument order in create_repo for Repository.clone_from by @sgugger in #459

Full Changelog: v0.1.0...v0.1.1

Contributors

sgugger and lhoestq
