All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
This release introduces UDWF and RemoteTables functionality.
- Reparametrize caching into storage and invalidation strategy by @dlovell in #278
- Update trinodb/trino docker tag to v464 by @renovate[bot] in #326
- Use vars instead of secrets by @mesejo in #330
- Update bitnami/minio docker tag to v2024.10.29 by @renovate[bot] in #327
- Update dependency ruff to v0.7.2 by @renovate[bot] in #331
- Use env variables in workflows by @mesejo in #332
- Update dependency quartodoc to ^0.7.2 || ^0.9.0 by @renovate[bot] in #333
- Update dependency ruff to v0.7.3 by @renovate[bot] in #338
- Update bitnami/minio docker tag to v2024.11.7 by @renovate[bot] in #337
- Update python version in pyproject by @mesejo in #351
- Update dependency coverage to v7.6.5 by @renovate[bot] in #349
- Update codecov/codecov-action action to v5 by @renovate[bot] in #350
- Update postgres docker tag to v17.1 by @renovate[bot] in #352
- Update dependency pyarrow to v18 by @renovate[bot] in #319
- Update dependency ibis-framework to v9.4.0 by @renovate[bot] in #145
- Update dependency connectorx to v0.4.0 by @renovate[bot] in #334
- RemoteTable bug by @mesejo in #335
Fix dependency issues (adbc-driver-postgresql not installed) in the 0.1.8 release.
- Make git indifferent to changes in use nix/flake by @dlovell in #305
- Update dependency ruff to v0.7.1 by @renovate[bot] in #311
- Ci check for proper package installation by @mesejo in #318
- Check proper installation of examples extras by @mesejo in #320
- Use letsql execute/to_pyarrow/to_pyarrow_batches by @mesejo in #316
- Update dependency pytest-cov to v6 by @renovate[bot] in #322
- Update trinodb/trino docker tag to v463 by @renovate[bot] in #321
- Update dependency snowflake-connector-python to v3.12.3 [security] by @renovate[bot] in #312
- Synchronize dependencies by @dlovell in #317
This version introduces several major changes, most importantly removing the need to register expressions before execution, updating to datafusion 42, and removing heavy rust dependencies such as candle.
- Update dependency ruff to v0.6.4 by @renovate[bot] in #258
- Update changelog command by @mesejo in #259
- Update dependency ruff to v0.6.5 by @renovate[bot] in #265
- Update actions/create-github-app-token action to v1.11.0 by @renovate[bot] in #263
- Update dependency ruff to v0.6.8 by @renovate[bot] in #273
- Disable test_examples temporarily by @mesejo in #284
- Update dependency coverage to v7.6.3 by @renovate[bot] in #283
- Update dependency ruff to v0.6.9 by @renovate[bot] in #285
- Update dependency black to v24.10.0 by @renovate[bot] in #287
- Update codecov/codecov-action action to v4.6.0 by @renovate[bot] in #286
- Update to datafusion v42 by @mesejo in #293
- Update dependency pre-commit to v4 by @renovate[bot] in #291
- Update tests and workflows by @mesejo in #299
- Only run ruff on repo files by @dlovell in #301
- Set postgres env vars by @mesejo in #303
- Update dependency ruff to v0.7.0 by @renovate[bot] in #302
- Fix pre-release workflow by @mesejo in #257
- Update dependency fsspec to v2024.9.0 by @renovate[bot] in #255
- Update dependency datafusion to v40 by @renovate[bot] in #226
- Update rust crate arrow-ord to v53 by @renovate[bot] in #251
- Enable build on macos by @dlovell in #260
- Enable build on macos by @dlovell in #262
- Update rust crate datafusion-common to v42 by @renovate[bot] in #269
- Fix `nix run` issues re SSL and macos temp user dirs by @dlovell
- Fix `nix run` issues re IPYTHONDIR by @dlovell in #264
- Docs deployment by @mesejo in #294
- Remove the requirement of table registration for expr execution by @dlovell in #209
- Remove segment_anything by @mesejo in #295
- Remove tensor functions by @mesejo in #297
In this release, the segment_anything function has been refactored and cleaned up for improved performance and maintainability. The output of segment_anything has also been modified to return the mask and iou_score. Additionally, support for reading CSV files from HTTP sources has been added, along with basic S3 support, enhancing the data ingestion capabilities of the project.
- Update dependency ruff to v0.6.3 by @renovate[bot] in #242
- Refactor and clean segment anything function by @mesejo in #243
- Reading from csv in HTTP, add basic s3 support by @mesejo in #230
- Change output of segment_anything to mask and iou_score by @mesejo in #244
- Bump quinn-proto from 0.11.6 to 0.11.8 by @dependabot[bot] in #249
- Update actions/create-github-app-token action to v1.10.4 by @renovate[bot] in #253
- Bump cryptography from 43.0.0 to 43.0.1 by @dependabot[bot] in #254
This update includes new workflows for testing Snowflake and S3, a dependency update for ruff, and several fixes addressing PyPI release issues, in-memory table registration, and Dask version compatibility.
- Add workflow for testing snowflake by @mesejo in #233
- Add ci workflow for testing s3 by @mesejo in #235
- Update dependency ruff to v0.6.2 by @renovate[bot] in #229
- Issues with release to pypi by @mesejo in #228
- Registration of in-memory tables by @mesejo in #232
- Improve snowflake workflow by @mesejo in #234
- Checkout PR ref by @mesejo in #236
- Fix dask version by @mesejo in #237
The library has seen a lot of active development, with numerous new features and improvements added in various pull requests:
- New functionality has been added, including a pyarrow-based UDAF, postgres and sqlite readers, image/array manipulation functions, and xgboost prediction functions.
- Existing functionality has been enhanced by wrapping ibis backends, updating dependencies, and improving the build/testing process.
- Numerous dependency updates have been made to keep the library up-to-date.
- Some bug fixes and stability improvements have been implemented as well.
- Add pyarrow udaf based on PyAggregator by @mesejo in #108
- Add unit tests based on workflow diagram by @mesejo in #110
- Add postgres read_parquet by @mesejo in #118
- Add wrapper for snowflake backend by @mesejo in #119
- Add read_sqlite and read_postgres by @mesejo in #120
- Add ibis udf and model registration method by @hussainsultan in #182
- Add udf signature and return a partial with model_name by @hussainsultan in #195
- Add image and array manipulation functions by @mesejo in #181
- Add example predict_xgb.py by @dlovell in #213
- Add connectors for using environment variables or fixed examples server by @dlovell in #217
- Add workflow for testing library only dependencies by @mesejo in #223
- Add duckdb and xgboost as dependencies for examples by @mesejo in #216
- Wrap ibis backends by @mesejo in #115
- Unpin pyarrow version by @mesejo in #121
- Update README by @mesejo in #125
- Use options.backend as ParquetCacheStorage's default backend by @mesejo in #123
- Change to publish on release by @mesejo in #122
- Configure Renovate by @renovate[bot] in #124
- Update dependency black to v24 [security] by @renovate[bot] in #126
- Update dependency pure-eval to v0.2.3 by @renovate[bot] in #130
- Update dependency blackdoc to v0.3.9 by @renovate[bot] in #128
- Update dependency pytest to v7.4.4 by @renovate[bot] in #131
- Update actions/create-github-app-token action to v1.10.3 by @renovate[bot] in #127
- Update dependency connectorx to v0.3.3 by @renovate[bot] in #129
- Update dependency snowflake/snowflake-connector-python to v3.11.0 by @renovate[bot] in #141
- Update dependency importlib-metadata to v8.1.0 by @renovate[bot] in #139
- Update dependency ruff to v0.5.4 by @renovate[bot] in #133
- Update dependency black to v24.4.2 by @renovate[bot] in #136
- Update dependency sqlalchemy to v2.0.31 by @renovate[bot] in #134
- Update codecov/codecov-action action to v4.5.0 by @renovate[bot] in #135
- Update dependency codespell to v2.3.0 by @renovate[bot] in #137
- Update dependency coverage to v7.6.0 by @renovate[bot] in #138
- Update dependency sqlglot to v23.17.0 by @renovate[bot] in #142
- Update dependency pre-commit to v3.7.1 by @renovate[bot] in #140
- Update dependency structlog to v24.4.0 by @renovate[bot] in #143
- Update actions/checkout action to v4 by @renovate[bot] in #148
- Update actions/setup-python action to v5 by @renovate[bot] in #149
- Update dependency datafusion/datafusion to v39 by @renovate[bot] in #150
- Update dependency numpy to v2 by @renovate[bot] in #152
- Update dependency duckb/duckdb to v1 by @renovate[bot] in #151
- Update dependency pyarrow to v17 by @renovate[bot] in #153
- Disable pip_requirements manager by @mesejo in #163
- Update dependency pytest-cov to v5 by @renovate[bot] in #159
- Update extractions/setup-just action to v2 by @renovate[bot] in #161
- Update github artifact actions to v4 by @renovate[bot] in #162
- Range for datafusion-common by @renovate[bot] in #166
- Update dependency pytest to v8 by @renovate[bot] in #158
- Update dependencies ranges by @mesejo in #172
- Enable plugin development for backends by @mesejo in #132
- Include pre-commit dependencies in renovatebot scan by @mesejo in #176
- Update dependency ruff to v0.5.5 by @renovate[bot] in #174
- Bump object_store from 0.10.1 to 0.10.2 by @dependabot[bot] in #175
- Update dependency pre-commit to v3.8.0 by @renovate[bot] in #178
- Lock file maintenance, update Cargo TOML by @renovate[bot] in #179
- Refactor flake by @dlovell in #180
- Use poetry2nix overlays by @dlovell
- Enable editable install by @dlovell
- Update dependency ruff to v0.5.6 by @renovate[bot] in #183
- Update dependency coverage to v7.6.1 by @renovate[bot] in #187
- Lock file maintenance by @renovate[bot] in #188
- Collapse ifs by @dlovell
- Enable `nix run` to drop into an ipython shell by @dlovell
- Make key_prefix settable in config/CacheStorage by @dlovell in #196
- Update dependency ruff to v0.5.7 by @renovate[bot] in #197
- Bump aiohttp from 3.9.5 to 3.10.2 by @dependabot[bot] in #212
- Lock file maintenance by @renovate[bot] in #207
- Return wrapper with model_name partialized by @hussainsultan
- Update links to data files by @mesejo in #214
- Update dependency ruff to v0.6.0 by @renovate[bot] in #215
- Update gbdt-rs repo url by @mesejo in #220
- Make gbdt-rs dependency unambiguous by @mesejo in #222
- Use postgres.connect_examples() and TemporaryDirectory by @mesejo in #219
- Update dependency ruff to v0.6.1 by @renovate[bot] in #218
- Register cache tables when executing to_pyarrow by @mesejo in #114
- Update dependency fsspec to v2024.6.1 by @renovate[bot] in #144
- Update rust crate pyo3 to 0.21 by @renovate[bot] in #146
- Update tokio-prost monorepo to 0.13.1 by @renovate[bot] in #147
- Update rust crate datafusion range to v40 by @renovate[bot] in #165
- Update rust crate datafusion-* to v40 by @renovate[bot] in #167
- Widen dependency dask range to v2024 by @renovate[bot] in #164
- Enable build on macos by @dlovell
- Conditionally include libiconv in maturinOverride by @dlovell
- Update dependency attrs to v24 by @renovate[bot] in #185
- Return proper type in get_log_path by @dlovell
- Use pandas backend in SourceStorage by @mesejo
- Update rust crate datafusion to v41 by @renovate[bot] in #203
- Remove warnings and deprecated palmerpenguins package by @mesejo in #113
- Remove so that the udf keeps its metadata by @hussainsultan in #198
- @renovate[bot] made their first contribution in #218
- @dependabot[bot] made their first contribution in #212
- Api letsql api methods by @mesejo in #105
- Prepare for release 0.1.4 by @mesejo in #107
- 0.1.4 by @mesejo in #109
- Add docker start to ci-test by @mesejo
- Poetry: add poetry checks to .pre-commit-config.yaml by @dlovell
- Add source cache by default by @mesejo
- Test_cache: add test_parquet_cache_storage by @dlovell
- Add rust files by @dlovell
- Add new cases to DataFusionBackend.register by @dlovell
- Add client tests for new register types by @dlovell
- Add faster function for CachedNode removal by @mesejo
- Add optimizations for predict_xgb in datafusion by @mesejo in #16
- Lint: add args to poetry pre-commit invocation by @dlovell in #20
- Add TableProvider for ibis Table by @mesejo in #21
- Add filter pushdown for ibis.Table TableProvider by @mesejo in #24
- Add .sql implementation by @mesejo in #28
- Add automatic testing for examples dir by @mesejo in #45
- Add docs by @mesejo in #51
- Add better snowflake caching by @dlovell in #49
- Add docs-preview workflow by @mesejo in #54
- Add missing extras to poetry install in docs workflow by @mesejo in #58
- Add start of services to workflow by @mesejo in #59
- Add docs deploy workflow by @mesejo in #55
- Add array functions by @mesejo in #60
- Add registering of arbitrary expressions by @mesejo in #64
- Add generic functions by @mesejo in #66
- Add hashing of duckdb parquet files by @mesejo in #67
- Add numeric functions by @mesejo in #80
- Add `ls` accessor for Expr by @dlovell in #81
- Add greatest and least functions by @mesejo in #98
- Add temporal functions by @mesejo in #99
- Add StructColumn and StructField ops by @mesejo in #102
- Add SnapshotStorage by @dlovell in #103
- Improve performance and ux of predict_xgb by @mesejo
- Improve performance and ux of predict_xgb by @mesejo in #8
- Fetch only the required features for the model by @mesejo
- Fetch only the required features for the model by @mesejo in #9
- Organize the letsql package by @mesejo
- Lint by @dlovell
- Define CacheStorage with deterministic hashing for keys by @mesejo
- Define KEY_PREFIX to identify letsql cache by @dlovell
- Conftest: define expected_tables, enforce test fixture table list by @dlovell
- Lint by @dlovell
- Update poetry.lock by @dlovell
- Enable registration of pyarrow.RecordBatchReader and ir.Expr by @mesejo in #13
- Update CONTRIBUTING.md with instructions to run Postgres by @mesejo
- Register more dask normalize_token types by @dlovell in #17
- Enable flake to work on both linux and macos by @dlovell in #18
- Clean up development and ci/cd workflows by @mesejo in #19
- Temporal readme by @mesejo
- Publish test coverage by @mesejo in #31
- Update project files README, CHANGELOG and pyproject.toml by @mesejo in #30
- Expose TableProvider trait in Python by @mesejo in #29
- Clear warnings, bump up datafusion version to 37.1.0 by @mesejo in #33
- Update ibis version by @mesejo in #34
- Xgboost is being deprecated by @hussainsultan in #40
- Drop connection handling by @mesejo in #36
- Refactor _register_and_transform_cache_tables by @mesejo in #44
- Improve postgres table caching / cache invalidation by @dlovell in #47
- Make engines optional extras by @dlovell in #50
- SourceStorage: special case for cross-source caching by @dlovell in #63
- Problem with multi-engine execution by @mesejo in #70
- Clean test_execute and move tests from test_isolated_execution by @mesejo in #79
- Move cache related tests to test_cache.py by @mesejo in #88
- Give ParquetCacheStorage a default path by @dlovell in #92
- Update to datafusion version 39.0.0 by @mesejo in #97
- Make cache default path configurable by @mesejo in #101
- V0.1.3 by @mesejo in #106
- Filter bug solved by @mesejo
- Set stable ibis dependency by @mesejo
- Failing ci by @mesejo
- Pyproject: specify rev when using git ref, don't use [email protected] by @dlovell
- Pyproject: make pyarrow,datafusion core dependencies by @dlovell
- Run `poetry lock --no-update` by @dlovell
- Use _load_into_cache in _put by @mesejo
- _cached: special case for name == "datafusion" by @dlovell
- ParquetCacheStorage: properly create cache dir by @dlovell
- Local cache with parquet storage by @mesejo
- Fix mac build with missing source files by @hussainsultan
- Allow for multiple execution of letsql tables by @mesejo in #41
- Fix import order using ruff by @mesejo in #37
- Mismatched table names causing table not found error by @mesejo in #43
- Ensure nonnull-ability of columns works by @dlovell in #53
- Explicitly install poetry-plugin-export per warning message by @dlovell in #61
- Update make_native_op to replace tables by @mesejo in #75
- Normalize_memory_databasetable: incrementally tokenize RecordBatchs by @dlovell in #73
- Cannot create table by @mesejo in #74
- Handle case of table names during con.register by @mesejo in #77
- Use sqlglot to generate escaped name string by @dlovell in #85
- Register table on caching nodes by @mesejo in #87
- Ensure snowflake tables have their Namespace bound on creation by @dlovell in #91
- Change name of parameter in replace_table function by @mesejo in #94
- Return native_dts, not sources by @dlovell in #95
- Displace offsets in TimestampBucket by @mesejo in #104
- Pyproject: remove redundant and conflicting dependency specifications by @dlovell
- Remove macos test suite by @mesejo
- Remove optimizer.py by @mesejo in #14
- Remove redundant item setting _sources on registering the cache nodes by @mesejo in #90
- Add missing dependencies by @mesejo
- Add CONTRIBUTING.md
- Address problems with schema
- Nix: add flake.nix and related files by @dlovell
- Add db package for showing predict udf working by @mesejo
- Add db package for showing predict udf working by @mesejo in #1
- Remove xgboost as dependency by @mesejo
- Add register and client functions
- Add testing of api
- Add isnan/isinf and fix offset
- Add udf support
- Add new string ops, remove typo
- Test array, temporal, string and udf
- Start adding wrapper
- Prepare for release