Releases: abstractqqq/polars_ds_extension
v0.6.2
Deprecations and Updates
- Renamed a few expressions. The "query_" prefix has been dropped from many features but is kept for some others. E.g. instead of "query_lstsq", the expression is now "pds.lin_reg".
What's Changed
- added Matthews correlation coefficient by @abstractqqq in #276
- general maintenance by @abstractqqq in #277
- added auto_corr and ar_coeffs by @abstractqqq in #278
- added lcs_seq by @abstractqqq in #279
- Update polars by @abstractqqq in #280
- added basic splitting by @abstractqqq in #281
- fixes #282 by @abstractqqq in #283
- Misc features + Isotonic Regression by @abstractqqq in #284
- Add monotonic checks by @abstractqqq in #285
Full Changelog: v0.6.1...v0.6.2
v0.6.1
What's Changed
- fixed bug in pca by @abstractqqq in #268
- added seed option for perturb and jitter by @abstractqqq in #269
- added multi target lr as an expression by @abstractqqq in #270
- added weights in lstsq plot by @abstractqqq in #272
- Simple lstsq by @abstractqqq in #273
- Fix bug in online lr by @abstractqqq in #274
Full Changelog: v0.6.0...v0.6.1
v0.6.0
Breaking
Dropped Python 3.8 support.
What's Changed
- added elastic net by @abstractqqq in #252
- try notebook in CI by @abstractqqq in #253
- Code reorg by @abstractqqq in #254
- simplified fft convolve by @abstractqqq in #255
- less allocation in fft convolve by @abstractqqq in #256
- added the most basic fairness metrics by @abstractqqq in #257
- Misc readme by @abstractqqq in #258
- Pykdtree by @abstractqqq in #259
- added docs on spatial by @abstractqqq in #260
- added elastic net model by @abstractqqq in #261
- Use Altair instead of plotly as plot backend by @abstractqqq in #266
New Contributors
- @AtlasPilotPuppy made their first contribution in #263
Full Changelog: v0.5.3...v0.6.0
v0.5.3
Breaking
- Lstsq queries no longer take the "method" keyword. Whether the fit is ridge or lasso is determined by the l1_reg and l2_reg parameters.
Deprecation Plans
- v0.5.3 will be the last v0.5 version. Support for Polars < v1 will be dropped from v0.6 onwards.
What's Changed
- Lstsq ux by @abstractqqq in #235
- Mase by @abstractqqq in #236
- Wls by @abstractqqq in #238
- removed unnecessary options from kdt by @abstractqqq in #239
- Higher dim perf by @abstractqqq in #240
- added poly features by @abstractqqq in #241
- Faster kdtree construction by @abstractqqq in #242
- better default behavior for encoders by @abstractqqq in #243
- Less pointers in kdt by @abstractqqq in #244
- slightly improved knn str by @abstractqqq in #245
- added solver methods, refactored linear regression py class by @abstractqqq in #246
- Query sim cnt by @abstractqqq in #247
- Online lr 2 by @abstractqqq in #248
- normal lstsq with rcond by @abstractqqq in #249
- fixed legacy bugs in examples by @abstractqqq in #250
- added discrete entropy as a shorthand by @abstractqqq in #251
Full Changelog: v0.5.2...v0.5.3
v0.5.2
What's Changed
- added rank hot encoder | fixed a bug in the initial values of recursive lstsq by @abstractqqq in #223
- run bench by @abstractqqq in #224
- recursive rolling l2 by @abstractqqq in #226
- added filters in pipeline by @abstractqqq in #228
- added scriptable steps by @abstractqqq in #229
- Skip null policy by @abstractqqq in #232
- KNN queries now work correctly in .over() context by @abstractqqq in #232
- added knn freq cnts by @abstractqqq in #233
Full Changelog: v0.5.1...v0.5.2
v0.5.1
What's Changed
- L1 regression by @abstractqqq in #203
- Fix ridge regression by @abstractqqq in #204
- new readme by @abstractqqq in #205
- added epsilon for knn by @abstractqqq in #206
- fixed some typos by @abstractqqq in #207
- faster l1 l2 by @abstractqqq in #208
- removed a version of kdt that is not performant by @abstractqqq in #209
- added linear model class by @abstractqqq in #212
- added shrink_dtype in pipeline by @abstractqqq in #213
- Recursive linear reg by @abstractqqq in #214
- reorg features by @abstractqqq in #215
- fixed a pathing issue by @abstractqqq in #216
- added max bound option for knn by @abstractqqq in #217
- fix_typing by @abstractqqq in #218
- Initial pytest-bench setup by @CangyuanLi in #219
- Null policy by @abstractqqq in #220
- Rolling lr by @abstractqqq in #221
- Knn regression by @abstractqqq in #222
Full Changelog: v0.5.0...v0.5.1
v0.5.0
What's Changed
- better extreme case handling for roc auc by @abstractqqq in #187
- Mann whitney stats by @abstractqqq in #189
- updated polars ver by @abstractqqq in #190
- added polars v1 support by @abstractqqq in #191
- Refactor longest streak by @abstractqqq in #192
- added impute_nan by @abstractqqq in #194
- added bicor (biweight midcorrelation) by @abstractqqq in #195
- added eager execution on series by @abstractqqq in #197
- Kdt refactor. All kd tree related queries should be at least 10-20% faster now. by @abstractqqq in #198
- removed modules that don't really belong by @abstractqqq in #199
- removed unneeded rust dependency by @abstractqqq in #200
- Added L2 (ridge) regression option by @abstractqqq in #201
v0.5 Near Term Goals
- Pipeline feature parity with categorical_encoder and Scikit-learn, with some exceptions
- More thorough testing with v1
- Uncaught Bugs / simple feature requests
- Interpolations
v0.5 + Longer Term Goals
- Lasso regression. Rolling regression.
- Stand alone models (linear regression and Kdtree)
- K-means / K-medoids
Full Changelog: v0.4.6...v0.5.0
v0.4.6
Warning
Support for the Polars 1.0 alpha and beta releases is not fully tested!
What's Changed
- Expand target types in encoders by @abstractqqq in #166
- added custom transforms by @abstractqqq in #167
- Gaussian noise by @abstractqqq in #168
- simplified target in pipeline by @abstractqqq in #169
- features in, out checks in pipeline, c3 stats, CID CE by @abstractqqq in #170
- add query_confusion_matrix by @CangyuanLi in #171
- better binary confusion matrix calculation by @abstractqqq in #172
- added funding by @abstractqqq in #173
- added winsorizing by @abstractqqq in #175
- added discord by @abstractqqq in #177
- fixed an argument name inconsistency due to polars version by @abstractqqq in #179
- added json serialization for pipeline by @abstractqqq in #181
- added some time series related statistics by @abstractqqq in #182
- Debug linear impute by @abstractqqq in #184
- better profile by @abstractqqq in #185
- added docs by @abstractqqq in #186
Full Changelog: v0.4.5...v0.4.6
v0.4.5
Breaking Changes
Previously, to compute the edit distance between a column and a single string, you would write
df.select(
    pds.str_leven(pl.col("c"), "word")
)
Now you have to write
df.select(
    pds.str_leven(pl.col("c"), pl.lit("word"))
)
The previous call will now look for a column named "word" instead of using the literal string "word".
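When migrating calls like the one above, it can help to sanity-check the returned distances against a plain-Python reference. The following is a minimal sketch of the classic Levenshtein edit distance. The `levenshtein` helper and the sample `column` list are hypothetical names used only for illustration. They are not part of polars_ds and do not reflect how the library computes the distance internally.

```python
def levenshtein(a: str, b: str) -> int:
    """Reference edit distance: minimum number of single-character
    insertions, deletions, or substitutions turning `a` into `b`.
    Illustrative helper only, not the polars_ds implementation."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    # prev[j] = distance between the empty prefix of a and b[:j]
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete ca
                    curr[j - 1] + 1,     # insert cb
                    prev[j - 1] + cost,  # substitute, or match at no cost
                )
            )
        prev = curr
    return prev[-1]


# Hypothetical stand-in for the values of column "c",
# each compared against the literal string "word".
column = ["word", "world", "sword", "ward"]
distances = [levenshtein(s, "word") for s in column]
print(distances)
```

Each value in `distances` corresponds to one row of the column compared against the single literal, which is the comparison the `pl.lit("word")` form expresses.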
What's Changed
- refactored psi and add benford law by @abstractqqq in #158
- Skip some rows in KNN by @DGolubets in #157
- Added data_mask in knn_ptwise by @abstractqqq in #161
- Pretty Code Snippets by @s1lvester in #162
- Tests for entropies by @abstractqqq in #163
- Woe encoding by @abstractqqq in #164
- Added Median Absolute Deviation and renamed str2 to string by @abstractqqq in #165
New Contributors
- @DGolubets made their first contribution in #157
- @s1lvester made their first contribution in #162
Full Changelog: v0.4.4...v0.4.5
v0.4.4
What's Changed
- added par option in convolve by @abstractqqq in #146
- More dia plots by @abstractqqq in #147
- fixed corr method bug by @abstractqqq in #148
- Add string pre-preprocessing code by @CangyuanLi in #150
- String Cleaning by @CangyuanLi in #152
- Pipeline by @abstractqqq in #155
New Contributors
- @CangyuanLi made their first contribution in #150
Full Changelog: v0.4.3...v0.4.4