Commit
[FSTORE-1285] Model Dependent Transformation Functions (#1308)
* hopsworks_udf first version
* working code for running hopsworks udf without saving in backend using python client
* removing debugging logs
* statistics working with python client
* basic functionality working with backend
* code with statistics working and saved to backend
* working code for feature vector
* reformatted and documented Hopsworks UDF class
* unit tests for transformation functions
* cleaning transformations engine and adding unit tests
* feature view api formatted
* reformatting and fixing feature_view_engine
* reformatted and added unit tests for feature view
* updating documentation for feature store
* updating documentation for feature store
* fixed tests for training dataset features
* reformatted and added unit tests for python engine
* most unit tests fixed
* all unit tests working
* removed print
* adding test for hopsworks_udf
* correcting merge for vector server
* reformatting with ruff
* fixing vector server
* fixing docs
* fixing vector server
* fixing built-in transformations
* correcting get feature vector
* adding missed changes for built-in transformations
* shallow copying scope dictionary to not overwrite statistics variable for different UDFs having same statistics parameter name
* adding deep copy to create multiple transformation functions with different features
* sorting transformation function to maintain consistent order
* sorting transformation functions in transformation function engine to maintain same order
* using feature view transformation functions
* addressing review comments
* using PYARROW_EXTENSION_ENABLE during import rather than as a function
* skipping transformation function test in windows spark udf failing due to dependencies with Great Expectations
* changing transformed_feature_vector_col_name to transformed_features to obtain feature names after transformations
* adding property transformed_features in feature view to obtain feature names after transformations
* updating doc string and adding property decorator missed during rebase
* refactoring transformation functions to update parsing of statistics parameters and also renaming decorator name
* refactoring transformation functions to update parsing of statistics parameters and also renaming decorator name
* reformatting with ruff
* adding statistics to udf only if required
* converting extended statistics to dictionary
* sorting built-in label encoder to maintain consistency
* adding type hints for class TransformationStatistics
* adapting to backend update of returning output_types, transformation_features and statistics_argument_names as Lists
* fixing unit tests
* removing space in doc string
* replace - from output column names with _
* reverting unwanted spark test _ replace changes
* adding missed import
* correcting to_dict feature view
* reverting python.py unintentional changes during rebase
* rebase and adding back missed import
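
The scope-copy fix mentioned in the message can be illustrated with a minimal, library-independent sketch. It does not reproduce the code in this commit: `SOURCE`, `build_udf`, `copy_scope`, and the `scale` function are hypothetical names chosen for illustration, and the sketch only assumes that UDF source is executed into a scope dictionary where a `statistics` variable is injected by name.

```python
# Hypothetical UDF source; `statistics` is resolved from the function's
# globals (the scope dictionary) at call time.
SOURCE = """
def scale(value):
    span = statistics["max"] - statistics["min"]
    return (value - statistics["min"]) / span
"""


def build_udf(scope, statistics, copy_scope=True):
    """Compile SOURCE with `statistics` injected into its scope.

    With copy_scope=True each UDF gets its own shallow copy of `scope`,
    so UDFs that use the same statistics parameter name stay independent.
    """
    target = dict(scope) if copy_scope else scope  # shallow copy
    target["statistics"] = statistics
    exec(SOURCE, target)
    return target["scale"]


# Separate copies: each function keeps the statistics it was built with.
shared = {}
udf_a = build_udf(shared, {"min": 0, "max": 10})
udf_b = build_udf(shared, {"min": 0, "max": 100})
print(udf_a(5), udf_b(5))

# Shared scope: the second call overwrites `statistics` for both functions.
shared_bad = {}
bad_a = build_udf(shared_bad, {"min": 0, "max": 10}, copy_scope=False)
bad_b = build_udf(shared_bad, {"min": 0, "max": 100}, copy_scope=False)
print(bad_a(5), bad_b(5))
```

A shallow copy is enough here because only the top-level `statistics` binding differs between UDFs; the remaining entries of the scope can be shared safely as long as they are not mutated.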