[FSTORE-1285] Model Dependent Transformation Functions #1308

Merged
merged 56 commits into from
Jul 11, 2024

manu-sj
Contributor

@manu-sj manu-sj commented Apr 29, 2024

This PR implements model-dependent transformation functions as described in the design log: https://hopsworks.atlassian.net/wiki/spaces/FST/pages/388464653/Model+Dependent+Transformation+Proposal+-+version+2

Changes Performed

  • Added a decorator called hopsworks_udf that should be used to define transformation functions.

  • Added the ability to obtain statistics within transformation functions.

  • Added support for one-to-many and many-to-many transformation functions.

  • Transformation functions are executed as pandas_udfs in the Spark engine and as regular pandas functions in the Python engine.

  • Added the ability to attach transformation functions to a feature view without explicitly saving them to the backend through the transformations API.

  • Transformation functions are no longer attached to training dataset features; they are now attached to the feature view directly. As a result, the DTO class transformation_functions_attached has been removed.
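
The first three changes above (a decorator that marks transformation functions, access to statistics inside them, and one-to-many outputs) can be illustrated with a small standalone sketch. It does not use hsfs or pandas: `sketch_udf`, `FeatureStats`, `apply_udf`, and the output column naming are hypothetical stand-ins chosen for illustration, not the API or behavior introduced by this PR.

```python
# Illustrative sketch only. `sketch_udf`, `FeatureStats`, and `apply_udf` are
# hypothetical names and do NOT reflect the hsfs API added in this PR.
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence


@dataclass
class FeatureStats:
    """Statistics for a single feature (hypothetical container)."""
    min: float
    max: float


def sketch_udf(return_types: Sequence[type]) -> Callable:
    """Mark a function as a transformation and record one type per output column."""
    def decorator(func: Callable) -> Callable:
        func.return_types = list(return_types)
        return func
    return decorator


@sketch_udf(return_types=[float])
def min_max_scale(values: List[float], stats: Optional[FeatureStats] = None) -> List[list]:
    """One-to-one: scale a column to [0, 1] using the supplied statistics."""
    span = stats.max - stats.min
    return [[(v - stats.min) / span for v in values]]


@sketch_udf(return_types=[int, int])
def split_minutes(values: List[int], stats: Optional[FeatureStats] = None) -> List[list]:
    """One-to-many: split a column of minutes into hours and leftover minutes."""
    return [[v // 60 for v in values], [v % 60 for v in values]]


def apply_udf(func: Callable, column: str, values: list,
              stats: Optional[FeatureStats] = None) -> Dict[str, list]:
    """Run a decorated function and name its output columns.

    The naming scheme (function name, input column, optional index) is an
    assumption made for this sketch.
    """
    outputs = func(values, stats)
    if len(outputs) != len(func.return_types):
        raise ValueError("number of outputs does not match declared return types")
    if len(outputs) == 1:
        names = [f"{func.__name__}_{column}"]
    else:
        names = [f"{func.__name__}_{column}_{i}" for i in range(len(outputs))]
    # Cast every value to the declared type of its output column.
    return {
        name: [cast(v) for v in col]
        for name, cast, col in zip(names, func.return_types, outputs)
    }


scaled = apply_udf(min_max_scale, "amount", [0.0, 5.0, 10.0], FeatureStats(min=0.0, max=10.0))
# scaled == {"min_max_scale_amount": [0.0, 0.5, 1.0]}

split = apply_udf(split_minutes, "duration", [135, 59])
# split == {"split_minutes_duration_0": [2, 0], "split_minutes_duration_1": [15, 59]}
```

Recording the return types on the function is what lets a caller know, before execution, how many output columns a transformation produces. The actual hopsworks_udf decorator and its handling of statistics are defined in python/hsfs/hopsworks_udf.py and python/hsfs/transformation_statistics.py in this PR.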

JIRA Issue: https://hopsworks.atlassian.net/browse/FSTORE-1285?atlOrigin=eyJpIjoiYjYwZDAyM2ZjZjNkNDU5MWEyNmRjN2MxZjNiZmNhZmEiLCJwIjoiaiJ9

Related PRs:

How Has This Been Tested?

  • Unit Tests
  • Integration Tests
  • Manual Tests on VM

Checklist For The Assigned Reviewer:

- [ ] Checked if merge conflicts with master exist
- [ ] Checked if stylechecks for Java and Python pass
- [ ] Checked if all docstrings were added and/or updated appropriately
- [ ] Ran spellcheck on docstring
- [ ] Checked if guides & concepts need to be updated
- [ ] Checked if naming conventions for parameters and variables were followed
- [ ] Checked if private methods are properly declared and used
- [ ] Checked if hard-to-understand areas of code are commented
- [ ] Checked if tests are effective
- [ ] Built and deployed changes on dev VM and tested manually
- [x] (Checked if all type annotations were added and/or updated appropriately)

@manu-sj manu-sj marked this pull request as ready for review May 14, 2024 07:33
@manu-sj manu-sj requested a review from kennethmhc May 14, 2024 14:40
@manu-sj manu-sj requested a review from davitbzh May 16, 2024 08:19
Contributor

@kennethmhc kennethmhc left a comment

Some general comments

python/hsfs/hopsworks_udf.py (outdated, resolved)
python/hsfs/feature_store.py (outdated, resolved)
python/hsfs/engine/spark.py (resolved)
python/hsfs/core/vector_server.py (outdated, resolved)
python/hsfs/feature_view.py (resolved)
python/hsfs/engine/python.py (outdated, resolved)
@manu-sj manu-sj force-pushed the FSTORE-1285 branch 2 times, most recently from 206a533 to 92324cf Compare May 28, 2024 06:25
Contributor

@kennethmhc kennethmhc left a comment

LGTM!

@manu-sj manu-sj force-pushed the FSTORE-1285 branch 2 times, most recently from 149e0d2 to 50e944c Compare June 11, 2024 08:08
python/hsfs/transformation_statistics.py (outdated, resolved)
python/hsfs/transformation_statistics.py (outdated, resolved)
manu-sj added 27 commits July 10, 2024 15:27
…for different udf's having same statistics parameter name
…to obtain feature names after transformations
…n_features and statistics_argument_names as Lists
@manu-sj manu-sj merged commit a4efba1 into logicalclocks:master Jul 11, 2024
13 checks passed
2 participants