[FSTORE-1064] Improve docs for spine groups #1137
base: master
@jimdowling - you need to format the changes with |
select label from the left feature group, so that we don't need to provide a spine
for online serving.
!!! note
    Spine Groups are not currently supported for online serving.
I think this is misleading. Spine feature groups have no meaning in serving, as they represent labels and the online feature store only stores the most recent version of the data.
Instead of using a feature group to save a label/prediction target, you can use a spine together with a dataframe containing the labels and join keys for features in other feature groups.
A Spine is essentially a metadata object similar to a feature group; however, its data is not stored in the feature store.
The Spine stored in the feature store only contains the needed metadata such as the name, version, primary key column(s), and event time column.
The Spine DataFrame is provided when you need to (1) create training data and (2) create batch inference data. The Spine DataFrame should also contain any join keys (primary keys to other feature groups) needed to join features included in a feature view containing the Spine group. If you don’t include the event_time in the Spine DataFrame (such as in batch inference), it will retrieve the latest feature value for that feature using the join key(s).
Are you sure about the last part? "If you don’t include the event_time in the Spine DataFrame (such as in batch inference), it will retrieve the latest feature value for that feature using the join key(s)."
I don't think that's the case. If you don't provide the event_time, I think Hopsworks falls back to a non-point-in-time (non-PIT) join, which doesn't guarantee time ordering. You most likely end up with multiple rows for each primary key in the spine feature group (one for each event in the joined feature group).
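To make the concern concrete, here is a minimal pandas sketch of the difference between a plain key join and a point-in-time join. It is only an illustration of the two join semantics, not Hopsworks code, and it does not confirm which one Hopsworks applies when event_time is missing. The frames and the column names (`customer_id`, `amount`, `label`) are made up for the example.

```python
import pandas as pd

# Hypothetical spine: one row per entity, with the time the label was observed.
spine = pd.DataFrame({
    "customer_id": [1, 2],
    "event_time": pd.to_datetime(["2024-01-10", "2024-01-10"]),
    "label": [0, 1],
})

# Hypothetical feature group: several events per primary key.
features = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "event_time": pd.to_datetime(
        ["2024-01-01", "2024-01-05", "2024-01-20", "2024-01-03", "2024-01-09"]
    ),
    "amount": [10.0, 20.0, 30.0, 5.0, 7.0],
})

# Plain key join (spine without event_time): every feature event
# for a key matches, so each spine row fans out into several rows.
plain = spine.drop(columns="event_time").merge(features, on="customer_id")

# Point-in-time join: for each spine row, take the most recent feature
# row at or before the spine's event_time, per customer_id.
pit = pd.merge_asof(
    spine.sort_values("event_time"),
    features.sort_values("event_time"),
    on="event_time",
    by="customer_id",
    direction="backward",
)

print(len(plain), "rows from the plain join")
print(len(pit), "rows from the point-in-time join")
print(pit[["customer_id", "amount"]])
```

In this toy data the plain join returns one row per feature event, while the point-in-time join returns exactly one row per spine row. If the docs want to promise "the latest feature value", the behaviour without event_time should be verified against something like this.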
This PR adds/fixes/changes...
JIRA Issue: -
https://hopsworks.atlassian.net/browse/FSTORE-1064
Priority for Review: -
Related PRs: -
How Has This Been Tested?
Checklist For The Assigned Reviewer: