Backfill job fails when sharding on datetime[ns] #1291
Comments
We would need to see the logs to know what's going on. It doesn't look like it in your snippet, but are you also setting the |
Sure. Here are the logs. You can also replicate the behavior locally by running the snippet I shared above.
No, I'm not specifying an event time key. I didn't quite get what the implications of this are. In particular: (1) does this refer to the time at which the entry into the feature group is made, or can I specify it manually; (2) will it become a primary/unique key in the table; and (3) what does this mean for the table's sharding behavior? Part (3) is particularly interesting to me. BigQuery can select which shards to read, which reduces the amount of data processed (and thus cost) and the query time (less data to load). Ideally I would like to re-create this behavior in Hopsworks, and my attempt at doing so was to use a |
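To make the shard-selection idea in point (3) concrete, here is a minimal, library-free sketch of partition pruning. It is only an illustration of the general technique, not the Hopsworks or BigQuery API: the helper names `partition_by_day` and `read_range` and the sample rows are hypothetical, and the choice of a day-granularity partition key is an assumption made for the example.

```python
from collections import defaultdict
from datetime import date, datetime


def partition_by_day(rows):
    """Group rows into partitions keyed by the calendar day of their timestamp."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["ts"].date()].append(row)
    return dict(partitions)


def read_range(partitions, start, end):
    """Return rows from partitions whose day key falls in [start, end].

    Only the matching partitions are read; all others are skipped entirely,
    which is where the savings in data processed come from.
    """
    selected = [day for day in sorted(partitions) if start <= day <= end]
    return [row for day in selected for row in partitions[day]]


# Hypothetical sample data: four rows spread over three days.
rows = [
    {"id": 1, "ts": datetime(2023, 1, 1, 8, 30)},
    {"id": 2, "ts": datetime(2023, 1, 1, 17, 5)},
    {"id": 3, "ts": datetime(2023, 1, 2, 9, 0)},
    {"id": 4, "ts": datetime(2023, 1, 3, 12, 45)},
]

partitions = partition_by_day(rows)
recent = read_range(partitions, date(2023, 1, 2), date(2023, 1, 3))
```

With a coarse key such as the day, the number of partitions stays small and a date-range query touches only the relevant ones. Partitioning directly on a full-resolution timestamp would instead tend to produce roughly one partition per distinct value.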
Chances are that this is a user error on my part, though I couldn't work it out from the docs. Figured I'll ask here so that we can see if there is a way to improve the docs and/or if there is an issue.
I'm trying to create a sharded/partitioned feature group that uses both a primary key and a partition key. While the feature group is created successfully, I can't seem to insert data into it:
Here is a link to the (failed) backfill job: https://c.app.hopsworks.ai/p/16549/jobs/named/foo_1_offline_fg_backfill/executions
(I can also share the logs if necessary.)