The goal is to use Snowflake’s Model Registry to store and deploy the pygam model, leveraging Snowflake's Snowpark and ML features. However, issues arise when registering the model due to compatibility constraints with the external package pygam.
Here is a minimal example that I tried to run in a Snowflake Notebook (I was able to use pygam via stage packages within Snowflake notebook):
```python
# Import necessary Snowpark and Snowflake ML modules
import numpy as np
import pandas as pd
from pygam import LinearGAM, s

from snowflake.snowpark.context import get_active_session
from snowflake.ml.model import custom_model
from snowflake.ml.model import model_signature
from snowflake.ml.registry import Registry

session = get_active_session()

# Step 1: Generate synthetic data for model training
np.random.seed(0)
X = np.linspace(0, 10, 100).reshape(-1, 1)  # 100 samples, single feature
y = np.sin(X).ravel() + np.random.normal(scale=0.1, size=X.shape[0])  # noisy sine wave

# Step 2: Train a simple pygam model with smoothing on the synthetic data
pygam_model = LinearGAM(s(0)).fit(X, y)

# Step 3: Test the model by making predictions on new data
X_test = np.linspace(0, 10, 10).reshape(-1, 1)  # new test data
predictions = pygam_model.predict(X_test)
print("Predictions on new data:", predictions)  # expected output for verification

# Step 4: Define a Custom Model class to wrap pygam in Snowflake
class PyGAMModel(custom_model.CustomModel):
    @custom_model.inference_api
    def predict(self, X: pd.DataFrame) -> pd.DataFrame:
        model_output = self.context["models"].predict(X)
        return pd.DataFrame(model_output)

# Step 5: Create the Model Context and pass in the trained pygam model
mc = custom_model.ModelContext(
    models=pygam_model
)

# Instantiate the Custom Model (named so it does not shadow the fitted model)
custom_pygam_model = PyGAMModel(mc)

# Test prediction in the custom model context
output_pd = custom_pygam_model.predict(X_test)
print("Custom Model Prediction Output:", output_pd)

# Step 6: Register the model in the Snowflake Model Registry
registry = Registry(
    session=session,
    database_name="YOUR_DATABASE",  # replace with your database
    schema_name="YOUR_SCHEMA",      # replace with your schema
)

# Attempt to log the model -- this is where registration fails
registry.log_model(
    model=custom_pygam_model,
    model_name="pygam_model",
    version_name="v1",
    sample_input_data=X_test,
    comment="Test deployment of a pygam model as a Custom Model",
)
```
I tried `conda_dependencies`, `pip_requirements`, `ext_modules`, and `code_paths` without success. Support for arbitrary pip package installation (beyond the Snowflake Anaconda Channel) in the Snowflake Model Registry would significantly improve the flexibility of deploying custom models with niche or specialized packages like pygam. Is there any solution to this problem currently available? This is a big showstopper for us in migrating to Snowflake's ML features.
benleit changed the title to *Register custom_model with custom python package pygam* on Nov 12, 2024
First of all, you are defining a custom model for pygam because pygam is not available in the Snowflake conda channel. To make it run in a warehouse, you can package the library yourself, as long as all of its dependencies are available in the Snowflake conda channel. If so, there are two things to note:

1. Make sure to include all of pygam's dependencies as dependencies of your model.
2. The entire pygam Python code needs to be packaged with the model itself (the MODEL object must be self-contained and cannot refer to a stage).
Then comes the confusion around the CustomModel API:

3. Every object in the `models` attribute of `ModelContext` must be a model type known to the registry. In this case, pygam is not known to the registry, so you get the error `assert handler is not None` (that is, we do not know which handler to use).
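To make point 3 concrete, here is an illustrative sketch of the dispatch logic (this is made-up code, not Snowflake's actual implementation, and the handler names are hypothetical): the registry maps recognized model types to serialization handlers, and an unrecognized type resolves to `None`, which is what trips the assertion.

```python
# Hypothetical sketch of registry handler dispatch -- not Snowflake's
# real code. Recognized model types map to handlers; anything else
# (like a raw pygam model placed in ModelContext.models) gets None.
KNOWN_HANDLERS = {
    "sklearn": "SKLearnHandler",
    "xgboost": "XGBoostHandler",
    "custom_model": "CustomModelHandler",
}

def find_handler(model_type: str):
    """Return the handler name for a model type, or None if unknown."""
    return KNOWN_HANDLERS.get(model_type)

handler = find_handler("pygam")
print(handler)  # None -> in the registry this trips `assert handler is not None`
```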
With that, here is what I would suggest:

a. Follow (1).
b. Instead of (2), pickle the model with the pygam library included by value (not by reference), so that the pickle file contains all the Python modules it needs.
c. Pass the pickle file in the `artifacts` attribute of the context, then load the pickle file in `__init__`.
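Steps (b) and (c) can be sketched as follows. This is a minimal, self-contained illustration: it uses a stand-in `TinyModel` class and stdlib `pickle` so it runs anywhere. In the real code you would pickle `pygam_model` instead (e.g. after `cloudpickle.register_pickle_by_value(pygam)` so the package travels inside the pickle by value), subclass `custom_model.CustomModel`, and pass the file via `custom_model.ModelContext(artifacts={...})`.

```python
import os
import pickle
import tempfile

# Stand-in for the trained pygam model; the real code would pickle
# pygam_model with the pygam package embedded by value.
class TinyModel:
    def predict(self, xs):
        return [2 * x for x in xs]

# Step (b): serialize the model to a file that will ship as an artifact.
artifact_path = os.path.join(tempfile.mkdtemp(), "pygam_model.pkl")
with open(artifact_path, "wb") as f:
    pickle.dump(TinyModel(), f)

# Step (c): the wrapper receives the file path via the context's
# `artifacts` mapping and unpickles the model in __init__.
class PyGAMModel:  # real code: class PyGAMModel(custom_model.CustomModel)
    def __init__(self, context):
        with open(context["artifacts"]["pygam_model"], "rb") as f:
            self.model = pickle.load(f)

    def predict(self, xs):
        return self.model.predict(xs)

# Plain dict stands in for custom_model.ModelContext(artifacts={...}).
context = {"artifacts": {"pygam_model": artifact_path}}
wrapper = PyGAMModel(context)
print(wrapper.predict([1, 2, 3]))  # [2, 4, 6]
```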
As you can see, there are a lot of caveats. As an alternative to the warehouse, you can run the model in SPCS; see https://docs.snowflake.com/en/developer-guide/snowflake-ml/model-registry/container . On SPCS, pip is supported natively, so you do not need to take care of (1), and (b) reduces to serializing the model itself (either via pickle or a native save/load API, if pygam supports one).
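For the SPCS route, the call would look roughly like the sketch below. This is hedged: the exact parameter set depends on your snowflake-ml-python version, and since the real call needs a live Snowflake session, only the keyword arguments are built here (the `log_model` call itself is left commented out).

```python
# Hedged sketch: keyword arguments for Registry.log_model when targeting
# SPCS, where arbitrary PyPI packages such as pygam can be installed.
# Parameter availability depends on the snowflake-ml-python version.
log_model_kwargs = dict(
    model_name="pygam_model",
    version_name="v2",
    pip_requirements=["pygam"],  # arbitrary pip packages are allowed on SPCS
    target_platforms=["SNOWPARK_CONTAINER_SERVICES"],
)
# With a live session and the wrapped model from the issue above:
# registry.log_model(model=custom_pygam_model,
#                    sample_input_data=X_test,
#                    **log_model_kwargs)
print(sorted(log_model_kwargs))
```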