
Releases: zenml-io/zenml

0.71.0

06 Dec 08:21
f0d59c6

ZenML version 0.71.0 delivers a new Modal step operator integration as its core feature, enabling efficient cloud execution for ML pipelines with granular hardware configuration options. The release strengthens enterprise capabilities through improved token management and dashboard features while expanding artifact handling with dynamic naming and enhanced visualization support. Additionally, it includes various infrastructure improvements and bug fixes that enhance the platform's stability and usability, particularly around Docker connectivity, Kubernetes management, and service connector operations.

New Feature: Modal Step Operator Integration

ZenML now integrates with Modal, bringing fast cloud execution to your ML pipelines. This new step operator lets you execute individual pipeline steps on Modal's specialized compute instances, which is notably fast for Docker image building and hardware provisioning. With simple configuration options, you can specify hardware requirements such as GPU type, CPU count, and memory for each step, making it well suited to resource-intensive ML workloads.

New Feature: AWS Image Builder

Don't want to worry about Docker locally? You can now build images remotely in AWS. Docs: https://docs.zenml.io/stack-components/image-builders/aws

Other Highlights

  • Workload API Token Management: Refactored token management for improved security with a generic API token dispenser.
  • Dashboard Enhancements:
    • Introduced service account management capabilities.
    • Added API key creation and integration features.
  • Dynamic Artifact Naming: Introduced capability to dynamically name artifacts.
  • Visualization Enhancements: Made dictionaries and lists visualizable, added JSON visualization type.

Additional Features and Improvements

  • Improved error messages for Docker daemon connectivity
  • Enhanced SageMaker URL handling
  • Simplified model version artifact linkage
  • Added testing for pipeline templates
  • Improved Kubernetes pod and label length management
  • Allowed skipping type annotations for step inputs
  • Enabled using feature service instances instead of just names

Bug Fixes

  • Fixed issues with getting out of an inaccessible active stack
  • Fixed race conditions in the service connector type registry
  • Resolved migration test complications
  • Corrected documentation links
  • Fixed artifact store and artifact URI handling
  • Addressed various scalability and compatibility issues

Documentation Updates

  • Added documentation redirects
  • Updated PyTorch documentation links
  • Improved service connector documentation

Full Changelog: 0.70.0...0.71.0

0.70.0

12 Nov 17:36
496a0d5

The ZenML 0.70.0 release includes a significant number of database schema changes and migrations, which means upgrading to this version will require extra caution. As always, please make sure to make a copy of your production database before upgrading.

Key Changes

  • Artifact Versioning Improvements: Artifact version handling has been improved. API improvements include the ability to batch artifact version requests, which reduces execution times, and more types for step input/output artifacts, including multiple versions of the same artifact (e.g. model checkpoints). These changes improve the UX in the ZenML UI and when working directly with the API.
  • Scalability Enhancements: Various scalability improvements have been made, such as reducing unnecessary server requests and incrementing artifact versions server-side. These enhancements are expected to provide significant speed and scale improvements for ZenML users.
  • Metadata management: All metadata-creating functions are now gathered under one method called log_metadata. You can call this method with different inputs to log run metadata for artifact versions, model versions, steps, and runs.
  • The oneof filtering: A new operator called oneof lets you filter entities by a set of values. You can use it with IDs (UUID type) or tags (or other string-typed attributes), for example PipelineRunFilter(tag='oneof:["cats", "dogs"]').
  • Documentation Improvements: The ZenML documentation has been restructured and expanded, including the addition of new sections on finetuning and LLM/ML engineering resources.
  • Bug Fixes: This release includes several bug fixes, among them a fix for in-process main module source loading.
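
The oneof filter value shown in the list above is the prefix oneof: followed by a JSON-style list. As a minimal sketch, a string of that shape can be built with the standard library. The helper name build_oneof_filter is hypothetical and not part of ZenML; only the output format is taken from the release note.

```python
import json
from typing import Iterable
from uuid import UUID


def build_oneof_filter(values: Iterable[object]) -> str:
    """Return a filter string of the form 'oneof:[...]' (hypothetical helper)."""
    # UUIDs and other non-string values are converted to strings so that
    # json.dumps emits a list of double-quoted items.
    return "oneof:" + json.dumps([str(v) for v in values])


tag_filter = build_oneof_filter(["cats", "dogs"])
print(tag_filter)  # oneof:["cats", "dogs"]
```

The resulting string could then be passed where the release note uses a literal, e.g. PipelineRunFilter(tag=tag_filter).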

Caution: Make sure to back up your data before upgrading!

While this release brings many valuable improvements, the database schema changes and migrations pose a potential risk to users. It is strongly recommended that users:

  • Test the upgrade on a non-production environment: Before upgrading a production system, test the upgrade process in a non-production environment to identify and address any issues.
  • Back up your data: Ensure that you have a reliable backup of your ZenML data before attempting the upgrade.

Full Changelog: 0.68.1...0.70.0

0.68.1

28 Oct 14:03
2a2c154

Bug fixes

Fixes an issue with some partially cached pipelines running on remote orchestrators.

What's Changed

  • Remove unavailable upstream steps during cache precomputation by @schustmi in #3146

Full Changelog: 0.68.0...0.68.1

0.68.0

25 Oct 00:02
c8c3b12

Highlights

  • Stack Components on the Dashboard: We're bringing back stack components. With this release, you will get access to the list of your stack components on the ZenML dashboard. More functionality is going to follow in the next releases.
  • Client-Side Caching: Implemented client-side computation for cached steps, significantly reducing time and costs associated with remote orchestrator spin-up.
  • Streamlined Onboarding Process: Unified the starter and production setup into a single sequential flow, providing a more intuitive user experience.
  • BentoML Integration: Updated to version 1.3.5 with enhanced containerization support.
  • Artifact Management: Introduced register_artifact function enabling direct linking of existing data in the artifact store, particularly useful for tools like PyTorch-Lightning that manage their own checkpoints.
  • Enhanced Error Handling: Added Error Boundary to visualization components for improved reliability and user experience.

Additional Features and Improvements

  • Added multiple access points for deleting pipeline runs
  • Improved pipeline detail view functionality
  • Improved service account handling for Kaniko image builder

Breaking Changes and Deprecations

  • Discontinued Python 3.8 support
  • Removed legacy pipeline and step interface
  • Removed legacy post execution workflow
  • Removed legacy dashboard option
  • Removed zenml stack up/down CLI commands
  • Removed zenml deploy and zenml <stack-component> deploy
  • Removed StepEnvironment class
  • Removed the option to specify a specific model version for step output artifacts using the ArtifactConfig class
  • Removed the option to use the ExternalArtifact class to load an artifact from a model version
  • Removed Client.list_runs, replacing it with Client.list_pipeline_runs
  • Removed ArtifactVersionResponse.read, replacing it with ArtifactVersionResponse.load

Documentation Updates

Added new guides for the following topics:

  • Kubernetes per-pod configuration
  • Factory generation of artifact names
  • Common stacks best practices
  • Azure 1-click dashboard deployment
  • ZenML server upgrade best practices
  • Custom Dataset classes and Materializers
  • Comprehensive ZenML Pro documentation
  • Image building optimization during pipeline runs
  • Enhanced BentoML integration documentation

Full Changelog: 0.67.0...0.68.0

0.67.0

30 Sep 16:22
4e5ac88

Highlights

  • Improved SageMaker Orchestrator: Now supports warm pools for AWS SageMaker, enhancing performance and reducing startup times for TrainingJobs.
  • New DAG Visualizer: Shipped major enhancements to the DAG Visualizer for Pipeline Runs:
    • Preview of the actual DAG before pipeline completion
    • Visual adjustments for improved clarity
    • Real-time updates during pipeline execution
  • Environment Variable References in Configurations: Introduced the ability to reference environment variables in both code and configuration files using the syntax ${ENV_VARIABLE_NAME}, increasing flexibility in setups.
  • Enhanced UX for Major Cloud Providers: Displaying direct pipeline/log URL when working with major cloud platforms.
  • Skypilot with Kubernetes Support: Added compatibility for running Skypilot orchestrator on Kubernetes clusters.
  • Updated Deepchecks Integration: The Deepchecks integration has been refreshed with the latest features and improvements.
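
To illustrate the ${ENV_VARIABLE_NAME} reference syntax mentioned in the highlights above, here is a minimal sketch of how such references can be expanded with the standard library. It is not ZenML's implementation; the function name expand_env_refs, the variable EXAMPLE_BUCKET, and the choice to raise on unset variables are all assumptions made for the example.

```python
import os
import re

# Matches ${NAME}, where NAME looks like a typical environment variable name.
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env_refs(value: str) -> str:
    """Replace every ${NAME} in value with os.environ[NAME] (illustrative only)."""

    def _lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in os.environ:
            # Assumption: fail loudly instead of leaving the reference in place.
            raise KeyError(f"Environment variable {name!r} is not set")
        return os.environ[name]

    return _ENV_REF.sub(_lookup, value)


os.environ["EXAMPLE_BUCKET"] = "my-bucket"  # hypothetical variable for the demo
print(expand_env_refs("s3://${EXAMPLE_BUCKET}/artifacts"))
```

Strings without any reference pass through unchanged, so the same function can be applied to every configuration value.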

Features and Improvements

  • AWS Integration:
    • Added permissions to workflow to enable assuming AWS role.
    • Fixed expired credentials error when using the docker service connector.
  • Error Handling: Improved error messages for stack components of uninstalled integrations.
  • API Key Management: Added an option to write API keys to a file instead of using the CLI.
  • Pipeline Execution:
    • Implemented fixes for executing steps as single step pipelines.
    • Added filter option for templatable runs.
    • Added additional filtering options for pipeline runs.
  • MLflow Integration: Linked registered models in MLflow with the corresponding MLflow run.
  • Analytics: Added missing analytics event to improve user insights.

Documentation Updates

  • Updated documentation for various integrations including:
    • Lightning AI orchestrator
    • Kubeflow
    • Comet experiment tracker
    • Neptune
    • Hugging Face deployer
    • Weights & Biases (wandb)
  • Added documentation for run templates.
  • Fixed incorrect method name in Pigeon docs.
  • Various small documentation fixes and improvements.

Bug Fixes

  • Fixed YAML formatting issues.
  • Resolved RBAC issues for subpages in response models.
  • Fixed step output annotation in Discord test.
  • Addressed MLflow integration requirements duplication.
  • Fixed Lightning orchestrator functionality.

Full Changelog: 0.66.0...0.67.0

0.66.0

09 Sep 18:32
e4d3fb6

New Features and Improvements

Python 3.12 support

This release adds support for Python 3.12, which means you can now develop your ZenML pipelines
with the latest Python features.

Easier way to specify component settings

Before this release, settings for stack components had to be specified with both the component type
as well as the flavor. We simplified this and it is now possible to specify settings just using the
component type:

# Before
@pipeline(settings={"orchestrator.sagemaker": SagemakerOrchestratorSettings(...)})
def my_pipeline():
  ...

# Now
@pipeline(settings={"orchestrator": SagemakerOrchestratorSettings(...)})
def my_pipeline():
  ...

Breaking changes

  • To slim down the ZenML library, we removed the numpy and pandas libraries as dependencies of ZenML. If your
    code uses these libraries, make sure they are installed in your local environment as well as in the Docker images that
    get built to run your pipelines (use DockerSettings.requirements or DockerSettings.required_integrations).
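
Since numpy and pandas no longer arrive with ZenML, a quick local check can show whether they are still importable in the current environment. The sketch below uses only the standard library. The helper name missing_packages is hypothetical, and it only covers the local environment, not the Docker images mentioned above.

```python
from importlib.util import find_spec
from typing import List, Sequence


def missing_packages(names: Sequence[str]) -> List[str]:
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(["numpy", "pandas"])
if missing:
    print("Install before running pipelines:", ", ".join(missing))
```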

Full Changelog: 0.65.0...0.66.0

0.65.0

28 Aug 20:01
0b3294e

Important note for OSS users

In the latest release, the onboarding flow labels the command for the first pipeline as python run.py --training-pipeline, but it should simply be python run.py. If you see an error while trying to complete the onboarding, use the latter command instead!

New Features and Improvements

New Quickstart Experience

The new quickstart example demonstrates how ZenML streamlines the transition of machine learning workflows from local environments to cloud-scale operations.

Run Single Step as a ZenML Pipeline

If you want to run just an individual step on your stack, you can simply call the step as you would with a normal Python function. ZenML will internally create a pipeline with just your step and run it on the active stack.

Other improvements and fixes

  • Updated AzureML Step Operator to work with SDKv2 and use Service Connectors
  • Added timestamps to log messages
  • Fixed issue with loading artifacts from the artifact store outside of the current active artifact store
  • Added support for templated names for Model Version ({date} and {time} are currently supported placeholders)
  • The run_with_accelerate step wrapper can now be used as a Python decorator on top of ZenML steps
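
To make the {date} and {time} placeholders from the list above concrete, here is a minimal sketch of how such a template could be expanded. The function render_name and the timestamp formats are assumptions chosen for illustration; the release note does not specify the formats ZenML actually uses.

```python
from datetime import datetime
from typing import Optional


def render_name(template: str, now: Optional[datetime] = None) -> str:
    """Fill {date} and {time} in template (formats are assumed, not ZenML's)."""
    now = now or datetime.now()
    return template.format(
        date=now.strftime("%Y_%m_%d"),  # assumed date format
        time=now.strftime("%H_%M_%S"),  # assumed time format
    )


print(render_name("candidate_{date}_{time}", datetime(2024, 8, 28, 20, 1, 0)))
# candidate_2024_08_28_20_01_00
```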

Breaking changes

  • The workspace-scoped POST endpoint full-stack was removed and merged into the stacks POST endpoint
  • If you use a 0.65.0 client with any prior server version of ZenML, a Model Version might be created for every step of the pipeline, even though the Model class was configured only once at the pipeline level. This is expected behavior; in general, you should not use mismatching versions of the ZenML client and server.
    Minimal example:
from zenml import Model, pipeline, step


@step
def step_1() -> None:
    print("1")


@step
def step_2() -> None:
    print("2")


# The Model is configured once, at the pipeline level.
@pipeline(model=Model(name="my_model"))
def my_pipeline() -> None:
    step_1()
    step_2()


if __name__ == "__main__":
    my_pipeline()

In this case, server versions prior to 0.65.0 create two Model Versions of my_model: one in step_1 and one in step_2. Upgrade your server to 0.65.0+ to get only one Model Version for the same code snippet.

Full Changelog: 0.64.0...0.65.0

0.64.0

08 Aug 19:03
835dea1

New Features and Improvements

Notebook Integration

ZenML now supports running steps defined in notebook cells with remote orchestrators and step operators. This feature enhances the development workflow by allowing seamless transition from experimentation to production.

Reduced Docker Builds with Code Uploads

We've introduced an option to upload code to the artifact store, enabling Docker build reuse. This feature can significantly speed up iteration, especially when working with remote stacks.

  • Default: Enabled
  • Configuration: To disable, set DockerSettings.allow_download_from_artifact_store=False for steps or pipelines
  • Benefits:
    • Faster development cycles
    • No need to register a code repository to reuse builds
    • Builds only occur when requirements or DockerSettings change
  • Documentation: Which files are built into the image

AzureML Orchestrator Support

ZenML now supports AzureML as an orchestrator, expanding our list of supported cloud platforms.

Terraform Modules

We've released new Terraform modules on the Hashicorp registry for provisioning complete MLOps stacks across major cloud providers.

  • Features:
    • Automate infrastructure setup for ZenML stack deployment
    • Handle registration of configurations to ZenML server
  • More Information: MLOps Terraform ZenML blog post

These updates aim to streamline the MLOps workflow, making it easier to develop, deploy, and manage machine learning pipelines with ZenML.

🥳 Community Contributions 🥳

We'd like to give special thanks to @christianversloot, who contributed to this release by bumping the mlflow version to 2.15.0.

Full Changelog: 0.63.0...0.64.0

0.63.0

30 Jul 11:14
430953f

Moving forward from the last two releases, we have further improved the 1-click deployment tool and the stack wizard by adding support for Azure.

Moreover, we implemented a new step operator that allows you to run individual steps of your pipeline in Kubernetes pods.

Lastly, we have simplified our pipeline models by removing their versions. No migration is required, but if you were using the API, please remove references to pipeline versions and reference runs directly from now on.

Full Changelog: 0.62.0...0.63.0

0.62.0

16 Jul 15:44
00e7200

Building on top of the last release, this release adds a new and easy way to deploy a GCP ZenML stack from the dashboard and the CLI. Give it a try by going to the Stacks section in the dashboard or running the zenml stack deploy command! For more information on this new feature, please do check out the video and blog from our previous release.

We also updated our Hugging Face integration to support the automatic display of an embedded datasets preview pane in the ZenML Dashboard whenever you return a Dataset from a step. This was recently released by the Hugging Face datasets team and it allows you to easily visualize and inspect your data from the comfort of the dashboard.

Full Changelog: 0.61.0...0.62.0