Releases: Netflix/metaflow

2.12.0

28 May 09:09
5f57997

Features

Support running flows in notebooks and through Python scripts

This release introduces a new Runner API that makes it simple to run flows inside Notebooks, or as part of Python code.

Read the blog post on the feature, or dive straight into the documentation to start using it.
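As a minimal sketch of what this can look like in a script or notebook cell, the helper below wraps the Runner API. The helper name run_flow and the flow file name are hypothetical, and the import is deferred so that the helper can be defined even where Metaflow is not installed.

```python
def run_flow(flow_file, **params):
    """Run a flow file to completion and return the resulting Run object.

    Hypothetical helper for illustration: `flow_file` is a path to a flow
    such as "example.py", and `params` are passed on as flow parameters.
    """
    # Imported lazily so that defining this helper does not require Metaflow.
    from metaflow import Runner

    # Runner(...).run(...) blocks until the run finishes. The returned
    # object works as a context manager and exposes the Run and its status.
    with Runner(flow_file).run(**params) as running:
        print(f"{running.run} finished with status {running.status}")
        return running.run
```

Calling run_flow("example.py") from a notebook would then start the flow and hand back a Run object for inspecting results with the client API.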

What's Changed

Full Changelog: 2.11.16...2.12.0

2.11.16

22 May 00:07
0654b85

Features

Support GCP Secret Manager

This release adds support for using GCP Secret Manager to supply secret values to a step's environment.

To enable it, set METAFLOW_DEFAULT_SECRETS_BACKEND_TYPE to gcp-secret-manager, or specify the type in the decorator:

@secrets(sources=[{"type": "gcp-secret-manager", "id": "some-secret-key"}])

Set METAFLOW_GCP_SECRET_MANAGER_PREFIX to avoid writing out full secret locations.

Support Azure Key Vault

This release also adds support for Azure Key Vault as a secrets backend. To use it, specify az-key-vault as the secrets backend type.

As with the other secret managers, a prefix setting is available so that common parts of secret keys do not have to be repeated. Configure it by setting METAFLOW_AZURE_KEY_VAULT_PREFIX.

Note: Currently only Secret object types are supported when using Azure Key Vault.

@parallel for Kubernetes

This release adds support for @parallel when flows are run --with kubernetes.

Example:

@step
def start(self):
    self.next(self.parallel_step, num_parallel=3)

@kubernetes(cpu=1, memory=512)
@parallel
@step
def parallel_step(self):
    ...

Configurable runtime limits

It is now possible to configure the default timeout for the @timeout decorator. This can be done by setting METAFLOW_DEFAULT_RUNTIME_LIMIT in the environment, or in a config.json.
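The sketch below shows one way to prepare an environment that carries this setting for a child process. The helper name env_with_runtime_limit is hypothetical, and the assumption that the limit is expressed in seconds is not stated in these notes, so check the documentation before relying on it.

```python
import os


def env_with_runtime_limit(limit_seconds, base_env=None):
    """Return a copy of the environment with a default runtime limit set.

    Assumption (not confirmed by the release notes): the value is in seconds.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Environment variable values must be strings.
    env["METAFLOW_DEFAULT_RUNTIME_LIMIT"] = str(int(limit_seconds))
    return env


# Two hours, starting from an empty base environment for clarity.
env = env_with_runtime_limit(2 * 60 * 60, base_env={})
print(env)  # {'METAFLOW_DEFAULT_RUNTIME_LIMIT': '7200'}
```

The resulting dictionary can be passed as the env argument of subprocess.run when launching a flow from Python.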

Improvements

Resumed flows should record task completions correctly

Fixes an issue where tasks that were cloned from a previous run by resume would not show up as completed on the Metaflow UI due to missing metadata.

Fix accessing task index of a foreach task

There was an issue accessing the index of a foreach task via the client. With this release, the following works:

from metaflow import Task
task = Task("ForeachFlow/123/foreach_step/task-00000000")
task.index

What's Changed

Full Changelog: 2.11.15...2.11.16

2.11.15

08 May 22:22
8b026c2

Features

Displaying task attempt logs

When running a task with the @retry decorator, it was previously possible to view only the logs of the latest attempt.
With this release, a specific attempt can be targeted with the --attempt option of the logs command:

python example.py logs 123/retry_step/456 --attempt 2

Scrubbing log contents

This release introduces a new command for scrubbing log contents of tasks in case they contain sensitive information that needs to be redacted.

The simplest use case is scrubbing the latest task logs. By default, both stdout and stderr are scrubbed:

python example.py logs scrub 123/example/456

There are also options to target only a specific log stream:

python example.py logs scrub 123/example/456 --stderr
python example.py logs scrub 123/example/456 --stdout

When using the @retry decorator, tasks can have multiple attempts with separate logs that require scrubbing. By default, only the latest attempt is scrubbed. The following options make it easier to scrub multiple attempts:

# scrub specific attempt
python example.py logs scrub 123/retry_step/456 --attempt 1

# scrub all attempts
python example.py logs scrub 123/retry_step/456 --all

# scrub specified attempt and all prior to it (this would scrub attempts 0,1,2,3)
python example.py logs scrub 123/retry_step/456 --all --attempt 3
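The attempt-selection rules above can be summarized in a few lines of Python. This is only an illustration of the documented behavior, not Metaflow's implementation; the function name attempts_to_scrub and its parameters are hypothetical.

```python
def attempts_to_scrub(latest_attempt, attempt=None, scrub_all=False):
    """Return the attempt numbers selected by the flags described above.

    Hypothetical helper for illustration only; attempts are numbered from 0.
    """
    if scrub_all:
        # --all alone covers every attempt; with --attempt N it stops at N.
        upper = latest_attempt if attempt is None else attempt
        return list(range(upper + 1))
    if attempt is not None:
        # --attempt N targets exactly one attempt.
        return [attempt]
    # Default: only the latest attempt is scrubbed.
    return [latest_attempt]


# A task whose latest attempt is number 4 (five attempts in total).
print(attempts_to_scrub(4))                             # [4]
print(attempts_to_scrub(4, attempt=1))                  # [1]
print(attempts_to_scrub(4, scrub_all=True))             # [0, 1, 2, 3, 4]
print(attempts_to_scrub(4, attempt=3, scrub_all=True))  # [0, 1, 2, 3]
```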

The command also accepts a step alone as the target. This is useful for steps with multiple tasks, like a foreach split.

python example.py logs scrub 123/foreach_step

All of the above options also apply when targeting a step for scrubbing.

Note: Log scrubbing for running tasks is not recommended, and is actively protected against. There can be occasions where a task has failed in such a way that it still counts as not completed. For such a case you can supply the --include-not-done option to try and scrub it as well.

What's Changed

Full Changelog: 2.11.14...2.11.15

2.11.14

06 May 23:29
e4d96b4

What's Changed

Full Changelog: 2.11.13...2.11.14

2.11.13

06 May 21:57
5e05540

Features

Configurable default Kubernetes resources

This release introduces configuration options for setting default values for cpu / memory / disk when running on Kubernetes. These can be set either with environment variables:

METAFLOW_KUBERNETES_CPU=
METAFLOW_KUBERNETES_MEMORY=
METAFLOW_KUBERNETES_DISK=

or in a Metaflow profile:

{
  "KUBERNETES_CPU": "",
  "KUBERNETES_MEMORY": "",
  "KUBERNETES_DISK": ""
}

These values are overridden by any value specified through the @kubernetes or @resources decorators.
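The precedence described here can be sketched as follows. The helper effective_k8s_resource is hypothetical and only mirrors the rule stated above (a decorator value wins over a configured default); it does not reproduce how Metaflow combines @kubernetes and @resources internally.

```python
import os


def effective_k8s_resource(name, decorator_value=None, builtin_default=None, environ=None):
    """Pick the value for one resource ("cpu", "memory" or "disk").

    Hypothetical helper: a decorator value takes precedence, then the
    METAFLOW_KUBERNETES_* environment variable, then a built-in default.
    """
    environ = os.environ if environ is None else environ
    if decorator_value is not None:
        # An explicit decorator value always wins.
        return str(decorator_value)
    return environ.get(f"METAFLOW_KUBERNETES_{name.upper()}", builtin_default)


configured = {"METAFLOW_KUBERNETES_CPU": "2"}
print(effective_k8s_resource("cpu", environ=configured))                     # 2
print(effective_k8s_resource("cpu", decorator_value=4, environ=configured))  # 4
print(effective_k8s_resource("disk", builtin_default="10240", environ=configured))  # 10240
```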

Improvements

Support for wider foreach flows with Argo Workflows

This release changes the way task ids are generated on Argo Workflows in order to solve an issue where extremely wide foreach splits could not execute correctly due to hard limits on input parameter size on Argo Workflows.

What's Changed

Full Changelog: 2.11.12...2.11.13

2.11.12

03 May 19:59
9989bc6

What's Changed

  • Fix: JSON Reference Path Error in AWS Step Functions Distributed Map by @nidhinnru in #1822
  • fix import of the new escape hatch flag by @wangchy27 in #1823

Full Changelog: 2.11.11...2.11.12

2.11.11

02 May 21:39
302f057

What's Changed

Full Changelog: 2.11.10...2.11.11

2.11.10

11 Apr 21:08
5908c4e

Improvements

Argo Events trigger improvements for parameters with default values

This release fixes an issue where partial or empty Argo Events payloads would incorrectly overwrite the default values for the parameters of a triggered flow.

For example, a flow with

from metaflow import FlowSpec, Parameter, trigger

@trigger(events=["params_event"])
class DefaultParamEventFlow(FlowSpec):

    param_a = Parameter(
        name="param_a",
        default="default value A",
    )

    param_b = Parameter(
        name="param_b",
        default="default value B",
    )

will now correctly have the default values for its parameters when triggered by

from metaflow.integrations import ArgoEvent
ArgoEvent('params_event').publish()

or a default value for param_b and the supplied value for param_a when triggered by

ArgoEvent('params_event').publish({"param_a": "custom-value"})
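The corrected behavior amounts to overlaying the event payload on the parameter defaults. The sketch below illustrates that rule with plain dictionaries; resolve_parameters is a hypothetical helper, not Metaflow's implementation, and ignoring payload keys that do not match a parameter is an assumption made for the example.

```python
def resolve_parameters(defaults, payload=None):
    """Overlay an event payload on parameter defaults.

    Hypothetical helper: parameters missing from the payload keep their
    defaults. Payload keys that match no parameter are ignored (assumption).
    """
    resolved = dict(defaults)
    for name, value in (payload or {}).items():
        if name in defaults:
            resolved[name] = value
    return resolved


defaults = {"param_a": "default value A", "param_b": "default value B"}

# Empty payload: every parameter keeps its default.
print(resolve_parameters(defaults))
# {'param_a': 'default value A', 'param_b': 'default value B'}

# Partial payload: only param_a is replaced.
print(resolve_parameters(defaults, {"param_a": "custom-value"}))
# {'param_a': 'custom-value', 'param_b': 'default value B'}
```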

What's Changed

Full Changelog: 2.11.9...2.11.10

2.11.9

29 Mar 18:23
bae5c10

What's Changed

Full Changelog: 2.11.8...2.11.9

2.11.8

29 Mar 10:22
e577781

What's Changed

Full Changelog: 2.11.7...2.11.8