Releases: Netflix/metaflow
2.9.15
Improvements
Improve the performance of parallel_map
We now check for processes in the order in which they complete, not in the order in which they were launched. This also increases the likelihood of failing fast.
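The difference between the two polling orders can be illustrated with the standard library. This is a minimal sketch using threads and `concurrent.futures.as_completed`, not Metaflow's actual `parallel_map` implementation; the `work` function and its delays are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def work(item):
    """Hypothetical task: sleep for `delay` seconds, then return the name."""
    name, delay = item
    time.sleep(delay)
    return name


def completion_order(items):
    """Collect results as tasks finish, not in the order they were launched."""
    with ThreadPoolExecutor(max_workers=len(items)) as pool:
        futures = [pool.submit(work, item) for item in items]
        # as_completed yields each future as soon as it is done, so an
        # exception raised by a fast task surfaces before slow tasks finish.
        return [f.result() for f in as_completed(futures)]


if __name__ == "__main__":
    # The slow task is launched first, but the fast one is reported first.
    print(completion_order([("slow", 0.3), ("fast", 0.05)]))
```

Waiting in launch order would block on the slow task before ever looking at the fast one, which is why completion-order checks tend to surface failures sooner.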
Fix issues with the environment escape mechanism
Deadlocks and errors could occur when using the environment escape mechanism in two cases: (a) when garbage collection ran at an inopportune moment, or (b) when subprocesses were involved. Both issues are now fixed.
What's Changed
- Fix two possible issues with the environment escape communication by @romain-intel in #1555
- Check for the first finished parallel process by @maxzheng in #1546
Full Changelog: 2.9.14...2.9.15
2.9.14
Improvements
Fixes merging of log lines
This release fixes an issue with merging broken log lines.
Fix issue with using LD_LIBRARY_PATH with Conda environments
In a Conda environment, it is sometimes necessary to set LD_LIBRARY_PATH so that the Conda environment's libraries come before anything else. Prior to this release, doing so caused issues with the escape hatch.
What's Changed
- fix: GPU scheduling issue by @saikonen in #1520
- vend packaging by @savingoyal in #1506
- fix: vendor older version of packaging for python 3.5 by @saikonen in #1530
- Fix an issue when LD_LIBRARY_PATH is overridden in a Conda environment by @romain-intel in #1540
- Fix the merging of borked lines with MFLOG by @romain-intel in #1533
- Revert "fix: GPU scheduling issue" by @savingoyal in #1542
- Bump version to 2.9.14 by @saikonen in #1543
Full Changelog: 2.9.13...2.9.14
2.9.13
Bug fix
Revert annotations changes to fix a regression
The recent annotations feature introduced an issue where project, flow_name or user annotations are not being populated for Kubernetes. This release reverts the changes.
What's Changed
- Revert "Adds custom annotations via env variables" by @savingoyal in #1516
- Bump to 2.9.13 by @savingoyal in #1517
Full Changelog: 2.9.12...2.9.13
2.9.12
Known issues
The annotations feature introduced in this release has an issue where project, flow_name or user annotations are not being populated for Kubernetes. The feature was reverted in the next release (2.9.13).
Features
Custom annotations for K8S and Argo Workflows
This release enables users to add custom annotations to the Kubernetes resources that flows create. The annotations can be configured in much the same way as custom labels:
- Globally, with an environment variable. For example:
export METAFLOW_KUBERNETES_ANNOTATIONS="first=A,second=B"
- At a step level by passing a dictionary to the Kubernetes decorator.
@kubernetes(annotations={"first": "A", "second": "B"})
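The two forms above express the same key/value pairs. As a minimal sketch of that correspondence, the hypothetical helper below turns the environment-variable string into the dictionary form. It assumes a plain comma-separated `key=value` format without escaping and is not Metaflow's own parser.

```python
def parse_annotations(raw):
    """Turn 'k1=v1,k2=v2' into {'k1': 'v1', 'k2': 'v2'} (hypothetical helper)."""
    annotations = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty input and trailing commas
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError("expected key=value, got %r" % pair)
        annotations[key.strip()] = value.strip()
    return annotations


if __name__ == "__main__":
    print(parse_annotations("first=A,second=B"))
```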
What's Changed
- Adds custom annotations via env variables by @tylerpotts in #1442
- Pass the user-defined executable to environment's executable by @romain-intel in #1454
- Remove validate_environment from task lifecycle by @savingoyal in #1507
- Fix/863 - Improve error message in metaflow.S3 class when DATATOOLS_S3ROOT is not configured. by @tfurmston in #1491
- Fix an issue where 0 was not considered False for extension debug opt… by @romain-intel in #1511
- Bump version to 2.9.12 by @saikonen in #1514
Full Changelog: 2.9.11...2.9.12
2.9.11
Bug Fix
Fix regression for @batch decorator introduced by v2.9.10
This release reverts a validation fix introduced in 2.9.10, which prevented Metaflow tasks from executing on AWS Batch.
What's Changed
- Revert "fix: validate required configuration for Batch" by @savingoyal in #1486
- Bump version to 2.9.11 by @savingoyal in #1487
Full Changelog: 2.9.10...2.9.11
2.9.10
Features
Introduce PagerDuty support for workflows running on Argo Workflows
With this release, Metaflow users can get events on PagerDuty when their workflows succeed or fail on Argo Workflows.
Setting up the notifications is similar to the existing Slack notifications support:
- Follow these instructions on PagerDuty to set up an Events API V2 integration for your PagerDuty service
- You should be able to view the required integration key from the Events API V2 dropdown
- To enable notifications on PagerDuty when your Metaflow flow running on Argo Workflows succeeds or fails, deploy it using the --notify-on-error or --notify-on-success flags:
python flow.py argo-workflows create --notify-on-error --notify-on-success --notify-pager-duty-integration-key <pager-duty-integration-key>
- You can also set the following environment variable instead of specifying --notify-pager-duty-integration-key on the CLI every time
METAFLOW_ARGO_WORKFLOWS_CREATE_NOTIFY_PAGER_DUTY_INTEGRATION_KEY=<pager-duty-integration-key>
- Next time the flow fails or succeeds, you should receive a new event on PagerDuty under Incidents (Flow failed) or Changes (Flow succeeded)
What's Changed
- fix: validate required configuration for Batch by @saikonen in #1483
- feature: add PagerDuty support for Argo Workflows by @saikonen in #1478
- Bump version to 2.9.10 by @saikonen in #1484
Full Changelog: 2.9.9...2.9.10
2.9.9
Improvements
Fixes a bug with the S3 operations affecting @conda with some S3 providers
This release fixes a bug with the @conda bootstrapping process. There was an issue with the ServerSideEncryption support that affected some of the S3 operations when using S3 providers that do not implement the encryption headers (for example, MinIO).
The affected operations were all those that handle multiple files at once:
get_many / get_all / get_recursive / put_many / info_many
These are used as part of bootstrapping a @conda environment when executing remotely.
What's Changed
- fix: s3 op bug with ServerSideEncryption by @saikonen in #1479
- Bump version to 2.9.9 by @saikonen in #1480
Full Changelog: 2.9.8...2.9.9
2.9.8
Improvements
Fixes bug with Argo events parameters
This release fixes an issue with mapping values with spaces from the Argo events payload to flow parameters.
What's Changed
- sanitize / in secret names by @oavdeev in #1470
- chore: upgrade packages in cards plugin by @saikonen in #1473
- fix: Argo events parameters with spaces by @saikonen in #1475
- allow to customize env var name in @secrets by @oavdeev in #1474
- Bump version to 2.9.8 by @saikonen in #1476
Full Changelog: 2.9.7...2.9.8
2.9.7
Features
New commands for managing Argo Workflows through the CLI
This release includes new commands for managing workflows on Argo Workflows.
When needed, commands can be authorized by supplying a production token with --authorize.
argo-workflows delete
A deployed workflow can be deleted through the CLI with
python flow.py argo-workflows delete
argo-workflows terminate
A run can be terminated mid-execution through the CLI with
python flow.py argo-workflows terminate RUN_ID
argo-workflows suspend/unsuspend
A run can be suspended temporarily with
python flow.py argo-workflows suspend RUN_ID
Note that the suspended flow will show up as failed on Metaflow-UI after a period, because suspending the run also suspends the heartbeat process. Unsuspending will resume the flow, and its status will show as running again. This can be done with
python flow.py argo-workflows unsuspend RUN_ID
Improvements
Faster Job completion checks for Kubernetes
Previously the status for tasks running on Kubernetes was determined through the pod status, which can take a while to update after the last container finishes. This release changes the status checks to use container statuses directly instead.
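The idea of reading container statuses directly can be sketched as follows. This is an illustration only, not the code from the pull request: `job_finished` is a hypothetical helper, and its input is assumed to mirror the `containerStatuses` entries of a Kubernetes pod status, where a finished container carries a `terminated` state with an `exitCode`.

```python
def job_finished(container_statuses):
    """Return (done, succeeded) from a list of container status dicts.

    Each dict is assumed to look like a Kubernetes containerStatuses entry,
    e.g. {"name": "main", "state": {"terminated": {"exitCode": 0}}}.
    """
    if not container_statuses:
        return False, False  # nothing has been reported yet
    exit_codes = []
    for status in container_statuses:
        terminated = (status.get("state") or {}).get("terminated")
        if terminated is None:
            return False, False  # a container is still waiting or running
        # Treat a missing exit code as a failure (assumption of this sketch).
        exit_codes.append(terminated.get("exitCode", 1))
    return True, all(code == 0 for code in exit_codes)


if __name__ == "__main__":
    print(job_finished([{"name": "main", "state": {"terminated": {"exitCode": 0}}}]))
```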
What's Changed
- Job completion check based on container status. by @shrinandj in #1369
- feature: add argo workflows suspend command by @saikonen in #1420
- feature: add delete and terminate for argo workflows by @saikonen in #1307
- Bump version to 2.9.7 by @saikonen in #1467
Full Changelog: 2.9.6...2.9.7
2.9.6
Features
AWS Step Function state machines can now be deleted through the CLI
This release introduces the command step-functions delete for deleting state machines through the CLI.
For a regular flow
python flow.py step-functions delete
For another user's project branch
Comment out the @project decorator in the flow file, as we do not allow using --name with projects.
python project_flow.py step-functions --name project_a.user.saikonen.ProjectFlow delete
For a production or custom branch flow
python project_flow.py --production step-functions delete
# or
python project_flow.py --branch custom step-functions delete
Add --authorize PRODUCTION_TOKEN to the command if you do not have the correct production token locally.
Improvements
Fixes a bug with the S3 server-side encryption feature with some S3-compliant providers
This release fixes an issue with the S3 server side encryption support, where some S3 compliant providers do not respond with the expected encryption method in the payload. This bug specifically affected regular operation when using MinIO.
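An optional check of this kind might look like the sketch below. It is a hypothetical illustration rather than Metaflow's implementation: `encryption_matches` is an invented name, and the response is assumed to be a dictionary shaped like a boto3 `head_object` result, in which the `ServerSideEncryption` key may be missing.

```python
def encryption_matches(response, expected=None):
    """Check the encryption reported in an S3-style response dict.

    The check is skipped when no encryption was requested or when the
    provider does not report an encryption method at all.
    """
    if expected is None:
        return True  # encryption was not requested, nothing to verify
    reported = response.get("ServerSideEncryption")
    if reported is None:
        return True  # provider is silent about encryption: skip the check
    return reported == expected


if __name__ == "__main__":
    # A provider that omits the field no longer causes a failure.
    print(encryption_matches({}, expected="AES256"))
```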
Fixes support for --with environment in Airflow
Fixes a bug with the Airflow support for environment variables, where the env values set in the environment decorator could get overwritten.
What's Changed
- [bugfix] support --with environment in Airflow by @valayDave in #1459
- feat: sfn delete workflow (with prod token validation and messaging) by @stevenhoelscher, @saikonen in #1379
- [bugfix]: Optional check for encryption in s3op response by @valayDave in #1460
- Bump version to 2.9.6 by @saikonen in #1461
Full Changelog: 2.9.5...2.9.6