new post / 2024-08-12-Azure-data-factory-foreach-activity.md
Showing 1 changed file with 31 additions and 0 deletions.
docs/posts/2024/2024-08-12-Azure-data-factory-foreach-activity.md
---
authors:
- copdips
categories:
- azure
- azure-data-factory
comments: true
date:
  created: 2024-08-12
---

# Azure Data Factory - ForEach Activity

The `ForEach` activity in Azure Data Factory has some important [limitations](https://learn.microsoft.com/en-us/azure/data-factory/control-flow-for-each-activity#limitations-and-workarounds).
One of them concerns `batch` mode: when the loop runs in `batch` mode, it is safer to embed only pipeline activities inside it.

<!-- more -->

## Problem

When you run the `ForEach` activity in `batch` mode over a list of items, and the loop body contains regular activities (not a pipeline activity), you might find that the same item is processed multiple times.
The [doc](https://learn.microsoft.com/en-us/azure/data-factory/control-flow-for-each-activity#limitations-and-workarounds) already says that `SetVariable` should not be used inside the ForEach activity, because it sets the variable at the pipeline level (the pipeline that hosts the ForEach activity), so the variable is shared by all the iterations.
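
As a rough illustration of the pattern the doc warns about, the fragment below sketches a ForEach running in `batch` mode whose body writes to a pipeline-level variable. The activity names, the `items` pipeline parameter, and the `current_item` variable are hypothetical and used only for illustration; this is a sketch of the general shape, not a definition taken from a real pipeline.

```json
{
  "name": "ForEachItem",
  "type": "ForEach",
  "typeProperties": {
    "isSequential": false,
    "items": {
      "value": "@pipeline().parameters.items",
      "type": "Expression"
    },
    "activities": [
      {
        "name": "SetCurrentItem",
        "type": "SetVariable",
        "typeProperties": {
          "variableName": "current_item",
          "value": {
            "value": "@string(item())",
            "type": "Expression"
          }
        }
      }
    ]
  }
}
```

Since `current_item` belongs to the hosting pipeline rather than to a single iteration, concurrent iterations all write to the same variable, and any later activity that reads it may see a value set by another iteration.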

But during my tests, I found that even when `SetVariable` is not used, the same item might still be processed multiple times in `batch` mode.

## Solution

There are two solutions to this problem:

1. If you want to keep `batch` mode, put only a pipeline activity inside the ForEach activity, and pass the item to that pipeline as a parameter. A pipeline is more or less like a function in programming.
2. Or set the ForEach activity to `sequential` mode. Everything then works as expected, but it is much slower.
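
The first solution could look roughly like the fragment below: a ForEach in `batch` mode whose only inner activity is an Execute Pipeline activity that receives the current item as a parameter. The names `ForEachItem`, `ProcessOneItem`, `process_item`, the `items` and `item` parameters, and the `batchCount` value are assumptions made for this sketch, not values from a real factory.

```json
{
  "name": "ForEachItem",
  "type": "ForEach",
  "typeProperties": {
    "isSequential": false,
    "batchCount": 10,
    "items": {
      "value": "@pipeline().parameters.items",
      "type": "Expression"
    },
    "activities": [
      {
        "name": "ProcessOneItem",
        "type": "ExecutePipeline",
        "typeProperties": {
          "pipeline": {
            "referenceName": "process_item",
            "type": "PipelineReference"
          },
          "waitOnCompletion": true,
          "parameters": {
            "item": {
              "value": "@item()",
              "type": "Expression"
            }
          }
        }
      }
    ]
  }
}
```

Each run of the hypothetical `process_item` pipeline gets its own `item` parameter, much like a function call gets its own arguments, so iterations do not share state. For the second solution, the same ForEach would instead set `isSequential` to `true`.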