Azure Data Factory Error — Substantial Concurrent MappingDataflow Executions

SqlInSix Tech Blog
2 min read · Mar 5, 2024


Solving “Type=Microsoft.DataTransfer.Execution.Core.ExecutionException,Message=There are substantial concurrent MappingDataflow executions which is causing failures due to throttling under Integration Runtime ‘AutoResolveIntegrationRuntime’.”

Error

Type=Microsoft.DataTransfer.Execution.Core.ExecutionException,Message=There are substantial concurrent MappingDataflow executions which is causing failures due to throttling under Integration Runtime ‘AutoResolveIntegrationRuntime’.

Background

Outside of any quota limit increase, Azure Data Factory has limits on the objects within a data factory, such as pipelines, datasets, linked services, and integration runtimes. For instance, as of this writing, Microsoft caps the total number of these objects at 5,000 per data factory. There is also granularity within each of these objects, such as a limit of 40 activities per pipeline.

There are also limits on concurrent pipeline activity runs, provided we’re not using a self-hosted integration runtime (SHIR). If we’re using the AutoResolveIntegrationRuntime within Azure and relying on its resources, we need to prepare for hitting this limit early by considering how we’ll scale if our activity runs begin to rise.
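One guardrail worth knowing here is the pipeline-level concurrency property, which caps how many runs of a single pipeline execute at once; extra run requests queue rather than piling onto the integration runtime. Below is a minimal sketch, assuming the azure-mgmt-datafactory Python SDK; the subscription, resource group, factory, and pipeline names are all placeholders.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import PipelineResource

# Placeholder names -- substitute your own subscription, resource group, and factory.
SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "my-rg"
FACTORY = "my-factory"

client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Fetch the existing pipeline, then cap it at 4 simultaneous runs.
# Additional run requests queue instead of adding concurrent executions.
existing = client.pipelines.get(RESOURCE_GROUP, FACTORY, "MyDataflowPipeline")
capped = PipelineResource(
    activities=existing.activities,
    parameters=existing.parameters,
    concurrency=4,  # max concurrent runs of this pipeline
)
client.pipelines.create_or_update(RESOURCE_GROUP, FACTORY, "MyDataflowPipeline", capped)
```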

Possible Solutions

  1. One solution, which may or may not be available to us, is to host our own integration runtime. A SHIR is not subject to this limit (the first sketch after this list shows how a SHIR might be registered).
  2. Functionally or horizontally scale our pipelines across multiple integration runtimes. We could scale our pipelines horizontally by calling them programmatically, provided we developed them with a code re-use model, or we could scale them functionally across multiple integration runtimes (think different parts of the business). In general, we want to develop our pipelines for re-use from the start for exactly this reason; if we have, this step is straightforward and lets us scale more easily. The first sketch after this list also shows how a dedicated data flow runtime might be provisioned for one part of the business.
  3. The quickest fix, though one that will still cause problems if our environment keeps growing, is to adjust the times of our pipelines: if we can schedule pipelines to execute at different times, this spreads the load. We may be running all of the same processes at the same time when we could delineate separate windows, such as an early-morning pipeline and an after-the-business-day-ends pipeline (the second sketch after this list staggers schedule triggers this way).
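For options 1 and 2 above, the integration runtimes themselves can be provisioned programmatically. The sketch below reuses the client and placeholder names from the earlier sketch and, still assuming the azure-mgmt-datafactory SDK, registers a self-hosted IR and then a dedicated Azure IR sized for data flows, so Execute Data Flow activities can point at it instead of AutoResolveIntegrationRuntime. The IR names and sizing values are illustrative, not prescriptive.

```python
from azure.mgmt.datafactory.models import (
    IntegrationRuntimeResource,
    SelfHostedIntegrationRuntime,
    ManagedIntegrationRuntime,
    IntegrationRuntimeComputeProperties,
    IntegrationRuntimeDataFlowProperties,
)

# Option 1: register a self-hosted integration runtime.
# The SHIR node software still has to be installed on our own machine(s).
shir = IntegrationRuntimeResource(
    properties=SelfHostedIntegrationRuntime(description="SHIR for on-prem movement")
)
client.integration_runtimes.create_or_update(RESOURCE_GROUP, FACTORY, "MySelfHostedIR", shir)
# These keys are used when registering the SHIR node with the factory.
keys = client.integration_runtimes.list_auth_keys(RESOURCE_GROUP, FACTORY, "MySelfHostedIR")

# Option 2: a dedicated Azure IR for one functional area's data flows,
# separate from AutoResolveIntegrationRuntime.
dataflow_ir = IntegrationRuntimeResource(
    properties=ManagedIntegrationRuntime(
        compute_properties=IntegrationRuntimeComputeProperties(
            location="East US",
            data_flow_properties=IntegrationRuntimeDataFlowProperties(
                compute_type="General",
                core_count=8,     # Spark cores per data flow run
                time_to_live=10,  # minutes to keep the cluster warm between runs
            ),
        )
    )
)
client.integration_runtimes.create_or_update(
    RESOURCE_GROUP, FACTORY, "DataflowIR-Sales", dataflow_ir
)
```

Each functional area (sales, finance, and so on) could get its own runtime this way, spreading concurrent data flow executions across runtimes instead of funneling them all through one.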
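For option 3, the staggered timing can be expressed as schedule triggers that fire at different hours. Here is a sketch under the same assumptions, with hypothetical MorningPipeline and EveningPipeline names:

```python
from datetime import datetime, timezone
from azure.mgmt.datafactory.models import (
    TriggerResource,
    ScheduleTrigger,
    ScheduleTriggerRecurrence,
    RecurrenceSchedule,
    TriggerPipelineReference,
    PipelineReference,
)

def daily_trigger(hour: int, pipeline_name: str) -> TriggerResource:
    """Build a trigger that runs the named pipeline once a day at the given UTC hour."""
    return TriggerResource(
        properties=ScheduleTrigger(
            recurrence=ScheduleTriggerRecurrence(
                frequency="Day",
                interval=1,
                start_time=datetime(2024, 3, 5, tzinfo=timezone.utc),
                time_zone="UTC",
                schedule=RecurrenceSchedule(hours=[hour], minutes=[0]),
            ),
            pipelines=[
                TriggerPipelineReference(
                    pipeline_reference=PipelineReference(reference_name=pipeline_name)
                )
            ],
        )
    )

# Early-morning load vs. after-the-business-day load (placeholder pipeline names).
client.triggers.create_or_update(RESOURCE_GROUP, FACTORY, "MorningTrigger",
                                 daily_trigger(5, "MorningPipeline"))
client.triggers.create_or_update(RESOURCE_GROUP, FACTORY, "EveningTrigger",
                                 daily_trigger(19, "EveningPipeline"))
# begin_start is the long-running-operation variant in recent SDK versions.
client.triggers.begin_start(RESOURCE_GROUP, FACTORY, "MorningTrigger")
client.triggers.begin_start(RESOURCE_GROUP, FACTORY, "EveningTrigger")
```

Splitting the day this way keeps the two workloads out of the same throttling window, while leaving room to delineate more windows later if the environment grows.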

Note: all images in the post are either created through actual code runs or from Pixabay. The written content of this post is copyrighted; all rights reserved. None of the written content in this post may be used in any artificial intelligence.
