While running multiple independent tasks in parallel which in fact is the primary function of map_task
map_task
in Flyte allows for the execution of the task across a series of inputs within a single workflow node. As a result, there is no need to create individual nodes for each instance which increases the performance and stability in the system.
Here is what Flyte says about the use cases of map_task
:
Executing the same code logic on multiple inputs
Concurrent processing of multiple data batches
Hyperparameter optimization
To know more, visit the following links:
The code that caused the error is as follows:
from flytekit import map_task, task, workflow
@task
def do_something(value: str) -> str:
print(f"launched: {value}", flush=True)
time.sleep(60) # fakes long process time
return f"{value}-processed"
@workflow
def do_multiple_things() -> list[str]:
values = ["foo", "bar", "baz"]
return map_task(do_something)(value=values)
So here is the solution:
Local runs will not do parallel just yet. (Making flyte-kit execute local runs in parallel is part of a broader project that flyte has plans for someday, but no definite timeline).