Learn about task runners for concurrent, parallel or distributed execution of tasks.
To run tasks concurrently, in parallel, or in a distributed fashion, use the .submit() or .map() methods to submit tasks to a task runner.
The default task runner in Prefect is the ThreadPoolTaskRunner, which runs tasks concurrently in independent threads.
For truly parallel or distributed task execution, you must use one of the following task runners, which are available as extras of the prefect library:
- DaskTaskRunner can run tasks using dask.distributed (install prefect[dask])
- RayTaskRunner can run tasks using Ray (install prefect[ray])
To use a specific task runner, pass it as the task_runner keyword argument to the parent flow.
The max_workers parameter of the ThreadPoolTaskRunner controls the number of threads that the task runner uses to execute tasks concurrently.
When you use .submit() to submit a task to a task runner, the task runner creates a PrefectFuture for access to the state and result of the task.
A PrefectFuture is an object that provides:
- access to the result returned by the task
- a State indicating the current state of the task run
PrefectFuture objects must be resolved explicitly before returning from a flow or task.
Dependencies between futures are automatically resolved whenever their dependents are resolved.
This means that only terminal futures need to be resolved, either by:
- calling .wait() or .result() on each terminal future
- using the wait or as_completed utilities to resolve terminal futures
When you pass a future to another task, the downstream task does not receive the PrefectFuture you passed as an argument.
Instead, the downstream task receives the value that the upstream task returned.
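For example, when the future returned by a say_hello task is passed to a print_result task, you only need to wait for the final print_result future, as Prefect automatically resolves say_hello to a string.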
You can access the result of a future explicitly with the .result() method.
The .result() method waits for the task to complete before returning the result to the caller.
If the task run fails, .result() raises the task run’s exception. To disable this behavior, set the raise_on_failure option to False.
Notes on .result():
- .result() is a blocking call. This means that calling .result() will block the caller until the task run completes.
- Only use .result() when you need to interact directly with the return value of your submitted task; for example, you should use .result() if passing the return value to a standard Python function (not a Prefect task), but you do not need to use .result() if you are passing the value to another task (as these futures will be automatically resolved).
Using .map can create many task instances very quickly, which might exceed your resource availability.
For true parallelism rather than just concurrency, consider using the DaskTaskRunner or RayTaskRunner (or propose a new task runner type).
Concurrency and parallelism features like .submit and .map must respect the underlying primitives to avoid unexpected behavior. For example, the default ThreadPoolTaskRunner runs each task in a separate thread, which means that any global state must be threadsafe. Similarly, DaskTaskRunner and RayTaskRunner are multi-process runners that require global state to be picklable.
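Both constraints can be illustrated with the standard library alone. The sketch below uses concurrent.futures.ThreadPoolExecutor as a stand-in for a thread-based task runner; it is not Prefect code, and all names in it are hypothetical. It guards a global counter with a lock so that concurrent updates are safe, and it shows that the lock itself cannot be pickled, which is the kind of object that causes trouble with multi-process runners.

```python
import pickle
import threading
from concurrent.futures import ThreadPoolExecutor

counter = 0
counter_lock = threading.Lock()


def increment(times: int) -> None:
    """Increment the shared counter, holding the lock for each update."""
    global counter
    for _ in range(times):
        with counter_lock:
            counter += 1


def run_increments(workers: int = 4, times: int = 10_000) -> int:
    """Run `increment` in several threads and return the final count."""
    global counter
    counter = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consume the iterator so any worker exception surfaces here.
        list(pool.map(increment, [times] * workers))
    return counter


def is_picklable(obj: object) -> bool:
    """Return True if `obj` can be serialized with pickle."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


if __name__ == "__main__":
    print(run_increments())            # every increment is counted
    print(is_picklable({"a": 1}))      # plain data can be pickled
    print(is_picklable(counter_lock))  # a lock cannot be pickled
```

The lock that makes the counter safe for threads is exactly what a multi-process runner could not ship to another process. Global state therefore has to be designed for the runner you intend to use.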