Caching refers to the ability of a task run to enter a Completed state and return a predetermined value without actually running the code that defines the task.
Caching allows you to efficiently reuse results of tasks that may be expensive to compute
and ensure that your pipelines are idempotent when retrying them due to unexpected failure.
By default Prefect’s caching logic is based on the following attributes of a task invocation:

- the inputs provided to the task
- the code definition of the task
- the prevailing flow run ID

These values are combined to compute the task’s cache key.
Caching requires result persistence, which is off by default. To turn on result persistence for all of your tasks, use the PREFECT_RESULTS_PERSIST_BY_DEFAULT setting.
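With the Prefect CLI, for instance, this is a sketch (assuming a Prefect 3.x installation):

```bash
# Persist results for all tasks so cached values can be stored and reused
prefect config set PREFECT_RESULTS_PERSIST_BY_DEFAULT=true
```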
When a task run begins, Prefect checks its configured result storage for a record matching the task’s computed cache key. If one exists, the task enters a Cached state with the corresponding result value.
Cache keys can be shared by the same task across different flows, and even among different tasks,
so long as they all share a common result storage location.
By default Prefect stores results locally in ~/.prefect/storage/. The filenames in this directory will correspond exactly to computed cache keys from your task runs.
If result persistence is turned off for a task (for example with persist_result=False), caching is turned off as well.

Cache key computation can be customized through cache policies. Prefect comes prepackaged with several common policies:

- DEFAULT: this cache policy uses the task’s inputs, its code definition, as well as the prevailing flow run ID to compute the task’s cache key.
- INPUTS: this cache policy uses only the task’s inputs to compute the cache key.
- TASK_SOURCE: this cache policy only considers the raw lines of code in the task (and not the source code of nested tasks) to compute the cache key.
- FLOW_PARAMETERS: this cache policy uses only the parameter values provided to the parent flow run to compute the cache key.
- NO_CACHE: this cache policy always returns None and therefore avoids caching and result persistence altogether.

These policies can be selected with the cache_policy keyword on the task decorator.
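For instance, a minimal sketch selecting the INPUTS policy (the task itself is illustrative):

```python
from prefect import task
from prefect.cache_policies import INPUTS


@task(cache_policy=INPUTS)
def add_one(x: int) -> int:
    # With INPUTS, repeated calls with the same x reuse the cached result
    return x + 1
```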
Cache keys can optionally be given an expiration through the cache_expiration keyword on the task decorator.
This keyword accepts a datetime.timedelta
specifying a duration for which the cached value should be
considered valid.
Providing an expiration value results in Prefect persisting an expiration timestamp alongside the result
record for the task.
This expiration is then applied to all other tasks that may share this cache key.
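As a sketch (the task and duration are illustrative), combining the INPUTS policy with a one-day expiration might look like:

```python
from datetime import timedelta

from prefect import task
from prefect.cache_policies import INPUTS


@task(cache_policy=INPUTS, cache_expiration=timedelta(days=1))
def fetch_report(day: str) -> str:
    # The cached value for a given day is considered valid for 24 hours
    return f"report for {day}"
```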
Cache policies can also be composed: all of the built-in policies except NO_CACHE can be added together to form new policies that combine the individual policies’ logic into a larger cache key computation. Combining policies in this way results in caches that are easier to invalidate.
For example, adding TASK_SOURCE and INPUTS yields a task that reruns anytime you provide a new value for its input x, or anytime you change the underlying code.
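A sketch of such a composed policy (the task body and values are illustrative):

```python
from prefect import task
from prefect.cache_policies import INPUTS, TASK_SOURCE


@task(cache_policy=TASK_SOURCE + INPUTS)
def my_cached_task(x: int) -> int:
    # Reruns when x changes or when this function's source changes
    print("running...")
    return x + 42
```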
The INPUTS policy is special in that it allows you to subtract string values to ignore certain task inputs.
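A sketch, assuming an input named debug that should not affect the cache key (the task body is illustrative):

```python
from prefect import task
from prefect.cache_policies import INPUTS

# Exclude the "debug" input from cache key computation
my_custom_policy = INPUTS - "debug"


@task(cache_policy=my_custom_policy)
def my_cached_task(x: int, debug: bool = False):
    print("running...")
    return x + 42
```

With this policy, changing only debug would not invalidate the cache.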
For fully custom behavior you can provide a cache key function, a callable that accepts two positional arguments:

- The first argument corresponds to the TaskRunContext, which stores task run metadata. For example, this object has attributes task_run_id, flow_run_id, and task, all of which can be used in your custom logic.
- The second argument corresponds to a dictionary of input values to the task. For example, if your task has the signature fn(x, y, z) then the dictionary will have keys “x”, “y”, and “z” with corresponding values that can be used to compute your cache key.

This function is provided through the cache_key_fn argument on the task decorator.
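For example, a minimal sketch of a cache key function that always returns a constant key (the function and task names are illustrative):

```python
from prefect import task


def static_cache_key(context, parameters):
    # context is the TaskRunContext; parameters is the dict of task inputs.
    # Returning a constant key means every run shares a single cache record.
    return "static cache key"


@task(cache_key_fn=static_cache_key)
def my_cached_task(x: int) -> int:
    return x + 1
```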
Configuring a cache policy with a key_storage argument allows cache records to be stored separately from task results. When cache key storage is configured, persisted task results will only include the return value of your task, and cache records can be deleted or modified without affecting your task results.
You can configure where cache records are stored by using the .configure
method with a key_storage
argument on a cache policy.
The key_storage
argument accepts either a path to a local directory or a storage block.
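As a sketch, assuming a local directory path for cache records (the path and task are illustrative):

```python
from prefect import task
from prefect.cache_policies import INPUTS

# Keep cache records in their own directory, separate from task results
cache_policy = INPUTS.configure(key_storage="/path/to/cache/records")


@task(cache_policy=cache_policy)
def expensive_computation(x: int) -> int:
    return x * 2
```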
Cache records support two isolation levels that control how concurrent task runs interact with them: READ_COMMITTED and SERIALIZABLE.
By default, cache records operate with a READ_COMMITTED
isolation level. This guarantees that reading a cache record will see the latest committed cache value,
but allows multiple executions of the same task to occur simultaneously.
For stricter isolation, you can use the SERIALIZABLE
isolation level. This ensures that only one execution of a task occurs at a time for a given cache
record via a locking mechanism.
To configure the isolation level, use the .configure
method with an isolation_level
argument on a cache policy. When using SERIALIZABLE
, you must
also provide a lock_manager
that implements locking logic for your system.
| Execution Context | Recommended Lock Manager | Notes |
|---|---|---|
| Threads/Coroutines | MemoryLockManager | In-memory locking suitable for single-process execution |
| Processes | FileSystemLockManager | File-based locking for multiple processes on the same machine |
| Multiple Machines | RedisLockManager | Distributed locking via Redis for cross-machine coordination |
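Putting this together, here is a sketch of a SERIALIZABLE policy using the in-memory lock manager from the table above; the import paths (prefect.locking.memory and prefect.transactions) are assumptions based on recent Prefect 3.x releases and may differ in your version:

```python
from prefect import task
from prefect.cache_policies import INPUTS
from prefect.locking.memory import MemoryLockManager
from prefect.transactions import IsolationLevel

# SERIALIZABLE isolation requires a lock manager; MemoryLockManager is
# suitable for threads and coroutines within a single process
cache_policy = INPUTS.configure(
    isolation_level=IsolationLevel.SERIALIZABLE,
    lock_manager=MemoryLockManager(),
)


@task(cache_policy=cache_policy)
def one_at_a_time(x: int) -> int:
    return x + 1
```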