With prefect-dbt, you can trigger and observe dbt Cloud jobs, execute dbt Core CLI commands, and incorporate other tools, such as Snowflake, into your dbt runs.
Prefect provides a global view of the state of your workflows and allows you to take action based on state changes.
Prefect integrations may provide pre-built blocks, flows, or tasks for interacting with external systems.
Block types in this library allow you to do things such as run a dbt Cloud job or execute a dbt Core command.
Install a version of prefect-dbt that is compatible with your installed version of prefect.
If you don’t already have prefect installed, the newest version of prefect is installed as well.
You can also upgrade to the latest versions of prefect and prefect-dbt.
Register the block types in the prefect-dbt module to make them available for use.
Use the run_dbt_cloud_job flow to trigger a job run and wait until the job run is finished. If some nodes fail, run_dbt_cloud_job can efficiently retry the unsuccessful nodes. Prior to running this flow, save your dbt Cloud credentials to a DbtCloudCredentials block and create a dbt Cloud Job block.
The account ID for the credentials block appears in the dbt Cloud URL https://cloud.getdbt.com/settings/accounts/<ACCOUNT_ID>.
The job ID for the job block appears in the job URL https://cloud.getdbt.com/deploy/<ACCOUNT_ID>/projects/<PROJECT_ID>/jobs/<JOB_ID>.
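As a convenience when filling in the block fields, the IDs can be pulled out of a job URL programmatically. The following is a minimal sketch using only the standard library. `parse_job_url` is a hypothetical helper, not part of prefect-dbt, and it assumes the IDs are numeric and that the URL follows the `cloud.getdbt.com` pattern shown above (other dbt Cloud hosts would need a different pattern).

```python
import re
from typing import Dict

# Hypothetical helper (not part of prefect-dbt).
# Assumes numeric IDs and the cloud.getdbt.com job URL layout.
JOB_URL_PATTERN = re.compile(
    r"^https://cloud\.getdbt\.com/deploy/(?P<account_id>\d+)"
    r"/projects/(?P<project_id>\d+)/jobs/(?P<job_id>\d+)/?$"
)


def parse_job_url(url: str) -> Dict[str, int]:
    """Return the account, project, and job IDs embedded in a job URL."""
    match = JOB_URL_PATTERN.match(url.strip())
    if match is None:
        raise ValueError(f"Not a recognized dbt Cloud job URL: {url!r}")
    return {name: int(value) for name, value in match.groupdict().items()}


# Example with made-up IDs.
ids = parse_job_url("https://cloud.getdbt.com/deploy/123/projects/456/jobs/789")
print(ids["account_id"], ids["job_id"])
```

The `account_id` value would go into the credentials block and the `job_id` value into the job block.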
Newer versions of prefect-dbt include the PrefectDbtRunner class, which provides an improved interface for running dbt Core commands with better logging, failure handling, and automatic asset lineage.
PrefectDbtRunner is inspired by the DbtRunner from dbt Core, and its invoke method accepts the same arguments.
Refer to the DbtRunner documentation for more information on how to call invoke.
When .invoke() is called in a flow or task, each node in dbt’s execution graph is reflected as a task in Prefect’s execution graph.
Logs from each node will belong to the corresponding task, and each task’s state is determined by the state of that node’s execution.
The tasks created by calling .invoke() run separately from dbt Core and do not affect dbt’s execution behavior.
These tasks do not persist results and cannot be cached.
Use dbt’s native retry functionality in combination with runtime data from prefect to retry failed nodes.
The upstream dependencies of an asset materialized by prefect-dbt are derived from the depends_on field in dbt’s manifest.json.
The asset’s key will be its corresponding dbt resource’s relation_name.
The name and description asset properties are populated by a dbt resource’s name and description.
The owners asset property is populated if there is data assigned to the owner key under a resource’s meta config.
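To make the mapping concrete, here is a minimal sketch that derives those properties from a manifest-shaped dictionary. It is an illustration only, not prefect-dbt's implementation: `asset_properties` and the sample manifest are hypothetical, and how the library formats keys or handles non-string owner values is not covered by the description above.

```python
from typing import Any, Dict

# Illustrative sketch only; this is not how prefect-dbt is implemented.


def asset_properties(manifest: Dict[str, Any], unique_id: str) -> Dict[str, Any]:
    """Derive asset properties for one dbt resource from manifest data."""
    # Models, seeds, and snapshots live under "nodes"; sources under "sources".
    resources = {**manifest.get("nodes", {}), **manifest.get("sources", {})}
    resource = resources[unique_id]

    # The owner is read from the "owner" key of the resource's meta config.
    meta = (resource.get("config") or {}).get("meta") or {}
    owner = meta.get("owner")
    if owner is None:
        owners = []
    elif isinstance(owner, str):
        owners = [owner]
    else:
        owners = list(owner)

    # Upstream assets come from depends_on, keyed by their relation_name.
    upstream = [
        resources[dep]["relation_name"]
        for dep in (resource.get("depends_on") or {}).get("nodes", [])
        if resources.get(dep, {}).get("relation_name")
    ]

    return {
        "key": resource["relation_name"],
        "name": resource["name"],
        "description": resource.get("description", ""),
        "owners": owners,
        "upstream_keys": upstream,
    }


# Made-up manifest fragment for demonstration.
manifest = {
    "sources": {
        "source.shop.raw.orders": {
            "name": "orders",
            "relation_name": '"raw"."orders"',
        }
    },
    "nodes": {
        "model.shop.stg_orders": {
            "name": "stg_orders",
            "description": "Cleaned orders",
            "relation_name": '"analytics"."stg_orders"',
            "depends_on": {"nodes": ["source.shop.raw.orders"]},
            "config": {"meta": {"owner": "data-team"}},
        }
    },
}

print(asset_properties(manifest, "model.shop.stg_orders"))
```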
The PrefectDbtSettings class, based on Pydantic’s BaseSettings class, automatically detects DBT_-prefixed environment variables that have a direct effect on the PrefectDbtRunner class.
If no environment variables are set, dbt’s defaults are used.
Provide a PrefectDbtSettings instance to PrefectDbtRunner to customize dbt settings or override environment variables.
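The precedence described here (explicit values over DBT_-prefixed environment variables, with anything unset left to dbt's defaults) can be sketched with the standard library. This is not the PrefectDbtSettings class itself: `collect_dbt_settings` is a hypothetical stand-in, and the field names in `KNOWN_FIELDS` are assumptions chosen for illustration.

```python
import os
from typing import Any, Dict, Mapping, Optional

# Assumed field names for illustration; check the library for the real set.
KNOWN_FIELDS = ("profiles_dir", "project_dir", "log_level", "target_path")


def collect_dbt_settings(
    environ: Optional[Mapping[str, str]] = None, **overrides: Any
) -> Dict[str, Any]:
    """Merge DBT_-prefixed environment variables with explicit overrides."""
    env = os.environ if environ is None else environ
    settings: Dict[str, Any] = {}

    # Pick up only the environment variables that map to a known field.
    for field in KNOWN_FIELDS:
        value = env.get(f"DBT_{field.upper()}")
        if value is not None:
            settings[field] = value

    # Explicit arguments win over the environment.
    for field, value in overrides.items():
        if field not in KNOWN_FIELDS:
            raise TypeError(f"Unknown setting: {field!r}")
        settings[field] = value

    # Fields missing from the result are left to dbt's own defaults.
    return settings


env = {"DBT_PROFILES_DIR": "/srv/dbt", "DBT_LOG_LEVEL": "warn", "PATH": "/usr/bin"}
print(collect_dbt_settings(env))
print(collect_dbt_settings(env, log_level="error"))
```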
The PrefectDbtRunner class maps all dbt log levels to standard Python logging levels, so filtering for log levels like WARNING or ERROR in the Prefect UI applies to dbt’s logs.
By default, the logging level used by dbt is Prefect’s logging level, which can be configured using the PREFECT_LOGGING_LEVEL Prefect setting.
The dbt logging level can be set independently from Prefect’s by using the DBT_LOG_LEVEL environment variable, setting log_level in PrefectDbtSettings, or passing the --log-level flag or log_level kwarg to .invoke().
Only logging levels of higher severity (more restrictive) than Prefect’s logging level will have an effect.
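One way to read this rule is that the level in effect is the more restrictive of the two. The sketch below illustrates that interpretation with Python's `logging` constants. `effective_dbt_level` and the `DBT_TO_PYTHON_LEVEL` table are hypothetical, written for illustration rather than taken from prefect-dbt.

```python
import logging
from typing import Optional

# Assumed mapping from dbt level names to Python logging levels.
DBT_TO_PYTHON_LEVEL = {
    "debug": logging.DEBUG,
    "info": logging.INFO,
    "warn": logging.WARNING,
    "error": logging.ERROR,
}


def effective_dbt_level(prefect_level: str, dbt_level: Optional[str] = None) -> int:
    """Return the numeric level that would apply to dbt's logs."""
    prefect_numeric = getattr(logging, prefect_level.upper())
    if dbt_level is None:
        # No dbt-specific level: dbt inherits Prefect's level.
        return prefect_numeric
    dbt_numeric = DBT_TO_PYTHON_LEVEL[dbt_level.lower()]
    # A less restrictive dbt level has no effect, so take the higher one.
    return max(prefect_numeric, dbt_numeric)


for dbt_level in (None, "debug", "error"):
    level = effective_dbt_level("INFO", dbt_level)
    print(dbt_level, "->", logging.getLevelName(level))
```

Under this reading, setting dbt to `debug` while Prefect is at `INFO` changes nothing, while setting dbt to `error` suppresses dbt's informational and warning logs.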
profiles.yml templating
The PrefectDbtRunner class supports templating in your profiles.yml file, allowing you to reference Prefect blocks and variables that will be resolved at runtime.
This enables you to store sensitive credentials securely using Prefect blocks, and configure different targets based on the Prefect workspace.
For example, a Prefect variable called target can have a different value in development (dev) and production (prod) workspaces.
This allows you to use the same profiles.yml file to automatically reference a local DuckDB instance in development and a Snowflake instance in production.
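The idea of one profiles.yml serving several workspaces can be sketched as plain string substitution. This is only an illustration: the `{{ prefect.variables.target }}` placeholder syntax is an assumption here, `resolve_variables` is a hypothetical stand-in for the resolution PrefectDbtRunner performs, and the profile text is abbreviated (a real Snowflake output needs connection details).

```python
import re
from typing import Mapping

# Assumed placeholder syntax; verify against the prefect-dbt documentation.
PLACEHOLDER = re.compile(r"\{\{\s*prefect\.variables\.(?P<name>[\w-]+)\s*\}\}")

# Abbreviated profile used only for demonstration.
PROFILE_TEMPLATE = """\
my_project:
  target: "{{ prefect.variables.target }}"
  outputs:
    dev:
      type: duckdb
      path: dev.duckdb
    prod:
      type: snowflake
"""


def resolve_variables(text: str, variables: Mapping[str, str]) -> str:
    """Replace variable placeholders with values from the given mapping."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group("name")
        if name not in variables:
            raise KeyError(f"No value for variable {name!r}")
        return str(variables[name])

    return PLACEHOLDER.sub(substitute, text)


# The same template selects a different target per workspace.
print(resolve_variables(PROFILE_TEMPLATE, {"target": "dev"}))
print(resolve_variables(PROFILE_TEMPLATE, {"target": "prod"}))
```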
PrefectDbtRunner’s raise_on_failure option can be set to False to prevent failures in dbt from causing the failure of the flow or task in which .invoke() is called.
prefect-dbt supports a couple of ways to run dbt Core commands.
A DbtCoreOperation block will run the commands as shell commands, while other tasks use dbt’s Programmatic Invocation.
Optionally, specify the project_dir.
If profiles_dir is not set, the DBT_PROFILES_DIR environment variable will be used.
If DBT_PROFILES_DIR is not set, the default directory, $HOME/.dbt/, will be used.
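The lookup order for the profiles directory can be written out as a small function. This is a sketch of the documented fallback chain, not code from prefect-dbt. `resolve_profiles_dir` is a hypothetical name, and the optional `environ` argument exists only so the behavior can be tried without touching the real environment.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Union


def resolve_profiles_dir(
    profiles_dir: Optional[Union[str, Path]] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Path:
    """Return the directory where profiles.yml is expected to live."""
    env = os.environ if environ is None else environ

    # 1. An explicitly provided profiles_dir takes priority.
    if profiles_dir is not None:
        return Path(profiles_dir).expanduser()

    # 2. Otherwise fall back to the DBT_PROFILES_DIR environment variable.
    from_env = env.get("DBT_PROFILES_DIR")
    if from_env:
        return Path(from_env).expanduser()

    # 3. Finally, use the default directory under the user's home.
    return Path.home() / ".dbt"


print(resolve_profiles_dir("/opt/project/profiles", {}))
print(resolve_profiles_dir(None, {"DBT_PROFILES_DIR": "/srv/dbt"}))
print(resolve_profiles_dir(None, {}))
```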
If you have an existing profiles.yml file, specify the profiles_dir where the file is located.
Alternatively, create profiles.yml with a DbtCliProfile block.
To supply a secret value to profiles.yml, set a Prefect Secret block as an environment variable.
Your profiles.yml file could then access that variable.
Create a profiles.yml file with blocks
If you don’t have a profiles.yml file, you can use a DbtCliProfile block to create profiles.yml.
Then, specify profiles_dir where profiles.yml will be written.
Here’s example code with placeholders:
Supplying the dbt_cli_profile argument will overwrite existing profiles.yml files
If you already have a profiles.yml file in the specified profiles_dir, the file will be overwritten. If you do not specify a profiles directory, profiles.yml at ~/.dbt/ would be overwritten.
Service-specific profiles are available as TargetConfigs blocks.
If the desired service profile is not available, you can build one from the generic TargetConfigs class.
prefect-dbt has some pre-built tasks that use dbt’s programmatic invocation.
For example:
Refer to the prefect-dbt SDK documentation to explore all the capabilities of the prefect-dbt library.