prefect_databricks.models.jobs

Classes

AutoScale

See source code for the fields’ description.

AwsAttributes

See source code for the fields’ description.

CanManage

Permission to manage the job.

CanManageRun

Permission to run and/or manage runs for the job.

CanView

Permission to view the settings of the job.

ClusterCloudProviderNodeStatus

  • NotEnabledOnSubscription: Node type not available for subscription.
  • NotAvailableInRegion: Node type not available in region.

ClusterEventType

  • CREATING: Indicates that the cluster is being created.
  • DID_NOT_EXPAND_DISK: Indicates that a disk is low on space, but adding disks would put it over the max capacity.
  • EXPANDED_DISK: Indicates that a disk was low on space and the disks were expanded.
  • FAILED_TO_EXPAND_DISK: Indicates that a disk was low on space and disk space could not be expanded.
  • INIT_SCRIPTS_STARTING: Indicates that the cluster scoped init script has started.
  • INIT_SCRIPTS_FINISHED: Indicates that the cluster scoped init script has finished.
  • STARTING: Indicates that the cluster is being started.
  • RESTARTING: Indicates that the cluster is being restarted.
  • TERMINATING: Indicates that the cluster is being terminated.
  • EDITED: Indicates that the cluster has been edited.
  • RUNNING: Indicates the cluster has finished being created. Includes the number of nodes in the cluster and a failure reason if some nodes could not be acquired.
  • RESIZING: Indicates a change in the target size of the cluster (upsize or downsize).
  • UPSIZE_COMPLETED: Indicates that nodes finished being added to the cluster. Includes the number of nodes in the cluster and a failure reason if some nodes could not be acquired.
  • NODES_LOST: Indicates that some nodes were lost from the cluster.
  • DRIVER_HEALTHY: Indicates that the driver is healthy and the cluster is ready for use.
  • DRIVER_UNAVAILABLE: Indicates that the driver is unavailable.
  • SPARK_EXCEPTION: Indicates that a Spark exception was thrown from the driver.
  • DRIVER_NOT_RESPONDING: Indicates that the driver is up but is not responsive, likely due to GC.
  • DBFS_DOWN: Indicates that the driver is up but DBFS is down.
  • METASTORE_DOWN: Indicates that the driver is up but the metastore is down.
  • NODE_BLACKLISTED: Indicates that a node is not allowed by Spark.
  • PINNED: Indicates that the cluster was pinned.
  • UNPINNED: Indicates that the cluster was unpinned.

ClusterInstance

See source code for the fields’ description.

ClusterSize

See source code for the fields’ description.

ClusterSource

  • UI: Cluster created through the UI.
  • JOB: Cluster created by the Databricks job scheduler.
  • API: Cluster created through an API call.

ClusterState

  • PENDING: Indicates that a cluster is in the process of being created.
  • RUNNING: Indicates that a cluster has been started and is ready for use.
  • RESTARTING: Indicates that a cluster is in the process of restarting.
  • RESIZING: Indicates that a cluster is in the process of adding or removing nodes.
  • TERMINATING: Indicates that a cluster is in the process of being destroyed.
  • TERMINATED: Indicates that a cluster has been successfully destroyed.
  • ERROR: This state is no longer used. It was used to indicate a cluster that failed to be created. TERMINATING and TERMINATED are used instead.
  • UNKNOWN: Indicates that a cluster is in an unknown state. A cluster should never be in this state.

ClusterTag

See source code for the fields’ description. An object with key value pairs. The key length must be between 1 and 127 UTF-8 characters, inclusive. The value length must be less than or equal to 255 UTF-8 characters. For a list of all restrictions, see AWS Tag Restrictions: <https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Using_Tags.html#tag-restrictions>
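The length limits above can be checked directly. The helper below is illustrative only, not part of prefect_databricks, and it counts characters as the description states; whether the limits are enforced per character or per UTF-8 byte is an assumption to verify against the source.

```python
# Hypothetical helper mirroring the documented ClusterTag constraints:
# key length 1-127 characters, value length <= 255 characters.
def validate_cluster_tag(key: str, value: str) -> bool:
    """Return True if the tag satisfies the documented length limits."""
    key_ok = 1 <= len(key) <= 127
    value_ok = len(value) <= 255
    return key_ok and value_ok

print(validate_cluster_tag("team", "data-engineering"))  # True
print(validate_cluster_tag("", "x"))                     # False: empty key
```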

CronSchedule

See source code for the fields’ description.

DbfsStorageInfo

See source code for the fields’ description.

DbtOutput

See source code for the fields’ description.

DbtTask

See source code for the fields’ description.

DockerBasicAuth

See source code for the fields’ description.

DockerImage

See source code for the fields’ description.

Error

See source code for the fields’ description.

FileStorageInfo

See source code for the fields’ description.

GitSnapshot

See source code for the fields’ description. Read-only state of the remote repository at the time the job was run. This field is only included on job runs.

GitSource

See source code for the fields’ description. This functionality is in Public Preview. An optional specification for a remote repository containing the notebooks used by this job’s notebook tasks.
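As a sketch of what a remote-repository specification looks like, the dict below follows the Databricks Jobs API `git_source` payload; treat the field names as assumptions with respect to this model's exact schema.

```python
# Illustrative git_source payload (Databricks Jobs API shape, not a
# guaranteed match for the GitSource model's field names).
git_source = {
    "git_url": "https://github.com/example/notebooks",  # hypothetical repo
    "git_provider": "gitHub",
    "git_branch": "main",  # typically one of branch, tag, or commit is set
}
```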

GitSource1

See source code for the fields’ description.

GroupName

See source code for the fields’ description.

IsOwner

Permission that represents ownership of the job.

JobEmailNotifications

See source code for the fields’ description.

LibraryInstallStatus

  • PENDING: No action has yet been taken to install the library. This state should be very short lived.
  • RESOLVING: Metadata necessary to install the library is being retrieved from the provided repository. For Jar, Egg, and Whl libraries, this step is a no-op.
  • INSTALLING: The library is actively being installed, either by adding resources to Spark or executing system commands inside the Spark nodes.
  • INSTALLED: The library has been successfully installed.
  • SKIPPED: Installation on a Databricks Runtime 7.0 or above cluster was skipped due to Scala version incompatibility.
  • FAILED: Some step in installation failed. More information can be found in the messages field.
  • UNINSTALL_ON_RESTART: The library has been marked for removal. Libraries can be removed only when clusters are restarted, so libraries that enter this state remain until the cluster is restarted.

ListOrder

  • DESC: Descending order.
  • ASC: Ascending order.

RuntimeEngine

Decides which runtime engine to use, e.g. Standard vs. Photon. If unspecified, the runtime engine is inferred from spark_version.
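A minimal sketch of where this setting sits in a new-cluster payload. The "PHOTON"/"STANDARD" values follow the Databricks API; the surrounding field names and values are assumptions for illustration.

```python
# Illustrative cluster spec; omit runtime_engine to have it inferred
# from spark_version, as described above.
new_cluster = {
    "spark_version": "11.3.x-scala2.12",  # assumed example version string
    "node_type_id": "i3.xlarge",
    "num_workers": 2,
    "runtime_engine": "PHOTON",
}
```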

LogSyncStatus

See source code for the fields’ description.

MavenLibrary

See source code for the fields’ description.

NotebookOutput

See source code for the fields’ description.

NotebookTask

See source code for the fields’ description.

ParameterPair

See source code for the fields’ description. An object with additional information about why a cluster was terminated. Each key is one of the TerminationParameter values, and each value is the corresponding termination information.

PermissionLevel

See source code for the fields’ description.

PermissionLevelForGroup

See source code for the fields’ description.

PipelineTask

See source code for the fields’ description.

PoolClusterTerminationCode

  • INSTANCE_POOL_MAX_CAPACITY_FAILURE: The pool max capacity has been reached.
  • INSTANCE_POOL_NOT_FOUND_FAILURE: The pool specified by the cluster is no longer active or doesn’t exist.

PythonPyPiLibrary

See source code for the fields’ description.

PythonWheelTask

See source code for the fields’ description.

RCranLibrary

See source code for the fields’ description.

RepairRunInput

See source code for the fields’ description.

ResizeCause

  • AUTOSCALE: Automatically resized based on load.
  • USER_REQUEST: User requested a new size.
  • AUTORECOVERY: Autorecovery monitor resized the cluster after it lost a node.

RunLifeCycleState

  • PENDING: The run has been triggered. If there is not already an active run of the same job, the cluster and execution context are being prepared. If there is already an active run of the same job, the run immediately transitions into the SKIPPED state without preparing any resources.
  • RUNNING: The task of this run is being executed.
  • TERMINATING: The task of this run has completed, and the cluster and execution context are being cleaned up.
  • TERMINATED: The task of this run has completed, and the cluster and execution context have been cleaned up. This state is terminal.
  • SKIPPED: This run was aborted because a previous run of the same job was already active. This state is terminal.
  • INTERNAL_ERROR: An exceptional state that indicates a failure in the Jobs service, such as network failure over a long period. If a run on a new cluster ends in the INTERNAL_ERROR state, the Jobs service terminates the cluster as soon as possible. This state is terminal.
  • BLOCKED: The run is blocked on an upstream dependency.
  • WAITING_FOR_RETRY: The run is waiting for a retry.

RunNowInput

See source code for the fields’ description.

PipelineParams

See source code for the fields’ description.

RunParameters

See source code for the fields’ description.

RunResultState

  • SUCCESS: The task completed successfully.
  • FAILED: The task completed with an error.
  • TIMEDOUT: The run was stopped after reaching the timeout.
  • CANCELED: The run was canceled at user request.

RunState

See source code for the fields’ description. The result and lifecycle state of the run.

RunType

The type of the run.

S3StorageInfo

See source code for the fields’ description.

ServicePrincipalName

See source code for the fields’ description.

SparkConfPair

See source code for the fields’ description. An arbitrary object where the object key is a configuration property name and the value is a configuration property value.

SparkEnvPair

See source code for the fields’ description. An arbitrary object where the object key is an environment variable name and the value is an environment variable value.
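Both SparkConfPair and SparkEnvPair are plain string-to-string mappings. The specific keys below are common Spark settings chosen for illustration, not values this module prescribes.

```python
# Example string-to-string mappings of the kind these pair types describe.
spark_conf = {"spark.speculation": "true"}
spark_env_vars = {"PYSPARK_PYTHON": "/databricks/python3/bin/python3"}

# Every key and value is a string.
assert all(
    isinstance(k, str) and isinstance(v, str)
    for d in (spark_conf, spark_env_vars)
    for k, v in d.items()
)
```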

SparkJarTask

See source code for the fields’ description.

SparkNodeAwsAttributes

See source code for the fields’ description.

SparkPythonTask

See source code for the fields’ description.

SparkSubmitTask

See source code for the fields’ description.

SparkVersion

See source code for the fields’ description.

SqlOutputError

See source code for the fields’ description.

SqlStatementOutput

See source code for the fields’ description.

SqlTaskAlert

See source code for the fields’ description.

SqlTaskDashboard

See source code for the fields’ description.

SqlTaskQuery

See source code for the fields’ description.

TaskDependency

See source code for the fields’ description.

TaskDependencies

See source code for the fields’ description. An optional array of objects specifying the dependency graph of the task. All tasks specified in this field must complete successfully before executing this task. The key is task_key, and the value is the name assigned to the dependent task. This field is required when a job consists of more than one task.
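A two-task dependency graph in the shape the description implies, following the Databricks Jobs API 2.1 `depends_on` format; the task names and surrounding structure are assumptions for illustration.

```python
# "transform" runs only after "ingest" completes successfully.
tasks = [
    {"task_key": "ingest"},
    {"task_key": "transform", "depends_on": [{"task_key": "ingest"}]},
]
print(tasks[1]["depends_on"][0]["task_key"])  # ingest
```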

TaskDescription

See source code for the fields’ description.

TaskKey

See source code for the fields’ description.

TerminationCode

  • USER_REQUEST: A user terminated the cluster directly. Parameters should include a username field that indicates the specific user who terminated the cluster.
  • JOB_FINISHED: The cluster was launched by a job, and terminated when the job completed.
  • INACTIVITY: The cluster was terminated since it was idle.
  • CLOUD_PROVIDER_SHUTDOWN: The instance that hosted the Spark driver was terminated by the cloud provider. In AWS, for example, AWS may retire instances and directly shut them down. Parameters should include an aws_instance_state_reason field indicating the AWS-provided reason why the instance was terminated.
  • COMMUNICATION_LOST: Databricks lost connection to services on the driver instance. For example, this can happen when problems arise in cloud networking infrastructure, or when the instance itself becomes unhealthy.
  • CLOUD_PROVIDER_LAUNCH_FAILURE: Databricks experienced a cloud provider failure when requesting instances to launch clusters. For example, AWS limits the number of running instances and EBS volumes. If you ask Databricks to launch a cluster that requires instances or EBS volumes that exceed your AWS limit, the cluster fails with this status code. Parameters should include one of aws_api_error_code, aws_instance_state_reason, or aws_spot_request_status to indicate the AWS-provided reason why Databricks could not request the required instances for the cluster.
  • SPARK_STARTUP_FAILURE: The cluster failed to initialize. Possible reasons may include failure to create the environment for Spark or issues launching the Spark master and worker processes.
  • INVALID_ARGUMENT: Cannot launch the cluster because the user specified an invalid argument. For example, the user might specify an invalid runtime version for the cluster.
  • UNEXPECTED_LAUNCH_FAILURE: While launching this cluster, Databricks failed to complete critical setup steps, terminating the cluster.
  • INTERNAL_ERROR: Databricks encountered an unexpected error that forced the running cluster to be terminated. Contact Databricks support for additional details.
  • SPARK_ERROR: The Spark driver failed to start. Possible reasons may include incompatible libraries and initialization scripts that corrupted the Spark container.
  • METASTORE_COMPONENT_UNHEALTHY: The cluster failed to start because the external metastore could not be reached. Refer to Troubleshooting.
  • DBFS_COMPONENT_UNHEALTHY: The cluster failed to start because Databricks File System (DBFS) could not be reached.
  • DRIVER_UNREACHABLE: Databricks was not able to access the Spark driver, because it was not reachable.
  • DRIVER_UNRESPONSIVE: Databricks was not able to access the Spark driver, because it was unresponsive.
  • INSTANCE_UNREACHABLE: Databricks was not able to access instances in order to start the cluster. This can be a transient networking issue. If the problem persists, this usually indicates a networking environment misconfiguration.
  • CONTAINER_LAUNCH_FAILURE: Databricks was unable to launch containers on worker nodes for the cluster. Have your admin check your network configuration.
  • INSTANCE_POOL_CLUSTER_FAILURE: A failure specific to pool-backed clusters. Refer to Pools for details.
  • REQUEST_REJECTED: Databricks cannot handle the request at this moment. Try again later and contact Databricks if the problem persists.
  • INIT_SCRIPT_FAILURE: Databricks cannot load and run a cluster-scoped init script on one of the cluster’s nodes, or the init script terminates with a non-zero exit code. Refer to Init script logs.
  • TRIAL_EXPIRED: The Databricks trial subscription expired.

TerminationParameter

See source code for the fields’ description.

TerminationType

  • SUCCESS: Termination succeeded.
  • CLIENT_ERROR: Non-retriable. Client must fix parameters before reattempting the cluster creation.
  • SERVICE_FAULT: Databricks service issue. Client can retry.
  • CLOUD_FAILURE: Cloud provider infrastructure issue. Client can retry after the underlying issue is resolved.

TriggerType

  • PERIODIC: Schedules that periodically trigger runs, such as a cron scheduler.
  • ONE_TIME: One time triggers that fire a single run. This occurs when you trigger a single run on demand through the UI or the API.
  • RETRY: Indicates a run that is triggered as a retry of a previously failed run. This occurs when you request to re-run the job in case of failures.

UserName

See source code for the fields’ description.

ViewType

  • NOTEBOOK: Notebook view item.
  • DASHBOARD: Dashboard view item.

ViewsToExport

  • CODE: Code view of the notebook.
  • DASHBOARDS: All dashboard views of the notebook.
  • ALL: All views of the notebook.

OnFailureItem

See source code for the fields’ description.

OnStartItem

See source code for the fields’ description.

OnSucces

See source code for the fields’ description.

WebhookNotifications

See source code for the fields’ description.

AccessControlRequestForGroup

See source code for the fields’ description.

AccessControlRequestForServicePrincipal

See source code for the fields’ description.

AccessControlRequestForUser

See source code for the fields’ description.

ClusterCloudProviderNodeInfo

See source code for the fields’ description.

ClusterLogConf

See source code for the fields’ description.

InitScriptInfo

See source code for the fields’ description.

Library

See source code for the fields’ description.

LibraryFullStatus

See source code for the fields’ description.

NewCluster

See source code for the fields’ description.

NodeType

See source code for the fields’ description.

RepairHistoryItem

See source code for the fields’ description.

SparkNode

See source code for the fields’ description.

SqlAlertOutput

See source code for the fields’ description.

SqlDashboardWidgetOutput

See source code for the fields’ description.

SqlQueryOutput

See source code for the fields’ description.

SqlTask

See source code for the fields’ description.

TerminationReason

See source code for the fields’ description.

ViewItem

See source code for the fields’ description.

AccessControlRequest

See source code for the fields’ description.

ClusterAttributes

See source code for the fields’ description.

ClusterInfo

See source code for the fields’ description.

ClusterLibraryStatuses

See source code for the fields’ description.

ClusterSpec

See source code for the fields’ description.

EventDetails

See source code for the fields’ description.

JobCluster

See source code for the fields’ description.

JobTask

See source code for the fields’ description.

JobTaskSettings

See source code for the fields’ description.

RepairHistory

See source code for the fields’ description.

RunSubmitTaskSettings

See source code for the fields’ description.

RunTask

See source code for the fields’ description.

SqlDashboardOutput

See source code for the fields’ description.

SqlOutput

See source code for the fields’ description.

AccessControlList

See source code for the fields’ description.

ClusterEvent

See source code for the fields’ description.

JobParameter

See source code for the fields’ description.

JobSettings

See source code for the fields’ description.

Run

See source code for the fields’ description.

RunSubmitSettings

See source code for the fields’ description.

RunJobParameter

See source code for the fields’ description.

Job

See source code for the fields’ description.