Airflow's dagrun_timeout

In Apache Airflow, you can set a timeout for an entire DAG run with the dagrun_timeout parameter in the DAG definition. It specifies the maximum amount of time a DAG run is allowed to take; if a run exceeds this limit, it is marked as failed and its running tasks are terminated.

Setting dagrun_timeout

Here's how to set a timeout for a DAG run:

from airflow import DAG
from airflow.operators.empty import EmptyOperator  # DummyOperator was renamed to EmptyOperator in Airflow 2.3
from datetime import datetime, timedelta

# Define the DAG
with DAG(
    'my_dag_with_timeout',
    description='A DAG with a run timeout',
    schedule='@daily',  # schedule_interval was renamed to schedule in Airflow 2.4
    start_date=datetime(2023, 1, 1),
    catchup=False,
    dagrun_timeout=timedelta(hours=2)  # Set a 2-hour timeout for the entire DAG run
) as dag:

    # Define tasks
    start = EmptyOperator(task_id='start')
    end = EmptyOperator(task_id='end')

    # Define task dependencies
    start >> end

Explanation

  • dagrun_timeout: Specifies the maximum wall-clock time allowed for a DAG run. In the example, dagrun_timeout=timedelta(hours=2) limits each run to 2 hours.
  • Behavior on Timeout: If a run exceeds the timeout, it is marked as failed and any still-running tasks are stopped. Note that dagrun_timeout is only enforced for scheduled runs; manually triggered runs are not subject to it.
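Conceptually, the scheduler's check boils down to comparing the run's start time plus dagrun_timeout against the current time. A minimal pure-Python sketch of that comparison (a simplified model, not Airflow's actual scheduler code):

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def dag_run_timed_out(run_start: datetime, dagrun_timeout: timedelta,
                      now: Optional[datetime] = None) -> bool:
    """Return True if a run that started at `run_start` has exceeded
    `dagrun_timeout` (simplified model of the scheduler's check)."""
    now = now or datetime.now(timezone.utc)
    return now - run_start > dagrun_timeout

# A run that started 3 hours ago exceeds a 2-hour dagrun_timeout
start = datetime(2023, 1, 1, 0, 0, tzinfo=timezone.utc)
now = start + timedelta(hours=3)
print(dag_run_timed_out(start, timedelta(hours=2), now))  # True
```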

Additional Timeout Options for Tasks

In addition to dagrun_timeout for the entire DAG, you can set timeouts for individual tasks using the execution_timeout parameter:

from airflow.operators.python import PythonOperator  # the python_operator module path is deprecated
from datetime import timedelta

def my_task():
    # Task logic
    pass

# Define a task with an execution timeout of 30 minutes;
# if it runs longer, the task fails with an AirflowTaskTimeout
task = PythonOperator(
    task_id='my_task',
    python_callable=my_task,
    execution_timeout=timedelta(minutes=30),
    dag=dag
)
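Under the hood, execution_timeout works by wrapping the task's execution in a wall-clock timeout guard. A rough, Unix-only sketch of the same idea using signal.setitimer (an illustration of the mechanism, not Airflow's actual implementation):

```python
import signal
import time
from contextlib import contextmanager

@contextmanager
def wall_clock_limit(seconds: float):
    """Raise TimeoutError if the enclosed block runs longer than `seconds`.
    Unix-only sketch of the mechanism behind execution_timeout."""
    def _handler(signum, frame):
        raise TimeoutError(f"exceeded {seconds}s wall-clock limit")
    old = signal.signal(signal.SIGALRM, _handler)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel the timer
        signal.signal(signal.SIGALRM, old)      # restore previous handler

with wall_clock_limit(1.0):
    pass  # fast work completes normally

try:
    with wall_clock_limit(0.2):
        time.sleep(1)  # runs past the limit
except TimeoutError as exc:
    print(exc)
```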

Summary

  • Use dagrun_timeout for setting a timeout on the entire DAG run.
  • Use execution_timeout for setting a timeout on individual tasks within the DAG.
