Airflow's dagrun_timeout
In Apache Airflow, you can set a timeout for an entire DAG run using the dagrun_timeout parameter in the DAG definition. It specifies the maximum wall-clock time a single DAG run is allowed to take. If a run exceeds this limit, it is marked as failed and its still-running tasks are stopped. Note that in many Airflow versions this timeout is only enforced for scheduled DAG runs, not for manually triggered ones.
Setting dagrun_timeout
Here's how to set a timeout for a DAG run:
```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.empty import EmptyOperator  # DummyOperator in Airflow < 2.3

# Define the DAG
with DAG(
    'my_dag_with_timeout',
    description='A DAG with a run timeout',
    schedule_interval='@daily',
    start_date=datetime(2023, 1, 1),
    catchup=False,
    dagrun_timeout=timedelta(hours=2),  # fail the run if it exceeds 2 hours
) as dag:
    # Define tasks
    start = EmptyOperator(task_id='start')
    end = EmptyOperator(task_id='end')

    # Define task dependencies
    start >> end
```
Explanation
- dagrun_timeout: specifies the maximum time allowed for a single DAG run. In the example, dagrun_timeout=timedelta(hours=2) limits each run to 2 hours.
- Behavior on timeout: if the DAG run exceeds the timeout, it is marked as failed and any running tasks are stopped.
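Conceptually, the scheduler's timeout check is a simple comparison between the run's elapsed time and the configured limit. The sketch below illustrates that idea with the standard library only; it is a simplification, not Airflow's actual scheduler code, and the function name is made up for illustration:

```python
from datetime import datetime, timedelta

def dag_run_timed_out(start_date: datetime, dagrun_timeout: timedelta,
                      now: datetime) -> bool:
    """Return True if a running DAG run has exceeded its timeout."""
    return now - start_date > dagrun_timeout

# A run that started at midnight with a 2-hour timeout:
start = datetime(2023, 1, 1, 0, 0)
timeout = timedelta(hours=2)

print(dag_run_timed_out(start, timeout, datetime(2023, 1, 1, 1, 30)))  # False
print(dag_run_timed_out(start, timeout, datetime(2023, 1, 1, 2, 30)))  # True
```

In Airflow itself, this check runs periodically in the scheduler loop, which is why a run may be failed slightly after the exact timeout boundary rather than at the instant it is crossed.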
Additional Timeout Options for Tasks
In addition to dagrun_timeout for the entire DAG, you can set a timeout for an individual task using the execution_timeout parameter:
```python
from datetime import timedelta

from airflow.operators.python import PythonOperator  # python_operator in Airflow 1.x

def my_task():
    # Task logic
    pass

# Define a task with an execution timeout of 30 minutes
task = PythonOperator(
    task_id='my_task',
    python_callable=my_task,
    execution_timeout=timedelta(minutes=30),
    dag=dag,
)
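When a task exceeds its execution_timeout, Airflow interrupts it inside the task process using a signal-based mechanism. The sketch below shows the general SIGALRM idea with the standard library only; it is not Airflow's actual implementation, the names `run_with_timeout` and `TaskTimeout` are made up for illustration, and it only works in the main thread on Unix-like systems:

```python
import signal
import time
from datetime import timedelta

class TaskTimeout(Exception):
    """Raised when the callable exceeds its allotted wall-clock time."""

def run_with_timeout(func, timeout: timedelta):
    """Run func, raising TaskTimeout if it exceeds the wall-clock limit."""
    def _handler(signum, frame):
        raise TaskTimeout(f"exceeded {timeout}")

    old_handler = signal.signal(signal.SIGALRM, _handler)
    signal.setitimer(signal.ITIMER_REAL, timeout.total_seconds())
    try:
        return func()
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel the timer
        signal.signal(signal.SIGALRM, old_handler)

# A fast task completes normally...
print(run_with_timeout(lambda: "done", timedelta(seconds=5)))  # done

# ...while a slow one is interrupted.
try:
    run_with_timeout(lambda: time.sleep(2), timedelta(milliseconds=100))
except TaskTimeout:
    print("task timed out")
```

Because the interruption is an exception raised inside the task, code in the task's `finally` blocks still gets a chance to clean up before the task is marked as failed.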
Summary
- Use dagrun_timeout to set a timeout on the entire DAG run.
- Use execution_timeout to set a timeout on an individual task within the DAG.