parallelism
This defines the maximum number of task instances that can run concurrently in Airflow regardless of scheduler count and worker count. Generally, this value is reflective of the number of task instances with the running state in the metadata database.
In short: the maximum number of task instances that can run concurrently across the whole Airflow installation, regardless of how many schedulers and workers there are.
dag_concurrency
Deprecated since version 2.2.0: The option has been moved to core.max_active_tasks_per_dag
This option is deprecated as of version 2.2.0 and has been replaced by max_active_tasks_per_dag.
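As a rough illustration of the rename, here is a minimal sketch that rewrites the deprecated setting when it is supplied through environment variables. It assumes the usual `AIRFLOW__{SECTION}__{KEY}` naming convention for environment overrides; `env_var_name`, `migrate_env`, and `DEPRECATED_CORE_OPTIONS` are hypothetical helpers written for this post, not part of Airflow.

```python
from typing import Dict

# Hypothetical mapping for this example: deprecated [core] option -> replacement.
DEPRECATED_CORE_OPTIONS = {"dag_concurrency": "max_active_tasks_per_dag"}


def env_var_name(section: str, key: str) -> str:
    """Build an environment variable name in the AIRFLOW__{SECTION}__{KEY} form."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


def migrate_env(env: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of env with deprecated [core] options renamed."""
    migrated = dict(env)
    for old_key, new_key in DEPRECATED_CORE_OPTIONS.items():
        old_var = env_var_name("core", old_key)
        new_var = env_var_name("core", new_key)
        if old_var in migrated:
            value = migrated.pop(old_var)
            # If the new-style variable is already set, it takes precedence.
            migrated.setdefault(new_var, value)
    return migrated


if __name__ == "__main__":
    # The value 8 is only an illustrative setting.
    print(migrate_env({"AIRFLOW__CORE__DAG_CONCURRENCY": "8"}))
```

The helper leaves an explicitly set new-style variable untouched, so running it twice is harmless.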
max_active_tasks_per_dag
The maximum number of task instances allowed to run concurrently in each DAG. To calculate the number of tasks that are running concurrently for a DAG, add up the number of running tasks for all DAG runs of the DAG. This is configurable at the DAG level with max_active_tasks, which defaults to max_active_tasks_per_dag. An example scenario when this would be useful is when you want to stop a new DAG with an early start date from stealing all the executor slots in a cluster.
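In short: the maximum number of task instances that can run concurrently within each DAG. The new name seems to have been chosen because it conveys the meaning more intuitively than dag_concurrency.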
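To make the interaction between parallelism and max_active_tasks_per_dag concrete, here is a simplified model of the arithmetic described above. It is only a sketch, not Airflow's scheduler logic: it ignores pools, max_active_runs, and executor-specific slot limits, and the function names and numbers are made up for illustration.

```python
from typing import Dict


def running_tasks_for_dag(running_per_dag_run: Dict[str, int]) -> int:
    """Add up the running task instances across all DAG runs of one DAG."""
    return sum(running_per_dag_run.values())


def startable_tasks(
    total_running: int,
    running_per_dag_run: Dict[str, int],
    parallelism: int,
    max_active_tasks: int,
) -> int:
    """Estimate how many more tasks of this DAG could start under both limits."""
    # Slots still free under the installation-wide cap.
    global_free = max(parallelism - total_running, 0)
    # Slots still free under the per-DAG cap.
    dag_free = max(max_active_tasks - running_tasks_for_dag(running_per_dag_run), 0)
    # The tighter of the two limits wins.
    return min(global_free, dag_free)


if __name__ == "__main__":
    # Illustrative values: 20 tasks running overall, 14 of them in this DAG.
    print(startable_tasks(
        total_running=20,
        running_per_dag_run={"run_1": 10, "run_2": 4},
        parallelism=32,
        max_active_tasks=16,
    ))
```

With these numbers the installation still has 12 free slots, but the DAG itself has only 2 left, so the per-DAG limit is the one that applies. This is the behavior that keeps a single backfilling DAG from taking every slot.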