Apache Airflow DAG Best Practices

v20260509
airflow-dag-patterns
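This skill provides a comprehensive guide to building production-grade Apache Airflow DAGs (directed acyclic graphs). It covers best practices for operators, sensors, testing, and deployment when designing robust data pipelines. Use it when you need to orchestrate complex business workflows, manage ETL processes, or ensure batch jobs run reliably in production.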
425 downloads
Overview

Apache Airflow DAG Patterns

Production-ready patterns for Apache Airflow including DAG design, operators, sensors, testing, and deployment strategies.

Use this skill when

  • Creating data pipeline orchestration with Airflow
  • Designing DAG structures and dependencies
  • Implementing custom operators and sensors
  • Testing Airflow DAGs locally
  • Setting up Airflow in production
  • Debugging failed DAG runs

Do not use this skill when

  • You only need a simple cron job or shell script
  • Airflow is not part of the tooling stack
  • The task is unrelated to workflow orchestration

Instructions

  1. Identify data sources, schedules, and dependencies.
  2. Design idempotent tasks with clear ownership and retries.
  3. Implement DAGs with observability and alerting hooks.
  4. Validate in staging and document operational runbooks.
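
Steps 2 and 3 above can be sketched as a shared set of task defaults plus a failure hook. This is a minimal illustration, not the playbook's own template: the owner name, logger name, and message format are assumptions, and the fake context in the usage section only mimics the dictionary Airflow passes to callbacks. The keys in `default_args` are standard operator arguments (`retries`, `retry_delay`, `retry_exponential_backoff`, `max_retry_delay`, `on_failure_callback`).

```python
import logging
from datetime import timedelta

# Hypothetical logger name; route it to whatever alerting channel is in use.
log = logging.getLogger("pipeline.alerts")


def format_failure(context):
    """Build a one-line alert from the context dict passed to a task callback."""
    ti = context["task_instance"]
    return (
        f"Task {ti.dag_id}.{ti.task_id} failed on try {ti.try_number} "
        f"(run {context.get('run_id')}): {context.get('exception')}"
    )


def notify_failure(context):
    """Failure hook: log the alert. Swap the log call for a pager or chat client."""
    message = format_failure(context)
    log.error(message)
    return message


# Defaults applied to every task in a DAG that receives them via default_args.
default_args = {
    "owner": "data-platform",              # assumption: owning team name
    "retries": 3,                          # retry transient failures
    "retry_delay": timedelta(minutes=5),   # wait before the first retry
    "retry_exponential_backoff": True,     # grow the wait on later retries
    "max_retry_delay": timedelta(minutes=30),
    "on_failure_callback": notify_failure, # called once a task has finally failed
}
```

In a DAG file, the dictionary would be passed as `DAG(..., default_args=default_args)`. Keeping the hook a plain function of the context makes it testable without a running scheduler.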

Refer to resources/implementation-playbook.md for detailed patterns, checklists, and templates.

Safety

  • Avoid changing production DAG schedules without approval.
  • Test backfills and retries carefully to prevent data duplication.
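
One common way to keep backfills and retries from duplicating data is to make each task overwrite the partition for its logical date instead of appending to it. The sketch below shows that delete-then-insert pattern with the standard-library `sqlite3` module. The table `daily_sales`, its columns, and the function `load_partition` are hypothetical names chosen for illustration; in a DAG, `ds` would typically come from Airflow's `ds` template variable (the logical date as `YYYY-MM-DD`).

```python
import sqlite3


def load_partition(conn, ds, rows):
    """Replace all rows for logical date `ds` in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("DELETE FROM daily_sales WHERE ds = ?", (ds,))
        conn.executemany(
            "INSERT INTO daily_sales (ds, sku, amount) VALUES (?, ?, ?)",
            [(ds, sku, amount) for sku, amount in rows],
        )


# In-memory database standing in for the real warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (ds TEXT, sku TEXT, amount REAL)")

rows = [("A", 10.0), ("B", 5.5)]
load_partition(conn, "2024-01-01", rows)
load_partition(conn, "2024-01-01", rows)  # simulated retry or backfill rerun

# The rerun replaced the partition rather than appending to it.
count = conn.execute("SELECT COUNT(*) FROM daily_sales").fetchone()[0]
```

Because the delete and the insert share one transaction, a failure partway through leaves the previous contents of the partition in place, so a retry starts from a consistent state.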

Resources

  • resources/implementation-playbook.md for detailed patterns, checklists, and templates.

Limitations

  • Use this skill only when the task clearly matches the scope described above.
  • Do not treat the output as a substitute for environment-specific validation, testing, or expert review.
  • Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.
Information
Category Data Science
Name airflow-dag-patterns
Version v20260509
Size 5.44KB
Updated 2026-05-10