Docker Execution Mode #

    This tutorial illustrates how to run the Cosmos dbt Docker operators and walks through the setup they require.

    Requirements #

  1. Docker with the Docker daemon running (Docker Desktop on macOS). Follow the Docker installation guide.

  2. Airflow

  3. The astronomer-cosmos package containing the dbt Docker operators

  4. A Postgres Docker container

  5. A Docker image built with the required dbt project and dbt DAG

  6. A dbt DAG that uses the dbt Docker operators, placed in the Airflow DAGs directory so it can run in Airflow (a minimal sketch follows this list)

    More information on how to set up requirements 2-6 is detailed below.
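
    As a reference for requirement 6, below is a minimal sketch of a DAG built on the Cosmos dbt Docker operators. The DAG id, image tag, in-image project path, and network mode are illustrative assumptions; the cosmos-example repo cloned later in this tutorial ships a complete jaffle_shop_docker DAG.

    from datetime import datetime

    from airflow import DAG
    from cosmos.operators.docker import (
        DbtRunDockerOperator,
        DbtSeedDockerOperator,
        DbtTestDockerOperator,
    )

    with DAG(
        dag_id="jaffle_shop_docker_sketch",  # hypothetical name; the example repo uses jaffle_shop_docker
        start_date=datetime(2024, 1, 1),
        schedule=None,
        catchup=False,
    ) as dag:
        # Each operator starts a container from the dbt image built below
        # and runs the corresponding dbt command inside it.
        seed = DbtSeedDockerOperator(
            task_id="seed",
            image="dbt-jaffle-shop:1.0.0",           # the image built in the steps below
            project_dir="/usr/app/dbt/jaffle_shop",  # assumed location of the dbt project inside the image
            network_mode="bridge",                   # assumed; the container must be able to reach Postgres
        )
        run = DbtRunDockerOperator(
            task_id="run",
            image="dbt-jaffle-shop:1.0.0",
            project_dir="/usr/app/dbt/jaffle_shop",
            network_mode="bridge",
        )
        test = DbtTestDockerOperator(
            task_id="test",
            image="dbt-jaffle-shop:1.0.0",
            project_dir="/usr/app/dbt/jaffle_shop",
            network_mode="bridge",
        )
        seed >> run >> test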

    Step-by-step instructions #

    Install Airflow and Cosmos

    Create a Python virtualenv, activate it, upgrade pip to the latest version, and install Apache Airflow® and astronomer-cosmos with the dbt-postgres extra:

    python -m venv venv
    source venv/bin/activate
    pip install --upgrade pip
    pip install apache-airflow
    pip install "astronomer-cosmos[dbt-postgres]"
    

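    To confirm the installation before continuing, you can check that both packages import cleanly; a quick sanity check, assuming only the packages installed above:

    # Sanity check: both packages import and report their versions.
    import airflow
    import cosmos

    print(airflow.__version__)
    print(cosmos.__version__)
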
    Set up the Postgres database

    You will need a running Postgres database to serve as the target database for the dbt project. Run the following command to start and expose one:

    docker run --name some-postgres -e POSTGRES_PASSWORD="<postgres_password>" -e POSTGRES_USER=postgres -e POSTGRES_DB=postgres -p 5432:5432 -d postgres
    
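
    Before moving on, you can verify the database is reachable from Python; a minimal check, assuming psycopg2-binary is available (it is pulled in by the dbt-postgres extra installed earlier) and using the password you chose above:

    # Connect to the Postgres container started above and print the server version.
    import psycopg2

    conn = psycopg2.connect(
        host="localhost",
        port=5432,
        user="postgres",
        password="<postgres_password>",  # the password passed to docker run
        dbname="postgres",
    )
    print(conn.server_version)  # an integer such as 160002 when the connection works
    conn.close()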

    Build the dbt Docker image

    For the Docker operators to work, you need to build a Docker image that will be supplied as the image parameter to the dbt Docker operators used in the DAG.

    Clone the cosmos-example repo

    git clone https://github.com/astronomer/cosmos-example.git
    cd cosmos-example
    

    Build a Docker image containing the dbt project files and the dbt profile by using the provided Dockerfile; this image will be supplied to the Docker operators.

    docker build -t dbt-jaffle-shop:1.0.0 -f Dockerfile.postgres_profile_docker_k8s .
    

    If you are running on an Apple silicon (M1) machine and the docker build command fails, you may need to set the following environment variables:

    export DOCKER_BUILDKIT=0
    export COMPOSE_DOCKER_CLI_BUILD=0
    export DOCKER_DEFAULT_PLATFORM=linux/amd64
    

    Read through the Dockerfile to understand what it does so that you can use it as a reference in your own project:

  • The dbt profile file is added to the image

  • The dags directory containing the dbt project jaffle_shop is added to the image

  • The dbt_project.yml is replaced with postgres_profile_dbt_project.yml, which contains the profile key pointing to postgres_profile, because profile creation is not currently handled for the Docker and Kubernetes operators the way it is in local mode.

    Set up and trigger the DAG with Airflow

    Copy the dags directory from the cosmos-example repo to your Airflow home:

    cp -r dags $AIRFLOW_HOME/
    

    Run Airflow

    airflow standalone
    

    You might need to run airflow standalone with sudo if your Airflow user is not able to access the Docker socket or pull the required images.

    Log in to Airflow through a web browser at http://localhost:8080/, using the user admin and the password stored in the standalone_admin_password.txt file.

    Enable and trigger a run of the jaffle_shop_docker DAG. If everything is set up correctly, the DAG run should complete successfully.

    Specifying ProfileConfig #

    Starting with Cosmos 1.8.0, you can use the profile_config argument in the dbt Docker operators of your DAG to reference profiles defined in a profiles.yml file for your dbt project. To do so, provide the file's path via the profiles_yml_path parameter in profile_config.

    Note that in ExecutionMode.DOCKER, profile_config is only compatible with the profiles_yml_path approach. The profile_mapping method will not work, because the Airflow connections it requires cannot be accessed from within the Docker container to map them to the dbt profile.
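
    For illustration, a minimal sketch of this approach is shown below. The profile name, target name, and the path to profiles.yml inside the image are assumptions that must match your own project and image layout.

    from cosmos import ProfileConfig
    from cosmos.operators.docker import DbtRunDockerOperator

    profile_config = ProfileConfig(
        profile_name="postgres_profile",  # must match the profile key in dbt_project.yml
        target_name="dev",                # assumed target name in profiles.yml
        profiles_yml_path="/usr/app/dbt/profiles.yml",  # assumed path to the profiles.yml baked into the image
    )

    run = DbtRunDockerOperator(
        task_id="run",
        image="dbt-jaffle-shop:1.0.0",
        project_dir="/usr/app/dbt/jaffle_shop",
        profile_config=profile_config,
    )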