How to automate dbt with GitHub Actions
A step-by-step guide to setting up GitHub Actions to automate dbt models and dbt tests, and to deploy dbt docs on GitHub Pages
The Mid Engineer here. I mostly write about data engineering, but I also cover other engineering topics from time to time.
Here is a place where I share my knowledge and learnings.
In this post, we will automate the dbt model build and testing process with GitHub Actions, and use the GitHub Pages action to host the dbt documentation site on GitHub.
The architecture/workflow will look something like this:
What is GitHub Actions?
According to the official GitHub documentation, GitHub Actions is a continuous integration and continuous delivery (CI/CD) platform that allows you to automate your build, test, and deployment pipeline. You can create workflows that build and test every pull request to your repository, or deploy merged pull requests to production.
Let’s jump right in!
Set up dbt with a GitHub workflow
Assume our repo looks something like this: a Snowflake dev warehouse with bronze, silver and gold schemas, each with its own dbt project folder.
├── README.md
├── bronze_schema
│   ├── analyses
│   ├── dbt_packages
│   ├── dbt_project.yml
│   ├── logs
│   ├── macros
│   ├── models
│   ├── packages.yml
│   ├── seeds
│   ├── snapshots
│   ├── target
│   └── tests
├── silver_schema
│   ├── analyses
│   ├── dbt_packages
│   ├── dbt_project.yml
│   ├── logs
│   ├── macros
│   ├── models
│   ├── packages.yml
│   ├── seeds
│   ├── snapshots
│   ├── target
│   └── tests
└── gold_schema
    ├── analyses
    ├── dbt_packages
    ├── dbt_project.yml
    ├── logs
    ├── macros
    ├── models
    ├── packages.yml
    ├── seeds
    ├── snapshots
    ├── target
    └── tests
Job Automation
Set up the workflow yml file
In your GitHub dbt project repo, create a new file at .github/workflows/job-automation.yml.
Alternatively, you can click Actions > New workflow in the GitHub UI and create the workflow yml there.
In job-automation.yml, we configure the workflow to run either on demand via a manual trigger or daily at 7 am.
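As a rough sketch of what such a workflow could contain (not necessarily the exact file used here), the example below combines a manual `workflow_dispatch` trigger with a daily `schedule`. Note that GitHub evaluates cron schedules in UTC, so "7 am" here means 07:00 UTC. The workflow name, the job name, the Python version, and every secret name are assumptions for illustration; the three project folders come from the repo layout above.

```yaml
# .github/workflows/job-automation.yml
name: dbt job automation   # hypothetical name

on:
  workflow_dispatch:        # manual, on-demand trigger from the Actions tab
  schedule:
    - cron: "0 7 * * *"     # every day at 07:00 UTC

jobs:
  dbt-build:                # hypothetical job name
    runs-on: ubuntu-latest
    env:
      # Hypothetical secret names; they must match what profiles.yml reads
      SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
      SNOWFLAKE_USER: ${{ secrets.SNOWFLAKE_USER }}
      SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
      SNOWFLAKE_ROLE: ${{ secrets.SNOWFLAKE_ROLE }}
      SNOWFLAKE_WAREHOUSE: ${{ secrets.SNOWFLAKE_WAREHOUSE }}
      SNOWFLAKE_DATABASE: ${{ secrets.SNOWFLAKE_DATABASE }}
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install dbt
        run: pip install dbt-snowflake

      - name: Build and test each project in order
        run: |
          # The default bash shell exits on the first failing command,
          # so silver and gold are skipped if bronze fails.
          for project in bronze_schema silver_schema gold_schema; do
            dbt deps  --project-dir "$project" --profiles-dir .
            dbt build --project-dir "$project" --profiles-dir .
          done
```

`dbt build` runs models, tests, seeds and snapshots together, which is why there is no separate `dbt test` step. A plain loop is used instead of a job matrix because matrix jobs run in parallel, and the layers presumably need to run bronze first, gold last.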
Set up secrets/env vars for the job
Navigate to Settings > Security > Secrets and variables > Actions, and add all the secrets and/or environment variables the job needs.
In our case, we will be setting repository secrets.
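To illustrate how these values reach dbt, here is a small job fragment, offered as a sketch rather than the exact configuration used here. Encrypted secrets are read through the `secrets` context and non-sensitive configuration variables through the `vars` context; both can be mapped to environment variables at job or step level. The names `DBT_TARGET` and `SNOWFLAKE_PASSWORD` are hypothetical.

```yaml
# Fragment of a job definition, not a complete workflow
jobs:
  dbt-build:
    runs-on: ubuntu-latest
    env:
      # Non-sensitive configuration variable (hypothetical name)
      DBT_TARGET: ${{ vars.DBT_TARGET }}
    steps:
      - name: Run dbt with the password scoped to this step only
        env:
          # Encrypted repository secret (hypothetical name)
          SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
        run: dbt build --project-dir bronze_schema --profiles-dir . --target "$DBT_TARGET"
```

Scoping a secret to the one step that needs it keeps it out of the environment of every other step in the job.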
Set up profiles.yml
At the root of the dbt GitHub repo, create a profiles.yml.
It will look something like this:
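A minimal sketch of such a file for Snowflake, assuming the credentials arrive as environment variables set by the workflow. The profile name `dbt_snowflake`, the target name, the schema, and all environment variable names are placeholders; the profile name has to match the `profile:` entry in each project's dbt_project.yml.

```yaml
# profiles.yml (repo root)
# "dbt_snowflake" is a placeholder profile name
dbt_snowflake:
  target: dev
  outputs:
    dev:
      type: snowflake
      # env_var() reads the values exported by the workflow's `env:` block
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
      role: "{{ env_var('SNOWFLAKE_ROLE') }}"
      warehouse: "{{ env_var('SNOWFLAKE_WAREHOUSE') }}"
      database: "{{ env_var('SNOWFLAKE_DATABASE') }}"
      schema: bronze   # placeholder default schema
      threads: 4
```

Because nothing sensitive is hard-coded, this file can be committed to the repo, and dbt is pointed at it with `--profiles-dir .`.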
With the above setup, we are now able to automate our dbt models.
As an extra, we can also configure the workflow so that if any models fail, it sends a notification to a channel such as email, Slack, Telegram or Webex.
The example below shows a task that sends a notification message to a Webex space when a dbt model fails during the build process.
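One way such a step could look, as a sketch under assumptions: it is appended to the end of the build job's `steps:` list, it posts to the Webex messages REST endpoint with `curl`, and it uses a bot token and room ID stored as the hypothetical secrets `WEBEX_BOT_TOKEN` and `WEBEX_ROOM_ID`.

```yaml
      # Append to the `steps:` list of the dbt build job
      - name: Notify Webex space on failure
        if: failure()   # runs only when an earlier step in this job failed
        env:
          WEBEX_BOT_TOKEN: ${{ secrets.WEBEX_BOT_TOKEN }}   # hypothetical secret
          WEBEX_ROOM_ID: ${{ secrets.WEBEX_ROOM_ID }}       # hypothetical secret
        run: |
          # GITHUB_* variables are set by the runner; the link points to this run
          curl --fail --silent --show-error \
            -X POST https://webexapis.com/v1/messages \
            -H "Authorization: Bearer ${WEBEX_BOT_TOKEN}" \
            -H "Content-Type: application/json" \
            -d "{\"roomId\": \"${WEBEX_ROOM_ID}\", \"markdown\": \"dbt build failed: ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}/actions/runs/${GITHUB_RUN_ID}\"}"
```

The `if: failure()` condition keeps the step silent on successful runs, so the space only receives messages that need attention.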
Hosting Documentation Page on GitHub
Next, let's build a workflow that publishes a GitHub Pages site to host the dbt documentation for the bronze, silver and gold schemas.
First, we have to configure the Pages settings of the repo under Settings > Pages.
After that, in the .github/workflows directory, we can create a new yml workflow file that looks like below:
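The sketch below shows one possible shape for this workflow, not necessarily the one used here. It assumes the Pages source is set to "GitHub Actions", that the dbt-docs.py script described next writes the consolidated site into a hypothetical `public/` directory, and that the same hypothetical Snowflake secrets are available (`dbt docs generate` queries the warehouse to build the catalog). The workflow file name is also an assumption.

```yaml
# .github/workflows/dbt-docs.yml (hypothetical file name)
name: dbt docs

on:
  push:
    branches: [main]   # rebuild the docs whenever code lands on main

permissions:
  contents: read
  pages: write         # required to deploy to GitHub Pages
  id-token: write      # required by actions/deploy-pages

jobs:
  build:
    runs-on: ubuntu-latest
    env:
      # Hypothetical secret names; they must match what profiles.yml reads
      SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
      SNOWFLAKE_USER: ${{ secrets.SNOWFLAKE_USER }}
      SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
      SNOWFLAKE_ROLE: ${{ secrets.SNOWFLAKE_ROLE }}
      SNOWFLAKE_WAREHOUSE: ${{ secrets.SNOWFLAKE_WAREHOUSE }}
      SNOWFLAKE_DATABASE: ${{ secrets.SNOWFLAKE_DATABASE }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Install dbt
        run: pip install dbt-snowflake
      - name: Generate docs for each project
        run: |
          for project in bronze_schema silver_schema gold_schema; do
            dbt deps          --project-dir "$project" --profiles-dir .
            dbt docs generate --project-dir "$project" --profiles-dir .
          done
      - name: Consolidate the generated docs
        run: python dbt-docs.py   # assumed to write the site into ./public
      - uses: actions/upload-pages-artifact@v3
        with:
          path: public            # hypothetical output directory

  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - id: deployment
        uses: actions/deploy-pages@v4
```

If the Pages source is instead set to deploy from a branch, the upload and deploy steps would need to be replaced by an action that pushes the built site to that branch.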
Then, in the root directory `./`, we need a Python file called dbt-docs.py.
What this script essentially does is merge the front-end manifests of each schema into one consolidated file so that the page can be rendered properly.
And that's it. Now, whenever we merge code into the main branch, the dbt documentation page will rebuild and deploy automatically.
Some final thoughts...
This is just one use case; there's so much more GitHub Actions can do, such as running code quality and security checks on every push to a branch, or automating deployments when branches are merged.
That's it for now.
Thank you for reading, and have a nice day!
This blog post was originally published on my blog.