How I Set Up Database Migrations for my Serverless Flask App Deployed with Zappa
Data Engineer @ Meta
Let's keep this short and sweet. This blog post assumes you are familiar with:
- Flask
- Flask-Migrate (Alembic revisions for Flask apps)
- Zappa (serverless deployment on AWS)
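Since you can't open a shell on Lambda to run `flask db upgrade`, one common approach is to expose a small function that calls Flask-Migrate's `upgrade()` and invoke it remotely through Zappa. The sketch below is illustrative only; it assumes an application factory named `create_app` in `app.py` with Flask-Migrate already initialized, so adapt the names to your project.

```python
# migrations_runner.py - hypothetical module name
from flask_migrate import upgrade

from app import create_app  # assumes your app factory lives in app.py


def run_migrations(event=None, context=None):
    """Apply any pending Alembic revisions from inside the deployed Lambda."""
    app = create_app()
    with app.app_context():
        upgrade()  # same effect as running `flask db upgrade` locally
    return "migrations applied"
```

Once deployed, the function can be triggered remotely with something like `zappa invoke <stage> migrations_runner.run_migrations` (the module and stage names here are placeholders).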
Recently, I've been focused on developing a Chrome extension, Skater. This extension started off as a hackathon project among friends, resulting in a scrappy, messy codebase written in vanilla JS. While a lot of fun to develop at the time, revisiting it and making changes without a testing framework in place has been a headache. I've decided to revisit the extension and implement a testing suite with Jest. You can follow along with those updates on the jest-implement branch. As an occasional Windows Subsystem for Linux (WSL) user, I also wanted to get Node and npm set up properly, so that I can switch between Mac and PC at will.
dbt (data build tool) is used to configure, structure, and visualize database objects including tables, views, and processes (the T part of ELT). dbt lends itself to an event-driven architecture where we have some number of raw tables that are routinely populated in the database. Any tables downstream of these raw tables are materialized with dbt. An important distinction: dbt will not help you get raw data into your database, but it will help you model and track the flow of that data downstream.
In data warehousing, we often encounter repetitive processes that can benefit from templating. This is a simple example of creating a COPY INTO statement using some JSON.
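A minimal sketch of the idea, assuming Jinja2 for rendering; the table, stage, and file-format names below are hypothetical:

```python
import json

from jinja2 import Template  # assumes Jinja2 is installed

# Hypothetical JSON describing one load; in practice this might come from a config file.
config = json.loads(
    '{"table": "analytics.raw_events", "stage": "@events_stage", "file_format": "json_format"}'
)

# Snowflake-style COPY INTO statement expressed as a template.
copy_into = Template(
    "COPY INTO {{ table }}\n"
    "FROM {{ stage }}\n"
    "FILE_FORMAT = (FORMAT_NAME = '{{ file_format }}');"
)

print(copy_into.render(**config))
```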
In data engineering, idempotency is the idea that a single ETL job or process will produce the same end result regardless of how many times you re-run it. That means if you have a DAG that runs on 6/15/2020 and you clear and re-run that DAG 1000 times, your data warehouse will still hold the exact same data, with no duplicates. This concept is extremely important and will save you time in the long run.
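As a concrete illustration, an idempotent daily load typically deletes whatever a previous run wrote for the execution date before inserting. The table names below are hypothetical, and sqlite3 stands in for your warehouse connection:

```python
import sqlite3  # stand-in for your warehouse connection
from datetime import date


def load_daily_sales(conn: sqlite3.Connection, run_date: date) -> None:
    """Rebuild one day's aggregate so re-running the same date never duplicates rows."""
    day = run_date.isoformat()
    with conn:
        # Remove anything a previous run may have written for this execution date...
        conn.execute("DELETE FROM sales_daily WHERE sale_date = ?", (day,))
        # ...then rebuild that date's slice from the raw table.
        conn.execute(
            "INSERT INTO sales_daily (sale_date, total) "
            "SELECT sale_date, SUM(amount) FROM sales_raw "
            "WHERE sale_date = ? GROUP BY sale_date",
            (day,),
        )
```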
How can we quickly partition data by date in something like an S3 bucket?
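One approach is to write objects under a Hive-style year=/month=/day= prefix. A rough sketch with boto3, where the bucket name and key prefix are hypothetical and AWS credentials are assumed to be configured:

```python
import json
from datetime import date

import boto3  # assumes AWS credentials are configured


def write_daily_partition(records: list, bucket: str, run_date: date) -> str:
    """Write one day's records under a Hive-style date partition prefix."""
    key = (
        f"events/year={run_date.year}/month={run_date.month:02d}/"
        f"day={run_date.day:02d}/events.json"
    )
    body = "\n".join(json.dumps(r) for r in records)
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
    return key


# Example: write_daily_partition([{"id": 1}], "my-data-lake", date(2020, 6, 15))
# writes to s3://my-data-lake/events/year=2020/month=06/day=15/events.json
```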
Approaching Airflow can seem a little daunting. There are a few hundred Medium articles out there telling you how to set up Airflow, write a DAG, and test a "Hello, World!" ETL. Usually there is a lot to set up before any development, experimentation, or learning can take place. While the articles are helpful, they are often just way too long.
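For reference, the "Hello, World!" DAG those articles build up to can be this small (a sketch assuming Airflow 2.x):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def say_hello():
    print("Hello, World!")


# A minimal DAG: one task, runs daily, no backfill of past dates.
with DAG(
    dag_id="hello_world",
    start_date=datetime(2020, 6, 15),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="say_hello", python_callable=say_hello)
```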
First post.
This is a test post for this GitHub blog.