Node.js and npm setup on WSL

Recently, I’ve been focused on developing a Chrome extension, Skater. This extension started off as a hackathon project among friends, resulting in a scrappy, messy codebase written in vanilla JavaScript. While it was a lot of fun to develop at the time, revisiting it and making changes without a testing framework in place has been a headache. I’ve decided to revisit the extension and implement a testing suite with Jest. You can follow along with those updates on the jest-implement branch. As an occasional Windows Subsystem for Linux (WSL) user, I wanted to get Node.js and npm set up properly so that I can switch between Mac and PC at will.

Let's Talk dbt

dbt (data build tool) is used to configure, structure, and visualize database objects including tables, views, and processes (the T part of ELT). dbt lends itself to an event-driven architecture where some number of raw tables are routinely populated in the database. Any tables downstream of these raw tables are materialized with dbt. An important distinction here is that dbt will not help you get raw data into your database, but it will help you model and track the flow of the data downstream.

Templated SQL with Jinja

In data warehousing, we often encounter repetitive processes that can benefit from templating. This is a simple example of creating a COPY INTO statement using some JSON.
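
As a rough sketch of the idea, the snippet below renders a COPY INTO statement from a small JSON config using the jinja2 package. The table, stage, and file format names are made up for illustration, and the statement shape assumes a Snowflake-style COPY INTO; adjust it to whatever your warehouse expects.

```python
import json

from jinja2 import Template

# Hypothetical config: every name here is illustrative, not from a real project.
CONFIG_JSON = """
{
  "table": "raw.events",
  "stage": "@raw_stage/events/",
  "file_format": "json_format",
  "pattern": ".*[.]json"
}
"""

# Assumed Snowflake-style statement; the PATTERN clause is only emitted
# when the config provides a "pattern" key.
COPY_TEMPLATE = Template(
    "COPY INTO {{ table }}\n"
    "FROM {{ stage }}\n"
    "FILE_FORMAT = (FORMAT_NAME = '{{ file_format }}')"
    "{% if pattern %}\nPATTERN = '{{ pattern }}'{% endif %};"
)


def render_copy_into(config_json: str) -> str:
    """Parse the JSON config and fill in the COPY INTO template."""
    config = json.loads(config_json)
    return COPY_TEMPLATE.render(**config)


if __name__ == "__main__":
    print(render_copy_into(CONFIG_JSON))
```

Keeping the per-table details in JSON means one template can serve many tables, and optional clauses stay out of the SQL unless the config asks for them.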

Idempotency - Why it matters

In data engineering (DE), idempotency is the idea that a single ETL job or process will produce the same end result regardless of how many times you re-run it. That means that if you have a DAG that runs for 6/15/2020, and you clear and re-run that DAG 1,000 times, your data warehouse will still hold the exact same data, with no duplicates. This concept is extremely important: it lets you retry failed runs and backfill dates without cleaning up afterwards, which saves time in the long run.
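
One common way to get this property is to have each run delete and reload its own date partition inside a single transaction. The sketch below illustrates that pattern with an in-memory SQLite database standing in for the warehouse; the `daily_sales` table and `load_partition` function are hypothetical names chosen for the example.

```python
import sqlite3


def load_partition(conn: sqlite3.Connection, run_date: str, amounts: list) -> None:
    """Replace every row for run_date in one transaction.

    Deleting before inserting means a re-run overwrites the partition
    instead of appending duplicates to it.
    """
    with conn:  # commits on success, rolls back if anything raises
        conn.execute("DELETE FROM daily_sales WHERE sale_date = ?", (run_date,))
        conn.executemany(
            "INSERT INTO daily_sales (sale_date, amount) VALUES (?, ?)",
            [(run_date, amount) for amount in amounts],
        )


# In-memory database used as a stand-in for a real warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (sale_date TEXT, amount REAL)")

# Re-run the same job many times for the same date.
for _ in range(1000):
    load_partition(conn, "2020-06-15", [10.0, 25.5, 7.25])

row_count = conn.execute("SELECT COUNT(*) FROM daily_sales").fetchone()[0]
print(row_count)  # 3: the partition holds one copy of the data
```

A plain INSERT without the DELETE would leave 3,000 rows after the same loop, which is exactly the duplication an idempotent job avoids.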

Let's Talk Airflow Setup

Approaching Airflow can seem a little daunting. There are a few hundred Medium articles out there telling you how to set up Airflow, write a DAG, and test a “Hello, World!” ETL. Usually there is a lot to set up before any development, experimentation, or learning can take place. While the articles are helpful, they are often just way too long.
