Apache Airflow DAG is Failing Silently

python

So your Apache Airflow DAG is failing silently. Are you running an ETL on a huge dataset? This is a symptom of an Airflow instance without sufficient memory. Dig into your instance’s logs and you’ll probably see an evicted worker if you’re running your instance’s workers on Kubernetes. You’ll see similar logs wherever you run …

Unable to connect to the server: getting credentials: exec: executable aws-iam-authenticator not found

bash

Setting up Kubernetes on AWS and got the "unable to connect to the server: getting credentials: exec: executable aws-iam-authenticator not found" error? Execute the following to solve your problem! Not going to bore you with fluff this time around. I am assuming that you have Homebrew installed, which might be a leap. Also I’m on …

Functional Annotations in Python 3.x

python

Have you used Functional Annotations in Python 3.x? Maybe you’ve heard them mentioned? Regardless, let’s explore what they are and how they help us. Because if they don’t help us, then we probably shouldn’t care. The Problem: You’re programming and don’t know what thisRandomFunction should return. Maybe it’s a bool, maybe it’s a string, who …

How To Get Started With Apache Airflow?

Post about Apache Airflow Or Data Engineering

When Airbnb was scaling rapidly, they faced the problem of organizing complex data pipelines. To combat this and become a data-driven organization, Airbnb launched Apache Airflow in 2015, their custom-made open-source platform to manage complex workflows. In simple words, Apache Airflow is a platform where you can create, schedule, and monitor complex workflows using simple …

What git branch am I in?

What git branch am I in? It’s an age-old question I ask myself maybe once an hour. I’ll make a big change, decide that I need to save the repo before I break something, make a commit (oftentimes with commitizen), and then git push origin … . But where was I pushing …

Upload a Pandas DataFrame to DynamoDB using Python

python

So you’re trying to upload a Pandas DataFrame to DynamoDB using Python? Let’s take a step back first. Why are we using DynamoDB? What is DynamoDB? DynamoDB is a non-relational, fully managed database product offered by Amazon’s cloud computing arm, AWS. So why would you go the DynamoDB route vs MySQL, Postgres, …