BigQuery Slot Contention – Slow BigQuery Jobs

BigQuery

Are you managing a BigQuery environment at scale and dealing with slow job execution? Does your company have hundreds or thousands of jobs running simultaneously and sometimes see queries slow to a crawl? It happens; these are growing pains. We’ll go over a couple of methods for bringing your query speeds back up which might eventually … Read more

How to Make Your SQL Queries Blazing Fast on BigQuery

BigQuery

This post covers how to make your SQL queries blazing fast. BigQuery is an incredible tool for wrangling massive datasets and running SQL queries at scale. But let’s be honest: just because it can handle huge queries doesn’t mean you should throw inefficiency at it. The faster your queries run, … Read more

Partition Existence In BigQuery – Cheapest & Fastest Method

BigQuery

We’ll show you the cheapest and fastest way to check whether a partition exists in BigQuery. Use BigQuery’s INFORMATION_SCHEMA to find the most recent partition ID. So instead of this: Do this: Need to translate the partition_id back into a date? Do this: Depending on your table size, this will save you both a ton of money … Read more
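
The excerpt's code is cut off, so here is a minimal sketch of the general idea, not the post's own queries. It assumes a day-partitioned table, whose partition IDs look like `20240115`, and reads the `INFORMATION_SCHEMA.PARTITIONS` view so that only metadata is queried. The helper names `latest_partition_query` and `partition_id_to_date`, and the dataset and table names in the usage note, are made up for illustration.

```python
from datetime import date, datetime


def latest_partition_query(dataset: str, table: str) -> str:
    """Build a metadata-only query for the newest partition ID.

    Assumption: `dataset` and `table` are trusted identifiers, since
    they are interpolated directly into the SQL text.
    """
    return (
        "SELECT MAX(partition_id) AS latest_partition_id\n"
        f"FROM `{dataset}.INFORMATION_SCHEMA.PARTITIONS`\n"
        f"WHERE table_name = '{table}'\n"
        # Skip the special partitions that hold NULL or unpartitioned rows.
        "  AND partition_id NOT IN ('__NULL__', '__UNPARTITIONED__')"
    )


def partition_id_to_date(partition_id: str) -> date:
    """Turn a daily partition ID such as '20240115' into a date.

    Assumption: daily partitioning; hourly or monthly partition IDs
    have a different length and would need a different format string.
    """
    return datetime.strptime(partition_id, "%Y%m%d").date()
```

For example, `latest_partition_query("my_dataset", "events")` returns SQL you could hand to whatever BigQuery client you use, and `partition_id_to_date` converts the ID that comes back. Inside BigQuery itself, `PARSE_DATE('%Y%m%d', partition_id)` does the same conversion.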

Pull A Domain From A Full Website Path In BigQuery

BigQuery

This post will show you how to pull a domain from a full website path in BigQuery. Let’s set the stage with a hypothetical. You own a URL shortener company. You want to partner with a website for whatever reason. You decide that you want to do analysis over the data you’ve streamed or … Read more

BigQuery’s Having Clause

BigQuery

In what situation would you want to use BigQuery’s HAVING clause outside of an interview? We’ll go over a couple of use cases and how I use it as a Data Engineer for Reddit. My Setup What Is BigQuery? BigQuery is a data warehouse as a service. Google handles your compute, your storage, and does … Read more
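
For a quick feel of what HAVING does, here is a small sketch that runs locally against an in-memory SQLite database as a stand-in for BigQuery. The `posts` table, its columns, and the `authors_with_min_posts` helper are invented for this illustration and are not taken from the post; the point is only that HAVING filters on an aggregate after GROUP BY, which WHERE cannot do.

```python
import sqlite3


def authors_with_min_posts(rows, min_posts):
    """Return (author, post_count) pairs for authors with at least `min_posts` posts.

    `rows` is an iterable of (author, title) tuples. SQLite is used only
    so the example runs anywhere without a BigQuery project.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE posts (author TEXT, title TEXT)")
        conn.executemany("INSERT INTO posts VALUES (?, ?)", rows)
        cursor = conn.execute(
            """
            SELECT author, COUNT(*) AS post_count
            FROM posts
            GROUP BY author
            HAVING COUNT(*) >= ?   -- filter on the aggregate, after grouping
            ORDER BY author
            """,
            (min_posts,),
        )
        return cursor.fetchall()
    finally:
        conn.close()
```

The same SELECT, GROUP BY, and HAVING shape carries over to BigQuery; only the client code and the parameter syntax differ.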