Modular SQL Queries with Unit Tests
I'm sure I'm not the only person on earth facing a complex and expensive analytical processing task. The one I've been working on for the past couple of years, runs on the GHTorrent 98.5 GB data set of GitHub process data. It comprises 99 SQL queries (2599 lines of SQL code in total) and takes more than 20 hours to run on a hefty server. To make the job's parts run efficiently and reliably I implemented simple-rolap, a bare-bones relational online analytical processing tool suite. To ensure the queries produce correct results, I wrote RDBUnit, a unit testing framework for relational database queries. Here is a quick overview on how to use the two.Continue reading "Modular SQL Queries with Unit Tests"