Ian Mulvany

November 8, 2022

fast tools for data processing

I hardly cut any code any more, but that doesn't stop me being interested in tooling. My language of choice for many years has been Python, and Simon Willison's post at https://til.simonwillison.net/duckdb/parquet is a nice short overview of how to use DuckDB (https://duckdb.org) to query Parquet files blazingly fast.

Simon manages to sum over 17M rows in just 206 ms.
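
To give a feel for what that kind of query looks like from Python, here is a minimal sketch. It is not the code from Simon's post: the helper names `sum_query` and `timed_sum`, and the path and column in the usage note, are made up for illustration. It assumes the third-party `duckdb` package is installed (`pip install duckdb`).

```python
import time


def sum_query(parquet_glob: str, column: str) -> str:
    """Build a SQL statement that sums one column across a set of Parquet files."""
    # Double up quote characters so the path and column name embed safely in the SQL.
    path = parquet_glob.replace("'", "''")
    col = column.replace('"', '""')
    return f"SELECT SUM(\"{col}\") AS total, COUNT(*) AS n FROM read_parquet('{path}')"


def timed_sum(parquet_glob: str, column: str):
    """Run the sum with DuckDB and report the total, row count, and elapsed milliseconds."""
    import duckdb  # third-party dependency, imported here so sum_query works without it

    con = duckdb.connect()  # in-memory database; the Parquet files are read in place
    start = time.perf_counter()
    total, n = con.execute(sum_query(parquet_glob, column)).fetchone()
    elapsed_ms = (time.perf_counter() - start) * 1000
    return total, n, elapsed_ms
```

With a folder of Parquet files to hand, something like `timed_sum("data/*.parquet", "size")` (a hypothetical path and column) would return the total, the number of rows scanned, and how long the query took. Timings will of course depend on the machine and the data.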

It is slightly mind-blowing that tools like this, and data at this scale, are increasingly openly available and free to use.