Ian Mulvany

November 8, 2022

fast tools for data processing

I hardly cut any code any more, but that doesn't stop me being interested in tooling. My language of choice for many years has been Python, and Simon Willison's post at https://til.simonwillison.net/duckdb/parquet is a nice short overview of how to use DuckDB (https://duckdb.org) to query Parquet files blazingly fast.

Simon manages to sum over 17M rows in just 206 ms.
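
For anyone wanting to try something similar from Python, here is a minimal sketch of that kind of query. It is not the code from Simon's post: the helper names `build_sum_query` and `timed_sum`, the file pattern, and the column name are all hypothetical, and it assumes the `duckdb` package is installed (`pip install duckdb`).

```python
import time


def build_sum_query(parquet_glob: str, column: str) -> str:
    """Return SQL that sums one column across the Parquet files matching a glob."""
    # Double up quote characters so the (hypothetical) path and column name
    # remain a valid SQL string literal and a valid quoted identifier.
    path_literal = parquet_glob.replace("'", "''")
    column_ident = column.replace('"', '""')
    return f"SELECT sum(\"{column_ident}\") FROM read_parquet('{path_literal}')"


def timed_sum(parquet_glob: str, column: str):
    """Run the sum with DuckDB and return (total, elapsed_seconds)."""
    import duckdb  # third-party dependency, imported here so the helper above stays stdlib-only

    con = duckdb.connect()  # in-memory database; the Parquet files are read in place
    start = time.perf_counter()
    total = con.execute(build_sum_query(parquet_glob, column)).fetchone()[0]
    return total, time.perf_counter() - start
```

Calling something like `timed_sum("data/*.parquet", "amount")` should return the total together with the wall-clock time taken. How close that gets to Simon's timing will depend on the data, the machine, and the DuckDB version.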

That tools like this, and data at this scale, are increasingly openly available, and free to use, is slightly mind-blowing. 

About Ian Mulvany

Hi, I'm Ian. I work on academic publishing systems. You can find out more about me at mulvany.net. I'm always interested in engaging with folk on these topics. If you have made your way here, don't hesitate to reach out if there is anything you want to share, discuss, or ask for help with!