Bit of an exploration of LLM deployment at companies in 2026.
xevix.hashnode.dev/agentic-abus...
Bit of an exploration of LLM deployment at companies in 2026.
xevix.hashnode.dev/agentic-abus...
We released DuckDB v1.5!
This release comes with a βfriendly CLIβ client, a new (opt-in) PEG parser, support for VARIANT types and many lakehouse features. It also ships a new network stack, a reworked geospatial extension, Azure writes and an ODBC scanner.
Read more at duckdb.org/2026/03/09/a...
Good non-technical summary of some of the various chips out there. www.youtube.com/watch?v=RBmO...
Yeah, that or the opposite where it's biased toward a tool because of training data. I think you can suggest a tool in the global CLAUDEmd file but would be more seamless from cowork plugin.
Claude cowork w/ Opus 4.6 is definitely smart, but got stuck on a data task, I stopped it, pointed it to DuckDB, done instantly. LLMs still have much to learn π€
Great overview of chip manufacturing with EUV. We really do have supercomputers in our pockets, on our desk and on our wrists. www.youtube.com/watch?v=B248...
ποΈ The slide decks and talk recordings of last Friday's developer meeting are out! duckdb.org/events/2026/...
Great talks at South Bay Systems hosted at databricks on xNVMe, fast SSD query processing, and using NPUs for DB work. Much work using DuckDB extensions. Need for async I/O as bottleneck a common topic, mainly at larger scale. luma.com/8a54z94d?tk=...
GPU-powered analytical query engines going mainstream? Needs nVidia GPU and limited to memory for now, but neat use of Substrait+Arrow for interop. DuckDB still easier to run anywhere, but this is useful for acceleration if needed.
developer.nvidia.com/blog/nvidia-...
How are the compile times?
The PyData Amsterdam 2025 keynote βMinus Three Tier: Data Architecture Turned Upside Downβ by @hannes.muehleisen.org is out now.
www.youtube.com/watch?v=DxwD...
SQL Arena Planner Ranking (November 2025)
New database leaderboard from Yellowbrick ranks the quality of DBMS optimizer estimates and plans. They only evaluate TPC-H for now and report results for Postgres + DuckDB + MSSQL: sql-arena.com/components/p...
Repo: github.com/sql-arena/db...
LinkedIn Group: www.linkedin.com/groups/15775...
Today's Future Data Systems Seminar Speaker: Ian Cook (@ian.columnar.tech) will present @columnar.tech's work on Apache Arrow's database connectivity API (ADBC). ADBC is available in modern DBMSs. Zoom talk open to public at 4:30pm ET. YouTube video available after: db.cs.cmu.edu/events/futur...
Today's Future Data Systems Seminar Speaker: Will Manning (@willmanning.com) will present @spiraldb.com's Vortex file format. Vortex is now a @linuxfoundation.org project. Zoom talk open to public at 4:30pm ET. YouTube video available after: db.cs.cmu.edu/events/futur...
Processing 100Tb of CSV files on a single machine is insane, little over 1hr per query, even if on a powerful AWS instance. Question heavily the need for complex systems when this is whatβs possible now. Canβt wait for full write-up. Incredible work.
duckdb.org/2025/10/09/b...
Itβs interesting the tradeoffs if the main goal is no operating cost and decent startup time. Definitely painful to develop on regularly but for a one and done this makes a lot of sense. I wonder if Rust compile times will come down further one day.
Taking the DuckDb hoodie on a trip. Not exactly Amsterdam but Iβve heard they like columnar databases here too.
I didnβt quite make it in time for Hive filtering lazy list to speed up filtering Hive folder with many partitions, but will pick up again before next release w/ luck πββοΈ github.com/duckdb/duckd...
Congrats to DuckDB team on LTS release w/ many great improvements! Hidden among them you can now use Hive filtering with read_blob, and SHOW TABLES FROM specific db w/o USE.
π DuckDB 1.4.0 is out! This is our first LTS release which comes with *one year of community support*. It also supports database encryption, the MERGE SQL statement and Iceberg writes.
For more details, read the announcement blog post at
duckdb.org/2025/09/16/a...
I tried loading eBird data (1.5B rows CSV ZIP) using DuckDB for fun, inspired by a Clickhouse blog post and a bit of curiosity. Both did well, DuckDB slightly faster querying and Parquet ingest, Clickhouse w/ native zip support, optimized for ingest and multitenancy. xevix.medium.com/ebird-in-duc...
Thumbnail: Saving Private Hash Join
Vol:18 No:8 β Saving Private Hash Join
π₯ Authors: Laurens Kuiper, Paul Gross, Peter Boncz, Hannes MΓΌhleisen
π PDF: https://www.vldb.org/pvldb/vol18/p2748-kuiper.pdf
Is there too much duplicated effort in data tools? I sometimes wonder about this.
xevix.medium.com/data-tool-co...
Yeah, I donβt think MS is interested in 3rd party devs so much.
Unfortunately yes, I was already going to get one for something else and this put me over the edge. Maybe I'll also build a gaming rig one day in the distant future haha.
Compiling DuckDB on Windows 11 (ARM) using UTM VM on macOS to debug Windows compile issues. It's a shame msvc doesn't exist outside of Windows, mingw/clang don't work the same and cross-compiling is tricky. Compiling takes 5-10 mins (instead of 1-2 mins native), but it works π!
Just the 1 day of data above is ~125GiB compressed, ~585GiB uncompressed. One month is about 3.75TiB compressed, or 17.5TiB. It makes sense this dataset is so popular for testing and analysis, wow.
Stretching DuckDB w/ Common Crawl, ~1.7B rows, ~300 parquet files. ~2-3s for single-column aggregations, ~2-3 mins to SUMMARIZE the data, peaking at ~12-14GB memory usage. Not exactly real-time, but the fact you can do this on a laptop with no server setups or Spark pipelines is still amazing.
Uploaded a simplified query. Had to delete and repost since no edit button on the snippets site, sorry for the spam haha.
Haha already posted in case someone benefits there too.