[CIDR '25] Adaptive Factorization Using Linear-Chained Hash Tables
vldb.org/cidrdb/pape...
Adaptive execution + factorization + WCOJ = great paper.
The best intro to factorized databases I know of is www.youtube.com/watc....
usefulfictions.substack.com/p/burnout-is... has had me thinking a lot about what is "rewarding" work and trying to separate "I should be doing this to achieve things I want to have done" vs "I enjoy this"
And scour.ing/@linearizabl... now works for the interests
[VLDB '25] MD-MVCC: Multi-version Concurrency Control for Schema Changes in Azure SQL Database
www.vldb.org/pvldb/v...
A great discussion of the end-to-end impact of allowing multiple versions of schema metadata information to be live concurrently, in a real, production system.
I think scour.ing/@linearizabl... or scour.ing/feed/https:%... should work for the likes
@emschwartz.me has been super responsive to feedback about improving the signal to noise ratio too! :) There's already been a couple great rounds of adjustments, and I look forward to the continued refinement
https://scour.ing/ has gotten pretty good at surfacing what new stuff I actually want to read on the internet, better than following subreddits. You can see my feed of mostly database things at scour.ing/@linearizable. It surfaces small personal blogs particularly well.
The recording finally went great this time. I also demo'd doing a backup audio recording so that we can more reliably get a good recording to post, so hopefully the trend will continue
Sitting down with a coding agent and Kuzu/ladybugdb to understand how factorized representations work at the code level across query processing is worth the time and effort
"The Manga Guide to Databases" fits the criteria for sure
(Which I think I've seen you show that you already have a copy of.)
ladybugdb.com is the fork & continue project, but I think there's a slightly different roadmap to be more object storage integrated
Turns out the rumor of this being an Apple acquisition was actually true: appleinsider.com/articles/26/...
Does anyone know of a good webapp or discord bot or something to help manage a reading group? Something that keeps a list of suggested things to read, can do voting on the next thing to read, and maybe has a bit of curation support for when the to-read list gets unmanageable?
One of our recordings had spotty audio because the presenter would step to the side while talking to gesture at slides. Would that also then mean they'd step out of line for a shotgun mic? I have no idea how precisely directional those actually are.
I'm willing to spend the money, I just know nothing about audio equipment. If you have some audio gadget friend to give a trusted answer for "what type of microphone and which product should I go buy for this?" that'd be great. I think I can borrow a cheap lapel mic as a test to see if that's good.
A 2026 hope of mine is to get our own recording setup figured out so that we can more reliably get recordings up. We've been about 50/50, and I feel bad for the speakers when they come give a great talk, but then the recording doesn't work out for whatever reason. (Like the Morel and QOaaS talks 😢)
Our next event will be on January 21st, featuring speakers from (the just-finishing) CIDR! Come to Databricks to hear about:
* DuckDB on xNVMe by @pinartozun.bsky.social of ITU
* Spilling in QP by Maximilian Kuschewski of TUM
* NPUs in DBs by Alexander Baumstark of TU-Ilmenau
luma.com/8a54z94d
"Diva: Making MVCC Systems HTAP-Friendly" dl.acm.org/doi/pdf/10.1... also feels underappreciated, as they literally did an implementation in *both* mysql and postgres.
Seoul National University's DBX Lab has been looking into this area overall for a little while dbx.snu.ac.kr/publications
I've seen a lot of MySQL and Postgres storage discussion as MVCC Wars: VACUUM vs Undo Log. I'd love to see an implementation of a Time Split B-Tree (dl.acm.org/doi/10.1145/...). It's a simple, yet very different design point. You gain the MVCC scan benefits of a CoW BTree, but can be multi-writer.
"The Case for 2-Tree for Skewed Datasets" www.cidrdb.org/cidr2023/pap... is a really fun read paired with Bf-tree, as the two papers try to solve the same high-level problem of not keeping cold data in cache, but with two very different approaches.
And, if you're interested in other reading...
(The uptick occurred after bsky.app/profile/benj... )
I've recently seen multiple, unrelated instances of people referencing Bf-trees. Good job, @benjdd.com.
Do you have to tell it in the prompt anything about "please look up any key function's documentation" or something to get the tools to be used, or do you generally see it making reasonable decisions already?
It's a notable bit slower, but gemini has a surprisingly generous free tier for its CLI, and I'd rather have slower and correct than the loops of incorrect fixes I'd be sent on before.
Maybe there's some "fetch rust docs" tool that'd be even more helpful that I don't know about?
Asking a coding agent to run `cargo build` and read referenced source files for context has made LLMs significantly more helpful and accurate at actually understanding why a compilation error is happening and being able to explain an appropriate fix. Much better than copy-pasting into online LLMs.
Looks like you manifested a paper
arxiv.org/abs/2512.12957
gist.github.com/thisismiller...
After ~2015, the focus seemed to shift to looking at stats on SSD failures from large deployments, but that's no longer asking "does this SSD work right?" but rather "how long until it dies?", and so I don't get why the latter replaced the former.
I had once started compiling SSD powerfault testing papers, and found that academia testing SSDs stopped ~2015. 😱
If you still have any notes of all the sources you found and looked at, I'd greatly appreciate a copy to update the posts with anything I've missed!