Another one for English monarchs. rupertlinacre.com/english_mona...
I have been enjoying playing Sporcle's countries of the world quiz with my son, so I made a fancier version. Play here: rupertlinacre.com/country_quiz/
We're particularly proud of how easy this is to use. The following 2 minute video demos the end-to-end process of matching 100k council tax records.
This process is more fully documented here:
moj-analytical-services.github.io/uk_address_m...
Docs: moj-analytical-services.github.io/uk_address_m...
Code: github.com/moj-analytic...
Discussion forum: github.com/moj-analytic...
The full end-to-end process from raw OS data to 100k matched addresses can be completed in less than a minute if matching to a small geographic area such as a local authority, and in about 11 minutes for the whole UK (including one-time setup). Each additional 100k records takes less than a minute.
In addition, we have published reproducible accuracy benchmarks using publicly available labelled datasets. This allows it to be compared head-to-head with other approaches.
Key features:
- Python only. Set up in seconds, runs on a laptop. No separate infrastructure or services needed.
- Fast. Match 100,000 addresses in ~30 seconds.
- We provide an automated build pipeline for users wishing to match to Ordnance Survey data.
We are pleased to release `uk_address_matcher`, a free Python package for address matching and geocoding, developed by Tom Hepworth and me.
The package has several aims: simplicity, speed and accuracy.
📣 NEW! I've just released the BIGGEST and perhaps most creative project I've ever worked on!
"Searching for Birds" searchingforbirds.visualcinnamon.com
A project, an article, an exploration that dives into the data that connects humans with birds, by looking at how we search for birds.
In case you missed it: New blog: Respectful use of AI in software development teams
www.robinlinacre.com/respectful_u...
LLMs are increasingly able to write production quality code. But what cognitive work can be delegated to LLMs without damaging the health of the team?
I'm a FOSS dev. My blog is here: www.robinlinacre.com.
For anyone with FOMO wondering whether to pay for Opus 4.5/Claude Code, my experience is that OpenAI Codex is very similar in performance. i.e. both are excellent, but Claude Code is not a magic unlock
Excited to be a part of the initiative to "Move Fast and Fix Things", announced in Chief Secretary to the Prime Minister's speech today. One measure is an expansion of the No10 Innovation Fellowship, for which we've launched a new website!
fellows.ai.gov.uk
Speech: www.gov.uk/government/s...
I've wondered about this too. Feels like it'd be well suited to a Kaggle type problem where you're just after the most accurate predictive model. Feels like Claude should be able to chug away trying lots of different types of approaches, though prob need to be a bit careful about reward hacking
New blog: Respectful use of AI in software development teams
robinlinacre.com/respectful_u...
LLMs are increasingly able to write production quality code. But what cognitive work can be delegated to LLMs without damaging the health of the team?
Blows my mind how long this would have taken 5 years ago. This is where the edtech revolution is IMO, we just need experts in pedagogy to learn how to vibe code. My guess is in <3 years we'll have systems that can almost oneshot entire apps like this, inc all assets
Made a simple game to help my daughter learn her alphabet. Took about 1 day for full code, >200 images >200 voice files and music
nano-banana-pro is incredible. But it's also so quick to vibe code image and audio processing scripts.
robinlinacre.com/bee_letters/
github.com/RobinL/bee_l...
I struggle to find no-nonsense, free and 'fun'(ish) maths games for my son (7yo) so I have been making a few
Here's another one: Maths vs monsters. This is his fav so far
rupertlinacre.com/maths_vs_mon...
Code:
github.com/rupertLinacr...
Other games/maths utilities:
rupertlinacre.com
OpenUK Awards 25 Open Data Category, sponsored by Open Data Institute: the shortlist is live. Congratulations to the shortlisted nominees: Ministry of Justice UK Splink Team (@robinlinacre.bsky.social), OpenActive, and UK Power Networks (Yiu-Shing Pang)
#openukawards #opensource #opendata
Screenshot of sample of Islington's Council Tax address data, visualised in Google Earth
More progress on #openaddresses:
Islington Council in London has released its Council Tax address list for re-use as #opendata under the Open Government Licence www.owenboswarva.com/blog/post-ad...
I've made a geocoded version by adding coordinates from ONS
#FOI #localgov #UKhousing #proptech
No worries - thanks for the report on the repo, we'll take a look
(Incidentally, uk_address_matcher should work ok for non-UK addresses, that's just not our focus. See examples here for how to use the package github.com/moj-analytic...)
Did you try github.com/moj-analytic...?
The trie is WIP, but the idea is that it will be used as an initial step to skim off the easy ones. The remainder will go through to the main matching phase which already exists in uk_address_matcher, but is more computationally intensive
New ✨interactive✨ explainer: Address matching using a fault tolerant trie:
robinlinacre.com/fault_tolera...
Which illustrates a powerful technique for address matching that we're currently working on building into uk_address_matcher (github.com/moj-analytic...)
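To give a feel for the idea, here is a minimal sketch of a token trie that tolerates extra tokens in a messy address. It is not the implementation from the explainer or from uk_address_matcher: the `TokenTrie` class, the `max_skips` parameter, the reversed token order and the sample addresses are all assumptions made for illustration.

```python
def tokenise(address):
    """Upper-case the address and split it into tokens on whitespace and commas."""
    return address.upper().replace(",", " ").split()


class TokenTrie:
    """Hypothetical trie keyed on address tokens, stored last token first."""

    def __init__(self):
        self.root = {}

    def insert(self, address, record_id):
        node = self.root
        # Walk from the end of the address (postcode) towards the start.
        for token in reversed(tokenise(address)):
            node = node.setdefault(token, {})
        # The key None marks "a canonical address ends here".
        node.setdefault(None, set()).add(record_id)

    def lookup(self, address, max_skips=1):
        """Return ids of canonical addresses matching the query,
        ignoring up to max_skips unexpected tokens in the query."""
        tokens = list(reversed(tokenise(address)))
        results = set()

        def walk(node, i, skips_left):
            if i == len(tokens):
                results.update(node.get(None, set()))
                return
            child = node.get(tokens[i])
            if child is not None:
                walk(child, i + 1, skips_left)
            if skips_left > 0:
                # Fault tolerance: treat this query token as noise.
                walk(node, i + 1, skips_left - 1)

        walk(self.root, 0, max_skips)
        return results


trie = TokenTrie()
trie.insert("10 Downing Street, London, SW1A 2AA", 1)
trie.insert("11 Downing Street, London, SW1A 2AA", 2)

messy = "10 Downing Street, Westminster, London, SW1A 2AA"
easy_match = trie.lookup(messy)
no_match = trie.lookup("12 Downing Street, London, SW1A 2AA")
```

In this sketch, a query that resolves to exactly one id would count as an "easy" match; anything else would be passed on to a more expensive matching phase.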
You select the columns you want, and it handles the joins for you.
It's just a rough sketch for now. I feel like it must have been done before, but couldn't find anything. Feedback welcome!
When working with a complex postgres schema, I find it time consuming to figure out the joins.
I had an idea: a 'join generator' that traverses the relationship graph for you, and writes the joins.
You give it a dump of the postgres schema, and it gives you a UI.
www.robinlinacre.com/vite_live_pg...
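A rough sketch of how such a join generator might work, assuming the schema dump has already been reduced to a list of foreign keys. This is not the code behind the linked tool: the `FOREIGN_KEYS` table, the function names and the breadth-first search strategy are all illustrative assumptions.

```python
from collections import deque

# Hypothetical foreign keys: (child_table, child_col, parent_table, parent_col)
FOREIGN_KEYS = [
    ("orders", "customer_id", "customers", "id"),
    ("order_items", "order_id", "orders", "id"),
    ("order_items", "product_id", "products", "id"),
]


def build_graph(foreign_keys):
    """Undirected relationship graph: table -> [(neighbour, join condition)]."""
    graph = {}
    for child, child_col, parent, parent_col in foreign_keys:
        cond = f"{child}.{child_col} = {parent}.{parent_col}"
        graph.setdefault(child, []).append((parent, cond))
        graph.setdefault(parent, []).append((child, cond))
    return graph


def join_path(graph, start, target):
    """Breadth-first search for the shortest chain of joins from start to target."""
    came_from = {start: None}
    queue = deque([start])
    while queue:
        table = queue.popleft()
        if table == target:
            break
        for neighbour, cond in graph.get(table, []):
            if neighbour not in came_from:
                came_from[neighbour] = (table, cond)
                queue.append(neighbour)
    if target not in came_from:
        raise ValueError(f"no relationship path from {start} to {target}")
    path = []
    node = target
    while came_from[node] is not None:
        previous, cond = came_from[node]
        path.append((node, cond))
        node = previous
    return list(reversed(path))


def generate_sql(foreign_keys, columns):
    """columns is a list of 'table.column' strings; the first table is the base."""
    graph = build_graph(foreign_keys)
    tables = []
    for column in columns:
        table = column.split(".")[0]
        if table not in tables:
            tables.append(table)
    base = tables[0]
    joined = {base}
    lines = [f"SELECT {', '.join(columns)}", f"FROM {base}"]
    for target in tables[1:]:
        for table, cond in join_path(graph, base, target):
            if table not in joined:
                joined.add(table)
                lines.append(f"JOIN {table} ON {cond}")
    return "\n".join(lines)


sql = generate_sql(FOREIGN_KEYS, ["customers.name", "products.title"])
print(sql)
```

Shortest-path search is only one possible design choice: when two tables are linked by more than one route, a real tool would probably need to let the user pick which relationship they mean.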
We're working on a DuckDB community extension called `splink_udfs` to add some record linkage related functions to DuckDB. It's currently very much WIP, but you can already use it wherever you're using DuckDB.
github.com/moj-analytic...
If you're using Splink with DuckDB you should see significant speed improvements by updating to DuckDB 1.3.x. You can also add more granularity to your comparison levels statements without an impact on run times. Depending on your model spec, it could be twice as fast or better.
Then give output to VS Code copilot in agent mode to implement