Finishing this 2019 book by Melanie Mitchell. It is mind-boggling how much has changed in 7 or 8 years. “Human level language processing remains a distant goal.” We’re not there yet, but we’ve moved a lot faster than she predicted.
4 days ago Idlout endorsed Lewis. Today she crossed to the Liberals. I don’t understand politicians.
I've added in the Princely States, so I think the knowledge graph is now complete. The visualization only shows 25 of the more significant Princely States. You can find the code to create your own version of the database on my GitHub (linked in the thread).
Listening to Youth: Historicising & Challenging Parental Rights Discourse activehistory.ca/blog/2026/03...
English 18th century handwriting is handled very well. Other teams are working on classical Chinese with very promising results. Early modern print is very good while 19th century is excellent. We’re a long way from 2015.
My student @historyjacob.bsky.social has shown the new OCR tools work very well with early modern English print text. We used to struggle with the long s and many other issues. This opens up NER on sources that used to be too messy. He also developed a fine-tuned NER model for Atlantic commodities.
How are historians rethinking environmental and social history via improved OCR of imperial archives?
Join us this Wednesday 3pm UK time to hear from @jimclifford.bsky.social and @historyjacob.bsky.social - registration link below.
Saskatchewan is on daylight savings year round. It is great. More evening light year round and no clock changes.
I'm currently missing the Princely States for the British Empire. I'm also missing most of the French, Dutch, Spanish, Portuguese, Japanese and other empires in the modern period. I'd be happy to collaborate on building this out to include other empires.
I've been working on a foundation knowledge graph of the British Empire off and on for a few months. I think I have a relatively complete and clean draft done and visualized. I'm also sharing the Cypher so you can take and build on this work.
jburnford.github.io/text_as_data...
This is true, if “the good life” means “drinking heavily and networking with other sons of the gentry, followed perhaps by an easy job in the Church,” and “the humanities” mean “Latin and theology.”
The idea that America has lost its post-1945 wars by being too ‘soft’ is deeply embedded in a certain constituency of US minds.
During Vietnam the US dropped an equivalent of 100 Hiroshima bombs on SE Asia.
Softness wasn’t the issue.
Maybe this speaks to my social and social media networks, but the NDP leadership race isn’t breaking through. I’ve had zero conversations about the race or who should win. I left the party years ago, but I’m still a little surprised how little I’m hearing compared to the last leadership race.
Nailed it
all the way down
Join the Lancaster-Manchester Environmental #DH Seminar on March 11 @ 3pm UK (online) for a talk by @jimclifford.bsky.social & @historyjacob.bsky.social:
"Solving OCR: Using olmOCR to Follow Commodities across the British World"
www.eventbrite.co.uk/e/solving-oc...
#dhist #ocr #envhist 🗃️
(a) a foundational layer showing the changing geography of colonial power; (b) mapping career progression between colonies; (c) a future system to compare how colonial officials write about places and events in contrast to other voices in the archives; (d) who knows…
I'm working on a text mining/knowledge graph project using the Indian Office Lists and Colonial Office Lists as the source material. I've extracted the biographies of Indian Office officials from the 1937 list into a knowledge graph visualized here:
jburnford.github.io/indianoffice...
My biannual reminder, if we don’t agree to peer review requests, the whole system collapses.
And faculty at research universities with reasonable teaching loads need to carry most of the weight. If you or your students want to publish, you need to say yes, even when you are busy.
This feels like a pitch for a Dorchester Review article.
Both are possible and there is value in testing what it can find. But you need to triangulate the answers, follow the footnotes, test the arguments and be really careful.
This problem also shows up when the answer isn’t known. This is where academics and students need to be very careful. Gemini will write a full essay to answer your question whether it finds new evidence pointing to the truth or comes up empty and instead pieces together a compelling but wrong answer.
This is an important measure of LLMs. The underlying engineering makes them want to answer all questions. But what if the question is nonsense? The examples are extreme, but you could see subtle misunderstandings in questions steering a discussion off the rails.
github.com/petergpt/bul...
New working paper out: "Open Maps: New Research Directions and Workflows for Digitized Historical Cartographic Material" led by Vincent Baptist & Jules Schoonman (TU Delft) #dh #maps
openmapsmeeting.nl/publications...
Making progress in my efforts to parse all of the articles from the first 8 editions of the Encyclopaedia Britannica.
120,039,792 words; 114,359 articles; 133 volumes.
jburnford.github.io/early_encycl...
There are big problems with all of these companies, but OpenAI is competing hard to rank as the worst AI lab.
www.nytimes.com/2026/02/27/t...
I remember when Canada couldn’t afford to act because of China…