Poster advertising lectures on "Raisonnement Philologique et Modèles Informatiques" starting at 4pm, Thursday, March 12, at 54 Boulevard Raspail, Paris.
Paris friends! Amis parisiens ! This Thursday is the first of four public lectures I'm giving on AI and philology, broadly defined: "Philological Reasoning and Computational Models." The advertisement is in French, but the lectures are in English. I'd also love to meet while I'm here in March! 1/
09.03.2026 08:34
👍 21
🔁 15
💬 1
📌 1
@transkribus.bsky.social Is there a public API to access our documents and document exports, to streamline their publication? I cannot find any information on such an option that is less than 2 or 3 years old.
06.03.2026 14:30
👍 0
🔁 0
💬 0
📌 0
Excellent venue for computational humanities work, colocated with ACL in San Diego on July 6. Please share!
04.03.2026 20:13
👍 15
🔁 8
💬 0
📌 0
Original post on fedihum.org
We’re looking for a Research Data Engineer (m/f/x) (3.5 years). If you have a #DigitalHumanities profile with experience in #TEI encoding, #OCR / #HTR, and #IIIF (or any of those and are willing to learn the rest), get in touch! The full-time position can be split, so if (for whatever reason) […]
03.03.2026 16:18
👍 2
🔁 19
💬 1
📌 0
We have had this experience as well; a paper with details about a specific domain is coming later this week...
23.02.2026 14:09
👍 3
🔁 1
💬 2
📌 0
Hold my non-alcoholic beverage 😁
It's not on my to-do list, but if a team wants to look at it with us...
21.02.2026 17:11
👍 3
🔁 0
💬 0
📌 1
The regional council building, in a region that has leaned firmly to the right for years. Not surprising, is it?
Disgusting, but not surprising.
21.02.2026 14:36
👍 1
🔁 0
💬 0
📌 0
Extrême droite à Lyon — Wikipédia
There is a Wikipedia page on the far right in Lyon. That's where we are. fr.wikipedia.org/wiki/Extr%C3...
21.02.2026 14:13
👍 0
🔁 0
💬 1
📌 0
21.02.2026 14:12
👍 0
🔁 0
💬 0
📌 0
That's the Rhône-Alpes region for you...
21.02.2026 14:08
👍 0
🔁 0
💬 1
📌 0
Setting aside the original issue, on which we agree 😉, you are right. This particular manuscript and another have no open data in this paper apparently...
19.02.2026 18:43
👍 1
🔁 0
💬 0
📌 0
Or an error in the paper's presentation (not unheard of in the humanities), where they lead us to expect consecutive lines by design and these are not. I would expect this rather than bad curation, given that they actually spent time looking at errors.
19.02.2026 15:40
👍 0
🔁 0
💬 1
📌 0
Thank you for the catch ! Should have been more careful here...
19.02.2026 07:11
👍 0
🔁 0
💬 1
📌 0
Manuscrits de la Médiathèque du Grand Troyes. Manuscrits issus de la bibliothèque de Clairvaux. Ms. 1600
You are completely right. I was so baffled by the bad interpretation of dya9 that I just did not check the rest...
The original paper by Aguilar showed stitched lines most of the time, and I trusted it...
Paper(P.17): hal.science/hal-04716654/
Manuscript: gallica.bnf.fr/ark:/12148/b...
19.02.2026 06:41
👍 2
🔁 0
💬 1
📌 0
And for those who want the answer:
❌ dyaconus -> diabolus
❌ in futurum -> infusum
18.02.2026 09:18
👍 1
🔁 0
💬 1
📌 0
There are plans :)
17.02.2026 19:36
👍 0
🔁 0
💬 0
📌 0
We also show that we are far from done, especially for a complicated language like Old French.
But we
(1) define the issue,
(2) propose a first solution that enables pre-annotation of larger datasets, and
(3) offer an alternative to less trustworthy models that go beyond ATR.
17.02.2026 18:11
👍 3
🔁 1
💬 0
📌 0
👉 We propose Pre-Editorial Normalization (PEN):
An intermediate layer between:
📝 graphemic ATR output
📖 fully edited text
Goal: preserve palaeographic fidelity + enable usability.
Keep two layers, ATR output and normalization, with aligned tokens to trace back to the source.
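The two-layer idea can be pictured with a minimal Python sketch. It is an illustration only, not the paper's data model: the `AlignedToken` class, the `layers` and `changed` helpers, the one-to-one alignment, and the sample tokens are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedToken:
    """One token kept in both layers (hypothetical structure)."""
    graphemic: str   # token as it appears in the ATR output
    normalized: str  # pre-editorial normalization of the same token


def layers(tokens):
    """Rebuild both text layers from the aligned tokens."""
    graphemic = " ".join(t.graphemic for t in tokens)
    normalized = " ".join(t.normalized for t in tokens)
    return graphemic, normalized


def changed(tokens):
    """Return (index, graphemic, normalized) for tokens the normalization altered."""
    return [
        (i, t.graphemic, t.normalized)
        for i, t in enumerate(tokens)
        if t.graphemic != t.normalized
    ]


# Made-up sample with simplified abbreviated forms, not taken from the paper.
tokens = [
    AlignedToken("sps", "spiritus"),
    AlignedToken("dni", "domini"),
    AlignedToken("super", "super"),
    AlignedToken("me", "me"),
]

graphemic_text, normalized_text = layers(tokens)
for index, source, target in changed(tokens):
    print(f"token {index}: {source!r} normalized to {target!r}")
```

Because each normalized token keeps a pointer to its graphemic counterpart, a reader who doubts a normalized word can look up what the ATR layer actually contained. A real alignment would likely need to handle one-to-many and many-to-one cases, which this sketch ignores.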
17.02.2026 18:11
👍 2
🔁 1
💬 1
📌 0
Recent ATR progress—especially with palaeographic datasets like CATMuS—has improved access to medieval sources.
But:
❌ Raw outputs are hard to use
❌ Fully normalized models over-normalize & hallucinate
There’s a methodological gap.
17.02.2026 18:11
👍 2
🔁 1
💬 1
📌 0
If I give you the text
📚 omnium peccatorum quia ex quo dyaconus quando esset in futurum, stultus esset
Can you find the ATR error without the manuscript?
Probably not.
ATR models that predict text and normalize in one go generate text that looks trustworthy, but they prevent readers from detecting issues.
17.02.2026 18:11
👍 1
🔁 1
💬 2
📌 0
📄 New paper:
Pre-Editorial Normalization for Automatically Transcribed Medieval Manuscripts in Old French and Latin
Thibault Clérice, @rachelbawden.bsky.social , Anthony Glaise, Ariane Pinche, @dasmiq.bsky.social (2026) arxiv.org/abs/2602.13905
We introduce Pre-Editorial Normalization (PEN).
🧵⬇️
17.02.2026 18:11
👍 23
🔁 9
💬 1
📌 2
Specifications
The Distributed Text Services (DTS) Specification defines a Hypermedia-Driven Web API for working with collections of text as machine-actionable data.
We are pleased to announce the official release of Distributed Text Services (DTS) v1.0 — a stable specification, ready for broad adoption.
This release is the result of years of collaborative development and community feedback.
The specification is available at: dtsapi.org/specificatio...
16.02.2026 13:08
👍 15
🔁 11
💬 1
📌 0
Thanks
16.02.2026 10:34
👍 0
🔁 0
💬 1
📌 0
Oh ! Sorry, I meant @danielvanstrien.bsky.social and did not see I clicked on the wrong username...
16.02.2026 09:57
👍 0
🔁 0
💬 1
📌 0