Arij Riabi (@arijriabi)

Thrilled to release Gaperon, an open LLM suite for French, English and Coding 🧀

We trained 3 models - 1.5B, 8B, 24B - from scratch on 2-4T tokens of custom data

(TLDR: we cheat and get good scores)

@wissamantoun.bsky.social @rachelbawden.bsky.social @bensagot.bsky.social @zehavoc.bsky.social

07.11.2025 21:11 👍 35 🔁 18 💬 1 📌 4

Can We Fix Social Media? Testing Prosocial Interventions using Generative Social Simulation

Social media platforms have been widely linked to societal harms, including rising polarization and the erosion of constructive debate. Can these problems be mitigated through prosocial interventions?...

We built the simplest possible social media platform. No algorithms. No ads. Just LLM agents posting and following.

It still became a polarization machine.

Then we tried six interventions to fix social media.

The results were… not what we expected.

arxiv.org/abs/2508.03385

06.08.2025 08:24 👍 301 🔁 106 💬 14 📌 45

I am stuck at just hot summer haha

20.06.2025 16:42 👍 2 🔁 0 💬 1 📌 0

ModernBERT or DeBERTaV3?

What's driving performance: architecture or data?

To find out we pretrained ModernBERT on the same dataset as CamemBERTaV2 (a DeBERTaV3 model) to isolate architecture effects.

Here are our findings:

14.04.2025 15:41 👍 44 🔁 15 💬 3 📌 0

PhD defence of Arij Riabi, 18 March 2025

Congratulations to @arijriabi.bsky.social who successfully defended her PhD “Small is Beautiful: Addressing Resource Scarcity, Language Variation, & Transfer Challenges for Automatic Detection of Harmful Language” last Tuesday, supervised by @zehavoc.bsky.social & @openlaurent.bsky.social 👩‍🎓🎉

25.03.2025 10:46 👍 21 🔁 3 💬 0 📌 0

Haha no, still didn't get my yoyo (yet)

20.03.2025 09:20 👍 2 🔁 0 💬 0 📌 0

Hahahah yes I arrived at 1 am they were all half asleep but we still celebrated.

20.03.2025 09:14 👍 1 🔁 0 💬 1 📌 0

ALT: a man wearing a tie and a blue shirt is screaming in a kitchen

20.03.2025 09:03 👍 0 🔁 0 💬 1 📌 0

A special thank you to my colleagues at ALMAnaCh @inriaparisnlp.bsky.social and everyone who has been part of this journey.

#PhD #NLP #research

20.03.2025 08:44 👍 4 🔁 0 💬 1 📌 0

I am deeply grateful to my supervisors, @zehavoc.bsky.social and @openlaurent.bsky.social , as well as my committee members, Elena Cabrio, Sara Tonelli, Benjamin Piwowarski and @marinecarpuat.bsky.social for their valuable feedback and support.

20.03.2025 08:44 👍 3 🔁 0 💬 1 📌 0

I am excited to share that I have successfully defended my PhD, "Addressing Resource Scarcity, Language Variation, and Transfer Challenges for Automatic Detection of Harmful Language." 🎉
👩‍🎓👩‍🎓🎉
@inriaparisnlp.bsky.social
@sorbonne-universite.fr

20.03.2025 08:44 👍 32 🔁 0 💬 4 📌 1

🎉 🌍✍️ I'm thrilled to announce that our paper, "Common Ground, Diverse Roots: The Difficulty of Classifying Common Examples in Spanish Varieties", co-authored with @arijriabi.bsky.social and @zehavoc.bsky.social, has been accepted for the #VarDial2025 workshop during #COLING2025! 🎉 1/5

27.12.2024 17:02 👍 6 🔁 2 💬 1 📌 0

most people want a quick and simple answer to why AI systems encode/exacerbate societal and historical bias/injustice and due to the reductive but common thinking of "bias in, bias out," the obvious culprit often is training data but this is not entirely true

1/

24.11.2024 16:26 👍 598 🔁 217 💬 26 📌 42

HTR-United

HTR-United is a catalog and an ecosystem for sharing and finding ground truth for optical character or handwritten text recognition (OCR/HTR).

Now that I am on Bluesky, let me take you again on a threaded tour of HTR-United (#HTR_United), a project founded and led by @ponteineptique.bsky.social and me since September 2021. Its main goal is to facilitate finding and sharing open datasets to train HTR and OCR models!

htr-united.github.io

30.10.2023 10:48 👍 4 🔁 5 💬 1 📌 0
