Nicolas Yax's Avatar

Nicolas Yax

@nicolasyax

PhD student working on the cognition of LLMs | HRL team - ENS Ulm | FLOWERS - Inria Bordeaux

69
Followers
83
Following
30
Posts
14.11.2024
Joined

Latest posts by Nicolas Yax @nicolasyax

Post image Post image Post image

New blog post: The Phylogenetics of Artifacts: inferring the evolution of cultural objects, artificial life forms, and language models.

From cat genetics to ancient myths to LLMs. 🧬 1/n

10.03.2026 12:28 👍 12 🔁 6 💬 3 📌 0
Preview
The Phylogenetics of Artifacts - A deep dive into the evolution of cultural objects, artificial life forms and language models Developmental Systems, a Blog of the Flowers Lab

Wow, some of my old #EvolutionOfLanguage #EoL pals may have just done something huge for #AILaw data attribution. #LLM #AIGovernance Specifically:
@pyoudeyer.bsky.social @nicolasyax.bsky.social @stepalminteri.bsky.social

Evolutionary biology can track LLM phylogeny!
developmentalsystems.org/phylolm

10.03.2026 11:01 👍 7 🔁 5 💬 1 📌 0

Excited to announce our symposium on how AI and humans shape each other:
"Humans and Artificial Minds: Mutual Influences"
9 Jan at ENS Paris.
Free registration here: www.eventbrite.com/e/humans-and...

06.01.2026 08:58 👍 3 🔁 2 💬 1 📌 0
Post image

Will the influx of synthetic data lead to uniform #ModelCollapse across the internet?
Our recent #EMNLP2025 (Oral) paper suggests a nuanced picture: different collapse dynamics might emerge in different internet domains based on the properties of human data in those domains! 🧵

18.12.2025 14:37 👍 1 🔁 1 💬 1 📌 0

Here is the link to (freely) sign up for our symposium!

Humans and Artificial Minds: Mutual influences, for better and for worse

www.eventbrite.com/e/humans-and...

15.12.2025 10:54 👍 6 🔁 3 💬 0 📌 0
Post image

Excited to announce our symposium on how AI and humans shape each other:
"Humans and Artificial Minds: Mutual Influences"
9 Jan at ENS Paris.
Talks by @smfleming.bsky.social, Valeria Giardino, Silvia Tulli, @thecharleywu.bsky.social, Laurence Devillers & @summerfieldlab.bsky.social.
Program ↓

10.12.2025 10:40 👍 21 🔁 8 💬 0 📌 3
GPT-style cartoon of a debate between a smiling Skinner and an angry Chomsky, while in the back a robot reads "Verbal Behavior"

I'm happy to share a short opinion piece I've just finished, where I revisit the famous Skinner vs. Chomsky exchange on how language is learned through the lens of today's large language models (before getting mad, read the rest) 1/n
osf.io/preprints/ps...

03.12.2025 10:21 👍 15 🔁 5 💬 1 📌 0
Post image

New (revised) preprint with @thecharleywu.bsky.social
We rethink how to assess machine consciousness: not by code or circuitry, but by behavioral inference, as in cognitive science.
Extraordinary claims still need extraordinary evidence.
👉 osf.io/preprints/ps...
#AI #Consciousness #LLM

08.10.2025 09:02 👍 16 🔁 4 💬 0 📌 1
Preview
Relative Value Encoding in Large Language Models: A Multi-Task, Multi-Model Investigation Abstract. In-context learning enables large language models (LLMs) to perform a variety of tasks, including solving reinforcement learning (RL) problems. Given their potential use as (autonomous) decis...

🧠 New paper in Open Mind!

We show that LLM-based reinforcement learning agents encode relative reward values like humans, even when this is suboptimal, and display a positivity bias.

Work led by William Hayes w/ @nicolasyax.bsky.social

doi.org/10.1162/opmi...

#AI #LLM #RL

26.05.2025 18:15 👍 10 🔁 3 💬 1 📌 0
Preview
Generating Computational Cognitive Models using Large Language Models Computational cognitive models, which formalize theories of cognition, enable researchers to quantify cognitive processes and arbitrate between competing theories by fitting models to behavioral data....

Preprint update, co-led with @akjagadish.bsky.social, together with @marvinmathony.bsky.social, Tobias Ludwig and @ericschulz.bsky.social!

26.05.2025 10:08 👍 16 🔁 7 💬 0 📌 0
Post image

Curious about LLM interpretability and understanding? We borrowed concepts from genetics to map language models, predict their capabilities, and even uncovered surprising insights about their training!

Come see my poster at #ICLR2025, 3pm, Hall 2B, poster #505!

26.04.2025 02:03 👍 5 🔁 0 💬 0 📌 0
Preview
Charting and Navigating Hugging Face's Model Atlas As there are now millions of publicly available neural networks, searching and analyzing large model repositories becomes increasingly important. Navigating so many models requires an atlas, but as mo...

If you are interested in this line of research on mapping LLMs, you might also want to check out the amazing work of Eliahu Horwitz arxiv.org/abs/2503.10633 and Momose Oyama arxiv.org/abs/2502.16173 10/10

24.04.2025 13:15 👍 1 🔁 0 💬 0 📌 0

In short, PhyloLM is a cheap and versatile algorithm that generates useful representations of LLMs with creative applications in practice. 9/10
paper: arxiv.org/abs/2404.04671
colab: colab.research.google.com/drive/1agNE5...
code: github.com/Nicolas-Yax/...
ICLR: Saturday 3pm Poster 505

24.04.2025 13:15 👍 2 🔁 0 💬 1 📌 0
Preview
PhyloLM - a Hugging Face Space by nyax This app allows you to explore and compare language models through various visualizations, including similarity matrices, 2D scatter plots, and tree diagrams. You can search for models by name, adj...

A collaborative PhyloLM Hugging Face space is available to try the algorithm and visualize maps: huggingface.co/spaces/nyax/... The Model Submit button has been temporarily suspended for technical reasons, but it should be back very soon! 8/10

24.04.2025 13:15 👍 2 🔁 0 💬 2 📌 0
Post image

By using code-related contexts, we obtain a fairly different map. For example, Qwen and GPT-3.5 have a very different way of coding compared to the other models, which was not visible on the reasoning map. 7/10

24.04.2025 13:15 👍 2 🔁 0 💬 1 📌 0
Post image

The choice of contexts is important, as it reflects different capabilities of LLMs. Here, on general reasoning contexts, we can plot a map of models using UMAP. The larger the edge, the closer the models are to each other. Models in the same cluster are even closer! 6/10

24.04.2025 13:15 👍 2 🔁 0 💬 1 📌 0
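The post above uses UMAP to turn the distance matrix into a 2D map. As a rough stand-in, the sketch below embeds a toy precomputed distance matrix into 2D with scikit-learn's metric MDS, which serves the same plotting purpose; the four "models" and their distances are invented for illustration and are not from the paper.

```python
import numpy as np
from sklearn.manifold import MDS

# Toy symmetric distance matrix for 4 hypothetical models:
# two tight pairs that sit far from each other.
D = np.array([[0.0, 0.1, 0.9, 0.9],
              [0.1, 0.0, 0.9, 0.9],
              [0.9, 0.9, 0.0, 0.1],
              [0.9, 0.9, 0.1, 0.0]])

# "precomputed" tells MDS to treat D directly as pairwise distances.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(D)  # one 2D point per model, ready to plot
```

Any embedding that roughly preserves pairwise distances (UMAP, t-SNE with a precomputed metric, MDS) will reproduce the cluster structure of the matrix; the choice mainly affects how global distances are distorted.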
Post image

It can also measure quantization efficiency by observing the behavioral distance between an LLM and its quantized versions. In the Qwen 1.5 release, GPTQ seems to perform best. This new kind of metric could provide additional insight into quantization efficiency. 5/10

24.04.2025 13:15 👍 2 🔁 0 💬 1 📌 0
Post image

Aside from plotting trees, the PhyloLM similarity matrix is very versatile. For example, running a logistic regression on the distance matrix makes it possible to predict the performance of new models, even from unseen families, with good accuracy. Here is what we got on ARC. 4/10

24.04.2025 13:15 👍 2 🔁 0 💬 1 📌 0
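The recipe above can be sketched in a few lines: each model is described by its row of the distance matrix, and a logistic regression maps that row to a success probability on a benchmark item. The data, labels, and reference-model count below are invented for the example; only the overall idea follows the post.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy stand-in for PhyloLM distances: 8 known models, each described
# by its distance to 5 reference models (one row per model).
X_train = rng.random((8, 5))
# Invented binary labels: did each model solve a given ARC-style item?
# (Median split keeps both classes present in this toy setup.)
y_train = (X_train[:, 0] < np.median(X_train[:, 0])).astype(int)

clf = LogisticRegression().fit(X_train, y_train)

# A new model, possibly from an unseen family: its distances to the
# same reference models are enough to get a success probability.
x_new = rng.random((1, 5))
p_success = clf.predict_proba(x_new)[0, 1]
```

The appeal is that the features are purely behavioral: no access to weights or training data is needed to score a new model, only its distances to already-mapped models.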
Post image

Not meeting these requirements can still produce efficient distance-visualization trees; however, it is important to remember that they do not represent evolutionary trees. Feel free to zoom in to see model names. 3/10

24.04.2025 13:15 👍 2 🔁 0 💬 1 📌 0
Post image

Phylogenetic algorithms often require that common ancestors do not appear among the objects studied, but they are clearly able to retrieve the evolution of a model family. Here is an example from the rich open-access model ecosystem: @teknium.bsky.social @maximelabonne.bsky.social @mistralai.bsky.social 2/10

24.04.2025 13:15 👍 2 🔁 0 💬 1 📌 0
Post image

We build a distance matrix by comparing the outputs of LLMs across a hundred different contexts, then construct maps and trees from this matrix. Because PhyloLM only requires sampling very few tokens after very short contexts, the algorithm is particularly cheap to run. 1/10

24.04.2025 13:15 👍 3 🔁 0 💬 1 📌 0
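For readers curious how such a distance matrix can be assembled, here is a minimal sketch: it assumes each model has already been queried on the same contexts ("loci") and its sampled next tokens ("alleles") collected. The Nei-style similarity below illustrates the genetics analogy in the post, not the paper's exact implementation.

```python
import math
from collections import Counter

def similarity(samples_a, samples_b):
    # Each argument maps a context ("locus") to the list of next tokens
    # ("alleles") sampled from one model after that context.
    num = den_a = den_b = 0.0
    for ctx in samples_a:
        fa, fb = Counter(samples_a[ctx]), Counter(samples_b[ctx])
        na, nb = sum(fa.values()), sum(fb.values())
        # Overlap of the two token-frequency distributions at this locus.
        num += sum((fa[t] / na) * (fb[t] / nb) for t in fa.keys() & fb.keys())
        den_a += sum((c / na) ** 2 for c in fa.values())
        den_b += sum((c / nb) ** 2 for c in fb.values())
    return num / math.sqrt(den_a * den_b)

def distance(samples_a, samples_b):
    # Nei-style genetic distance: 0 for identical sampling behavior,
    # large when two models almost never sample the same tokens.
    return -math.log(max(similarity(samples_a, samples_b), 1e-12))
```

Evaluating `distance` for every model pair fills the symmetric matrix from which the maps and trees in the thread are built.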
Post image

🔥 Our paper PhyloLM got accepted at ICLR 2025! 🔥
In this work, with @stepalminteri.bsky.social and @pyoudeyer.bsky.social, we show how easy it can be to infer relationships between LLMs by constructing trees, and to predict their performance and behavior at a very low cost! Here is a brief recap ⬇️

24.04.2025 13:15 👍 15 🔁 5 💬 3 📌 2
Preview
MAGELLAN: Metacognitive predictions of learning progress guide... Open-ended learning agents must efficiently prioritize goals in vast possibility spaces, focusing on those that maximize learning progress (LP). When such autotelic exploration is achieved by LLM...

🚀 Introducing 🧭 MAGELLAN, our new metacognitive framework for LLM agents! It predicts its own learning progress (LP) in vast natural language goal spaces, enabling efficient exploration of complex domains. 🌍✨ Learn more: 🔗 arxiv.org/abs/2502.07709 #OpenEndedLearning #LLM #RL

24.03.2025 15:09 👍 9 🔁 3 💬 1 📌 4

we are recruiting interns for a few projects with @pyoudeyer
in bordeaux
> studying llm-mediated cultural evolution with @nisioti_eleni
@Jeremy__Perez

> balancing exploration and exploitation with autotelic rl with @ClementRomac

details and links in 🧵
please share!

27.11.2024 17:43 👍 6 🔁 6 💬 1 📌 0
Video thumbnail

Putting some Flow Lenia here too

22.11.2024 09:51 👍 4 🔁 1 💬 1 📌 0
Video thumbnail

1/ ⚡️ Looking for a fast and simple Transformer baseline for your RL environment in JAX?
Sharing my implementation of transformerXL-PPO: github.com/Reytuag/tran...
The implementation is the first to attain the 3rd floor and obtain advanced achievements in the challenging Craftax

22.11.2024 10:15 👍 3 🔁 1 💬 1 📌 0
Preview
Generalization or Memorization: Data Contamination and Trustworthy Evaluation for Large Language Models Recent statements about the impressive capabilities of large language models (LLMs) are usually supported by evaluating on open-access benchmarks. Considering the vast size and wide-ranging sources of...

Related work on contamination in LLMs:
arxiv.org/abs/2402.15938 Dong et al. 2024
arxiv.org/abs/2310.15007 Meeus et al. 2023
arxiv.org/abs/2310.17623 Oren et al. 2024

15.11.2024 13:47 👍 0 🔁 0 💬 0 📌 0
Preview
Assessing Contamination in Large Language Models: Introducing the LogProber method In machine learning, contamination refers to situations where testing data leak into the training set. The issue is particularly relevant for the evaluation of the performance of Large Language Models...

LogProber paper: www.arxiv.org/abs/2408.14352
git: github.com/Nicolas-Yax/...
colab: colab.research.google.com/drive/1GDbmE...

15.11.2024 13:47 👍 0 🔁 0 💬 1 📌 0

It is part of a research agenda to open the LLM black box and provide tools for researchers to interact with models in a more transparent manner. The previous paper in this agenda was PhyloLM, proposing methods to investigate the phylogeny of LLMs: arxiv.org/abs/2404.04671 15/15

15.11.2024 13:47 👍 0 🔁 0 💬 1 📌 0

This method was first introduced in our paper "Studying and improving reasoning in humans and machines", investigating the evolution of cognitive biases in language models. www.nature.com/articles/s44... 14/15

15.11.2024 13:47 👍 2 🔁 0 💬 1 📌 0