New blog post: The Phylogenetics of Artifacts: inferring the evolution of cultural objects, artificial life forms, and language models.
From cat genetics to ancient myths to LLMs. 🧬 1/n
Wow, some of my old #EvolutionOfLanguage #EoL pals may have just done something huge for #AILaw data attribution. #LLM #AIGovernance Specifically:
@pyoudeyer.bsky.social @nicolasyax.bsky.social @stepalminteri.bsky.social
Evolutionary biology can track LLM phylogeny!
developmentalsystems.org/phylolm
Excited to announce our symposium on how AI and humans shape each other
"Humans and Artificial Minds: Mutual Influences"
9 Jan at ENS Paris.
Free registration here www.eventbrite.com/e/humans-and...
Will the influx of synthetic data lead to uniform #ModelCollapse across the internet?
Our recent #EMNLP2025 (Oral) paper suggests a nuanced picture: different collapse dynamics might emerge in different internet domains based on the properties of human data in those domains! 🧵
Here is the link to (freely) sign-up for our symposium!
Humans and Artificial Minds: Mutual influences, for better and for worse
www.eventbrite.com/e/humans-and...
Excited to announce our symposium on how AI and humans shape each other
"Humans and Artificial Minds: Mutual Influences"
9 Jan at ENS Paris.
Talks by @smfleming.bsky.social, Valeria Giardino, Silvia Tulli, @thecharleywu.bsky.social, Laurence Devillers & @summerfieldlab.bsky.social .
Program ⬇️
GPT-style cartoon of a debate between a smiling Skinner and an angry Chomsky, while in the background a robot is reading "Verbal Behavior"
I'm happy to share a short opinion piece I've just finished, where I revisit the famous Skinner vs. Chomsky exchange on how language is learned through the lens of today's large language models (before getting mad, read the rest) 1/n
osf.io/preprints/ps...
New (revised) preprint with @thecharleywu.bsky.social
We rethink how to assess machine consciousness: not by code or circuitry, but by behavioral inference, as in cognitive science.
Extraordinary claims still need extraordinary evidence.
osf.io/preprints/ps...
#AI #Consciousness #LLM
🧠 New paper in Open Mind!
We show that LLM-based reinforcement learning agents encode relative reward values like humans do, even when suboptimal, and display a positivity bias.
Work led by William Hayes w/ @nicolasyax.bsky.social
doi.org/10.1162/opmi...
#AI #LLM #RL
Preprint update, co-led with @akjagadish.bsky.social, with @marvinmathony.bsky.social, Tobias Ludwig and @ericschulz.bsky.social!
Curious about LLM interpretability and understanding? We borrowed concepts from genetics to map language models, predict their capabilities, and even uncovered surprising insights about their training!
Come see my poster at #ICLR2025, 3pm, Hall 2B, poster #505!
If you are interested in this line of research of mapping LLMs you might also want to check the amazing work of Eliahu Horwitz arxiv.org/abs/2503.10633 and Momose Oyama arxiv.org/abs/2502.16173 10/10
In short, PhyloLM is a cheap and versatile algorithm that generates useful representations of LLMs, with creative applications in practice. 9/10
paper : arxiv.org/abs/2404.04671
colab : colab.research.google.com/drive/1agNE5...
code : github.com/Nicolas-Yax/...
ICLR : Saturday 3pm Poster 505
A PhyloLM collaborative Huggingface space is available to try the algorithm and visualize maps: huggingface.co/spaces/nyax/... The Model Submit button has been temporarily suspended for technical reasons, but it should be back very soon! 8/10
By using code-related contexts we obtain a fairly different map. For example, we notice that Qwen and GPT-3.5 have a very different way of coding compared to the other models, which was not visible on the reasoning map. 7/10
The choice of contexts is important, as it reflects different capabilities of LLMs. Here, on general-reasoning contexts, we can plot a map of models using UMAP. The wider the edge, the closer two models are to each other; models in the same cluster are even closer! 6/10
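To sketch the mapping step: given a PhyloLM-style distance matrix, any embedding method that accepts precomputed distances can lay models out in 2D. The thread uses UMAP; this toy sketch uses scikit-learn's MDS as a dependency-light stand-in, on an invented 4-model distance matrix (not real PhyloLM output).

```python
import numpy as np
from sklearn.manifold import MDS

# Invented symmetric distance matrix for 4 models forming 2 clusters:
# models 0-1 are behaviorally close, as are models 2-3.
D = np.array([
    [0.00, 0.10, 0.80, 0.90],
    [0.10, 0.00, 0.85, 0.80],
    [0.80, 0.85, 0.00, 0.10],
    [0.90, 0.80, 0.10, 0.00],
])

# Embed the models in 2D directly from the precomputed distances.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(D)
print(coords.shape)  # (4, 2): one 2D point per model, ready to plot
```

With real PhyloLM distances, `coords` would be scatter-plotted with model names as labels; clusters in the plot correspond to behaviorally similar model families.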
It can also measure quantization efficiency by observing the behavioral distance between an LLM and its quantized versions. In the Qwen 1.5 release, GPTQ seems to perform best. This new kind of metric could provide additional insight into quantization efficiency. 5/10
Aside from plotting trees, the PhyloLM similarity matrix is very versatile. For example, running a logistic regression on the distance matrix makes it possible to predict the performance of new models, even from unseen families, with good accuracy. Here is what we got on ARC. 4/10
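A minimal sketch of the prediction idea, with entirely invented data (the setup and feature names are illustrative, not the paper's actual features or ARC results): treat each model's row of distances to a set of reference models as a feature vector, then fit a logistic regression on a binary benchmark outcome.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical setup: 40 models, each described by its behavioral
# distance to 10 reference models (rows of a PhyloLM-style matrix).
n_models = 40
distances = rng.random((n_models, 10))

# Toy target: whether a model "passes" a benchmark item. Here the label
# is synthetically tied to the first distance so the task is learnable.
labels = (distances[:, 0] < 0.5).astype(int)

# Train on 30 models, evaluate on 10 held-out ones.
clf = LogisticRegression().fit(distances[:30], labels[:30])
acc = clf.score(distances[30:], labels[30:])
print(f"held-out accuracy: {acc:.2f}")
```

The point of the sketch is the shape of the pipeline, not the numbers: with real PhyloLM distances and real benchmark outcomes, the same regression predicts performance of unseen models.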
Not taking these requirements into account can still produce useful distance-visualization trees. However, it is important to remember that they do not represent evolutionary trees. Feel free to zoom in to see model names. 3/10
Phylogenetic algorithms typically require that common ancestors do not appear among the objects studied, yet they are clearly able to retrieve the evolution of a model family. Here is an example from the rich open-access model ecosystem: @teknium.bsky.social @maximelabonne.bsky.social @mistralai.bsky.social 2/10
We build a distance matrix by comparing the outputs of LLMs across a hundred different contexts, then build maps and trees from this distance matrix. Because PhyloLM only requires sampling very few tokens after very short contexts, the algorithm is particularly cheap to run. 1/10
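A minimal sketch of that comparison step, using mocked models (this is a simplification, not the paper's exact genetic-similarity formula): sample short continuations for each context and score the overlap between two models' samples; 1 minus this similarity gives a distance.

```python
from collections import Counter

def sample_tokens(model, context, n_samples=32):
    """Mocked sampler: a 'model' here is just a dict mapping a context
    to a list of short sampled continuations."""
    return model[context][:n_samples]

def similarity(model_a, model_b, contexts):
    """Average multiset overlap of sampled continuations across contexts
    (a simplification of PhyloLM's similarity score)."""
    scores = []
    for ctx in contexts:
        a = Counter(sample_tokens(model_a, ctx))
        b = Counter(sample_tokens(model_b, ctx))
        union = sum((a | b).values())
        if union:
            scores.append(sum((a & b).values()) / union)
    return sum(scores) / len(scores) if scores else 0.0

contexts = ["2+2=", "The capital of France is"]
m1 = {"2+2=": ["4", "4"], "The capital of France is": ["Paris"]}
m2 = {"2+2=": ["4", "4"], "The capital of France is": ["Paris"]}
print(similarity(m1, m2, contexts))  # identical samples -> 1.0
```

Running this over all model pairs fills the distance matrix (`1 - similarity`) that the maps and trees are built from.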
🔥 Our paper PhyloLM got accepted at ICLR 2025! 🔥
In this work we show how easy it can be to infer relationships between LLMs by constructing trees, and to predict their performance and behavior at very low cost, with @stepalminteri.bsky.social and @pyoudeyer.bsky.social! Here is a brief recap ⬇️
Introducing MAGELLAN, our new metacognitive framework for LLM agents! It predicts its own learning progress (LP) in vast natural-language goal spaces, enabling efficient exploration of complex domains. Learn more: arxiv.org/abs/2502.07709 #OpenEndedLearning #LLM #RL
we are recruiting interns for a few projects with @pyoudeyer
in bordeaux
> studying llm-mediated cultural evolution with @nisioti_eleni
@Jeremy__Perez
> balancing exploration and exploitation with autotelic rl with @ClementRomac
details and links in 🧵
please share!
Putting some Flow Lenia here too
1/ ⚡️ Looking for a fast and simple Transformer baseline for your RL environment in JAX?
Sharing my implementation of transformerXL-PPO: github.com/Reytuag/tran...
The implementation is the first to reach the 3rd floor and obtain advanced achievements in the challenging Craftax environment.
Related work on contamination in LLMs :
arxiv.org/abs/2402.15938 Dong et al. 2024
arxiv.org/abs/2310.15007 Meeus et al. 2023
arxiv.org/abs/2310.17623 Oren et al. 2024
LogProber paper : www.arxiv.org/abs/2408.14352
git : github.com/Nicolas-Yax/...
colab : colab.research.google.com/drive/1GDbmE...
It is part of a research agenda to open the LLM black box and provide tools for researchers to interact with models in a more transparent manner. The previous paper in this agenda was PhyloLM, which proposed methods to investigate the phylogeny of LLMs arxiv.org/abs/2404.04671 15/15
This method was first introduced in our paper "Studying and improving reasoning in humans and machines", which investigates the evolution of cognitive biases in language models. www.nature.com/articles/s44... 14/15