come learn about LLM geometry!
I'll present this poster tonight at East Exhibit Hall A-C 2510, 5-7:30 pm.
Come chat about alignment!
I'll be at NeurIPS Thursday-Sunday; send me an email if you'd like to chat :)
LLM alignment aims to make model outputs preferred by a ranker while changing as little 'off-target' behavior as possible.
Turns out:
- best-of-$n$ is the optimal option!
- you can contrastively train an LLM to mimic its own best-of-$n$ distribution!
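For anyone new to best-of-$n$: draw $n$ samples from the model and keep the one the ranker scores highest. Here's a minimal sketch of just that sampling step, not the contrastive training from the paper. The names `best_of_n`, `sample`, and `reward` are hypothetical, and the toy sampler and ranker stand in for a real LLM and reward model.

```python
import random
from typing import Callable


def best_of_n(sample: Callable[[], str],
              reward: Callable[[str], float],
              n: int) -> str:
    """Draw n candidates from `sample` and return the highest-reward one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [sample() for _ in range(n)]
    # max() returns the first candidate with the top score if there is a tie.
    return max(candidates, key=reward)


if __name__ == "__main__":
    # Toy stand-ins (assumptions): a "model" that picks a canned reply at
    # random, and a "ranker" that simply prefers longer replies.
    replies = ["ok", "sure thing", "happy to help with that"]
    rng = random.Random(0)
    print(best_of_n(lambda: rng.choice(replies), reward=len, n=4))
```

Larger `n` pushes outputs further toward what the ranker prefers, at the cost of `n` model calls per response, which is the motivation for training a model to mimic this distribution directly.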
BonBon alignment: arxiv.org/abs/2406.00832