#tokenizers

Latest posts tagged with #tokenizers on Bluesky

Discrete Audio Tokens: More Than a Survey!

Pooneh Mousavi, Gallil Maimon, Adel Moumen et al.

Action editor: Tatsuya Harada

https://openreview.net/forum?id=eqNchtvc6v

#tokenizers #tokenizer #tokenization

New #J2C Certification:

Discrete Audio Tokens: More Than a Survey!

Pooneh Mousavi, Gallil Maimon, Adel Moumen et al.

https://openreview.net/forum?id=eqNchtvc6v

#tokenizers #tokenizer #tokenization

Strategies for very fast Lexers

Making compilation pipelines fast, starting with the tokenizer

Great 👌🏽:

“Strategies For Very Fast Lexers”, Matteo / ‘xnacly’ (xnacly.me/posts/2025/f...).

Via HN: news.ycombinator.com/item?id=4456...

On Lobsters: lobste.rs/s/75zw2o/str...

#Compilers #Lexers #Tokenizers #LexicalAnalyzers #Speed #C #Programming #Efficiency #Optimization #PLDI

Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs

Nicolas Boizard, Kevin El Haddad, Celine Hudelot, Pierre Colombo

Action editor: Frederic Sala

https://openreview.net/forum?id=bwRxXiGO9A

#tokenizers #tokenizer #distillation

Implementing A Byte Pair Encoding (BPE) Tokenizer From Scratch

This is a standalone notebook implementing the popular byte pair encoding (BPE) tokenization algorithm, which is used in models like GPT-2 to GPT-4, Llama 3,...

#llms #tokenizers #nlp #python

Training LLMs over Neurally Compressed Text

Brian Lester, Jaehoon Lee, Alexander A Alemi et al.

Action editor: Robert Gower

https://openreview.net/forum?id=pRvhMSV48t

#compression #compressed #tokenizers

New #Featured Certification:

Training LLMs over Neurally Compressed Text

Brian Lester, Jaehoon Lee, Alexander A Alemi et al.

https://openreview.net/forum?id=pRvhMSV48t

#compression #compressed #tokenizers
