Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
The key idea
Offloading local dependencies between tokens with lookups to a massive embedding t...
#memory #sparsity #LLM