Alessandro Sordoni (@murefil)

Today is the International Day for the Elimination of Violence against Women. According to the UN, more than 50 000 women were killed by a partner or family member in 2023 (news.un.org/en/story/202...). This number is an underestimate, given that only 37 countries reported in 2023.

26.11.2024 06:16 👍 8 🔁 3 💬 0 📌 0

Yeah I suspected that, interesting!

25.11.2024 13:36 👍 0 🔁 0 💬 0 📌 0

Distributed learning for LLMs?

Recently, @primeintellect.bsky.social announced that they have finished their 10B distributed learning run, trained across the world.

what is it exactly?

🧵

25.11.2024 12:02 👍 23 🔁 6 💬 1 📌 2

Instead of averaging outer gradients, would fancier model merging techniques (e.g. TIES) apply here?

25.11.2024 12:50 👍 1 🔁 0 💬 1 📌 0
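
To make the question above concrete, here is a minimal sketch contrasting plain averaging of per-worker outer gradients with a TIES-style merge (trim small entries, elect a sign per coordinate, average only the agreeing entries). Everything here is an illustrative assumption: the toy lists stand in for flattened parameter deltas, the function names and `keep_fraction` are hypothetical, and the final rescaling step used by TIES is left out. It is not the method used in the training run discussed in the thread.

```python
# Illustrative sketch only: toy lists stand in for flattened parameter deltas
# (one "outer gradient" per worker). All names here are hypothetical.

def average_merge(deltas):
    """Baseline: element-wise mean of the workers' outer gradients."""
    n = len(deltas)
    return [sum(column) / n for column in zip(*deltas)]


def trim(delta, keep_fraction):
    """Keep the largest-magnitude entries of one delta and zero the rest.
    Entries tied with the threshold are all kept."""
    k = max(1, round(len(delta) * keep_fraction))
    threshold = sorted((abs(v) for v in delta), reverse=True)[k - 1]
    return [v if abs(v) >= threshold else 0.0 for v in delta]


def ties_merge(deltas, keep_fraction=0.5):
    """TIES-style merge: trim each delta, elect a sign per coordinate from
    the summed trimmed values, then average only the nonzero entries whose
    sign agrees with the elected one."""
    trimmed = [trim(d, keep_fraction) for d in deltas]
    merged = []
    for column in zip(*trimmed):
        sign = 1.0 if sum(column) >= 0 else -1.0
        agreeing = [v for v in column if v * sign > 0]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged


worker_deltas = [
    [0.9, -0.1, 0.4, 0.0],
    [0.8, 0.5, -0.6, 0.1],
    [-0.2, 0.6, 0.5, 0.0],
]
print("average:", average_merge(worker_deltas))
print("ties:   ", ties_merge(worker_deltas))
```

In this toy case the two merges differ most where workers disagree in sign: averaging lets conflicting updates partially cancel, while the TIES-style merge drops the minority-sign entries before averaging.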

Having better tools for reviewer and AC assignment would definitely help. Ultimately, reducing the number of reviews per paper while striving for relevance and quality could increase reviewer/AC engagement and free up their time to do better reviews.

24.11.2024 22:23 👍 2 🔁 0 💬 0 📌 0

Informatics: ILCC: Language Processing, Speech Technology, Information Retrieval, Cognition
Study Informatics: ILCC: Language Processing, Speech Technology, Information Retrieval, Cognition at the University of Edinburgh. Our postgraduate degree programmes focus on natural language processin...

Last 5 days to apply for a PhD at #EdinburghNLP!

Deadline: November 25

www.ed.ac.uk/studying/pos...

If you are passionate about:

- adaptive tokenization and memory in foundation models
- modular deep learning
- computational typology

please message me or meet me at #NeurIPS2024!

21.11.2024 13:41 👍 20 🔁 8 💬 0 📌 0

A sparse mask of attention scores based on VerticalAndSlashAttention and a plot of loss vs sparsity ratio for various methods.

Another nano gem from my amazing student Piotr Nawrot!

A repo & notebook on sparse attention for efficient LLM inference: github.com/PiotrNawrot/...

This will also feature in my #NeurIPS 2024 tutorial "Dynamic Sparsity in ML" with André Martins: dynamic-sparsity.github.io Stay tuned!

20.11.2024 12:51 👍 42 🔁 8 💬 2 📌 3

Explore zero-shot routing of parameter-efficient experts with Phatgoose (arxiv.org/abs/2402.05859) and Arrow (arxiv.org/abs/2405.11157), using github.com/microsoft/mttl

👉 github.com/sordonia/pg_mb…

Part of the "Dynamic Sparsity in ML" tutorial at #neurips2024. Feedback welcome, and join us for discussions! 😊

21.11.2024 15:47 👍 5 🔁 1 💬 0 📌 0
