Today is the International Day for the Elimination of Violence against Women. According to the UN, more than 50 000 women were killed by a partner or family member in 2023: news.un.org/en/story/202... This number is an underestimate, given that only 37 countries reported in 2023.
26.11.2024 06:16
Yeah I suspected that, interesting!
25.11.2024 13:36
Distributed learning for LLMs?
Recently, @primeintellect.bsky.social announced that they finished their 10B distributed training run, trained across the world.
What is it exactly?
🧵
25.11.2024 12:02
Instead of averaging outer gradients, would fancier model merging techniques (e.g. TIES) apply here?
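To make the question concrete, here is a minimal sketch contrasting a plain average of per-worker deltas with a TIES-style merge (trim, elect sign, disjoint mean). It is not taken from the thread or from any released training code. The flat-list representation, the function names, and the `keep_fraction` parameter are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]  # assumption: each worker's outer gradient flattened to a list


def average_merge(deltas: List[Vector]) -> Vector:
    """Baseline: element-wise mean of the per-worker deltas."""
    n = len(deltas)
    return [sum(col) / n for col in zip(*deltas)]


def trim(delta: Vector, keep_fraction: float) -> Vector:
    """Keep the largest-magnitude entries and zero out the rest.

    Entries tied with the threshold are all kept, so slightly more than
    keep_fraction of the entries can survive.
    """
    k = max(1, round(keep_fraction * len(delta)))
    threshold = sorted((abs(x) for x in delta), reverse=True)[k - 1]
    return [x if abs(x) >= threshold else 0.0 for x in delta]


def ties_merge(deltas: List[Vector], keep_fraction: float = 0.5) -> Vector:
    """TIES-style merge: trim, elect a sign per coordinate, average agreeing values."""
    trimmed = [trim(d, keep_fraction) for d in deltas]
    merged = []
    for col in zip(*trimmed):
        # Elect the sign carrying the larger total mass at this coordinate.
        sign = 1.0 if sum(col) >= 0 else -1.0
        # Disjoint mean: average only the nonzero entries that agree with it.
        agreeing = [x for x in col if x * sign > 0]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged


# Hypothetical deltas from three workers over four parameters.
worker_deltas = [
    [1.0, -2.0, 0.1, 0.5],
    [0.8, 3.0, -0.2, 0.4],
    [-0.1, 2.5, 0.3, -3.0],
]
mean_update = average_merge(worker_deltas)
ties_update = ties_merge(worker_deltas, keep_fraction=0.5)
```

In this toy setup the second coordinate shows the difference: the mean lets the dissenting worker pull the update down, while the TIES-style merge drops it and averages only the two agreeing values. Whether that helps in an actual distributed training run is exactly the open question.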
25.11.2024 12:50
Having better tools for reviewer and AC assignment would definitely help. Ultimately, reducing the number of reviews per paper while striving for relevance and quality could increase reviewer/AC engagement and free up their time to do better reviews.
24.11.2024 22:23
A sparse mask of attention scores based on VerticalAndSlashAttention and a plot of loss vs sparsity ratio for various methods.
Another nano gem from my amazing student Piotr Nawrot!
A repo & notebook on sparse attention for efficient LLM inference: github.com/PiotrNawrot/...
This will also feature in my #NeurIPS 2024 tutorial "Dynamic Sparsity in ML" with André Martins: dynamic-sparsity.github.io Stay tuned!
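For readers unfamiliar with the vertical-and-slash pattern named in the image description, here is a minimal sketch of such a causal sparse attention mask. It is not code from the linked repo. The function names and the `vertical_cols` / `slash_offsets` parameters are assumptions chosen for illustration.

```python
from typing import List, Sequence


def vertical_slash_mask(
    seq_len: int,
    vertical_cols: Sequence[int],
    slash_offsets: Sequence[int],
) -> List[List[bool]]:
    """Build a causal boolean mask where True means the query attends to the key.

    vertical_cols: key positions every later query may attend to (vertical lines).
    slash_offsets: distances q - k that are kept (diagonal "slash" lines);
                   offset 0 is the main diagonal.
    """
    cols = set(vertical_cols)
    offsets = set(slash_offsets)
    mask = [[False] * seq_len for _ in range(seq_len)]
    for q in range(seq_len):
        for k in range(q + 1):  # causal: keys at or before the query only
            if k in cols or (q - k) in offsets:
                mask[q][k] = True
    return mask


def sparsity_ratio(mask: List[List[bool]]) -> float:
    """Fraction of causal (lower-triangular) entries that are masked out."""
    n = len(mask)
    causal_total = n * (n + 1) // 2
    kept = sum(sum(row) for row in mask)
    return 1.0 - kept / causal_total


# Hypothetical example: keep the first key column and the main diagonal.
demo_mask = vertical_slash_mask(seq_len=4, vertical_cols=[0], slash_offsets=[0])
demo_sparsity = sparsity_ratio(demo_mask)
```

A dense nested-list mask like this only illustrates the pattern. An efficient implementation would avoid materializing the full mask and skip the masked blocks in the attention kernel.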
20.11.2024 12:51
Explore zero-shot routing of parameter-efficient experts with Phatgoose (arxiv.org/abs/2402.05859) and Arrow (arxiv.org/abs/2405.11157), with github.com/microsoft/mttl
github.com/sordonia/pg_mb…
Part of the "Dynamic Sparsity in ML" tutorial at #neurips2024. Feedback welcome, and join for discussions!
21.11.2024 15:47