Your data is low-rank, so stop wasting compute! In our new paper on low-rank thinning, we share one weird trick to speed up Transformer inference, SGD training, and hypothesis testing at scale. Come by ICML poster W-1012 Tuesday at 4:30!
So you want to skip our thinning proofs, but you'd still like our out-of-the-box attention speedups? I'll be presenting the Thinformer at two ICML workshop posters tomorrow!
Catch me at Es-FoMo (1-2:30, East hall A) and at LCFM (10:45-11:30 & 3:30-4:30, West 202-204)
new working paper! we (me, Su Lin Blodgett, @ninamarkl.bsky.social) examine how recent marketing of LLMs extends older discourses that cast workers as bundles of skills, and unpack the false promises of empowerment these discourses embed, in times of precarity
tisjune.github.io/papers/aarhu...
Looking forward to this year's edition! With great speakers: Ryan McDonald, Yulan He, @vn-ml.bsky.social, @antonisa.bsky.social, Raquel Fernandez, @annarogers.bsky.social, Preslav Nakov, @mohitbansal.bsky.social, @eunsol.bsky.social, Marie-Catherine de Marneffe!
my lab (lacns.github.io) at @mpi-nl.bsky.social and @dondersinst.bsky.social is recruiting for two PhD and two postdoctoral positions funded by an @erc.europa.eu Consolidator - come join us!
PhD: www.mpi.nl/career-educa...
Postdoc: www.mpi.nl/career-educa...
(please share widely)
Does the "agreement" part refer only to the previous question or to something else, and does the answer there have any consequences for the review process (can we review regardless of the option? can we submit papers regardless of the option?)
thanks in advance!
The "attribution" section only has an option "yes", signaling agreement to deanonymize your reviews. Is there an option to say no? (eg. by not selecting anything?) This is not communicated and is different from most other entries in the form.
hi, i'm struggling with the author registration form. i can't work out how to navigate the dark design patterns used in the "attribution" and "agreement" part of the form.
could you please provide some details about those choices?
most importantly: are there choices that result in a desk reject?
Schematic illustration of a scalar-valued residual deep GP with L hidden layers. The last layer is a scalar-valued GP on the manifold. If it is not present, the model is manifold-valued. If it is replaced with a Gaussian vector field (GVF), the model is a vector field on the manifold.
Excited to share our ICLR 2025 oral "Residual Deep Gaussian Processes on Manifolds"!
With @vabor112.bsky.social & @arkrause.bsky.social, we introduce manifold-to-manifold GPs that can be composed together, generalising deep GPs to manifolds. Applications include wind prediction & Bayes opt! 1/n
i can't believe how long we've spent fooling ourselves about the value of fully specified, massive matmuls instead of embracing the gods of sparsity
Recruiting a PhD candidate at the U. of Amsterdam (funded, 4yr). We will use ML & NLP, probabilistic models, and user studies to build adaptive scientific-assistant systems that communicate & justify decisions in ways helpful to experts.
More: vene.ro/jobs.html
Apply by May 18: werkenbij.uva.nl/en/vacancies...
Variational approximation with Gaussian mixtures is looking cute! So here it's just gradient descent on KL(q||p) for optimising the mixture's means & covariances & weights...
@lacerbi.bsky.social
This review paper by @guillaume-garrigos.com on SGD-related algorithms is a fantastic resource, offering elegant, self-contained, and concise proofs in a single, accessible reference. arxiv.org/pdf/2301.11235
These phenomena have been observed since early vision systems. It is important to report these things, though. Maybe it will permeate and we won't keep making the same mistakes over and over
This is such a beautiful algorithm (and a nice analysis): to check if an array is sorted vs. far from being sorted (many entries need to be changed), just:
- pick an element uniformly at random in the array
- "forget" where it was
- try to find it again via binary search
Repeat this a few times.
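The steps above can be sketched in a few lines of Python. This is a minimal sketch assuming distinct elements; the names `binary_search`, `looks_sorted`, and `trials` are my own, not from the post:

```python
import random

def binary_search(arr, x):
    """Standard binary search; returns the index where x is found, else -1."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == x:
            return mid
        elif arr[mid] < x:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

def looks_sorted(arr, trials=10):
    """Spot-check sortedness: pick a random element, 'forget' its position,
    and try to rediscover it via binary search.  On a sorted array every
    trial succeeds.  If the array is eps-far from sorted (an eps fraction
    of entries must change), each trial fails with probability at least
    eps, so a handful of trials exposes it with high probability."""
    for _ in range(trials):
        i = random.randrange(len(arr))          # pick a random index
        if binary_search(arr, arr[i]) != i:     # search didn't lead back to i
            return False                        # witness that arr isn't sorted
    return True
```

The nice part of the analysis: the set of indices where the search succeeds is itself an increasing subsequence, which is why far-from-sorted arrays have many failing indices.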
I and hundreds of other workers at the University of Amsterdam are on strike with @fnv.bsky.social
www.linkedin.com/pulse/our-ha...
"AI can be bad but also it can be good" is just a really dumb way to talk about anything...it's the grade-school exercise of "make a list of pros and cons" but pressed into service for producing a sense of inevitability and making the medicine go down
OpenAI in 2024:
"No AI for weapons or military"
"Don't use our AI to make weapons to hurt yourself or others"
"Military is fine, but no AI for weapons"
"Sure, put it on battlefield drones"
www.technologyreview.com/2024/12/04/1...
Blue skies 🦋, hot (?) takes 🔥
Constrained output for LLMs, e.g., the outlines library for vLLM, which forces models to output JSON/Pydantic schemas, is cool!
But because output tokens cost much more latency than input tokens, if speed matters, bespoke low-token output formats are often better.
I hope I am not late to the party (was away post-quals chilling) but here are some thoughts on why this is bad IMO:
First, a disclaimer that I am writing this as an African who is a speaker of multiple African languages, NLP researcher of African languages, and HCI researcher focusing broadly on..