Thomas Frick

@thomfrick

Computer Vision @ IBM Research Zurich, PhD ETH Zurich

646 Followers
627 Following
6 Posts
Joined 21.11.2024
Latest posts by Thomas Frick @thomfrick

SPARC: Separating Perception And Reasoning Circuits for Test-time Scaling of VLMs
Despite recent successes, test-time scaling - i.e., dynamically expanding the token budget during inference as needed - remains brittle for vision-language models (VLMs): unstructured chains-of-though...

Check out our work here:
arxiv.org/abs/2602.06566

@niccoloav.bsky.social @matrig.net

17.02.2026 08:58 👍 0 🔁 1 💬 0 📌 0

This separation unlocks powerful capabilities:
✨ Scale "looking" independently of "thinking"
✨ Keep contexts lean: only process relevant crops
✨ Train the "eyes" without retraining the "brain" on a single GPU

17.02.2026 08:58 👍 0 🔁 0 💬 1 📌 0

Our method SPARC explicitly decouples the "Where" (perception) from the "Why" (reasoning), mimicking how the brain separates early visual processing from executive function.

🔍 First: aggressive visual search to find the right pixels
🧠 Then: focused reasoning on only the relevant crops

17.02.2026 08:58 👍 0 🔁 0 💬 1 📌 0

Even the most brilliant detective can't solve a case without finding the clues first.

Yet most "thinking" VLMs make this exact mistake: they entangle visual search and complex logic into one giant, expensive chain of thought.

Stop burning through tokens in the dark. Ignite a SPARC. ⚡

17.02.2026 08:58 👍 4 🔁 1 💬 1 📌 0

Thanks! Love to be on the list!

21.11.2024 23:20 👍 1 🔁 0 💬 1 📌 0

👋🏻

21.11.2024 22:26 👍 1 🔁 0 💬 0 📌 0