
Spyros Gidaris

@spyrosgidaris

Senior Research Scientist at Valeo.ai (@valeoai.bsky.social) https://gidariss.github.io/

148 Followers · 131 Following · 5 Posts · Joined 24.11.2024

Latest posts by Spyros Gidaris @spyrosgidaris

2025 ICCV Program Committee

Congratulations to our lab colleagues who have been named Outstanding Reviewers at #ICCV2025 👏

Andrei Bursuc @abursuc.bsky.social
Anh-Quan Cao @anhquancao.bsky.social
Renaud Marlet
Eloi Zablocki @eloizablocki.bsky.social

@iccv.bsky.social
iccv.thecvf.com/Conferences/...

02.10.2025 15:28 👍 20 🔁 6 💬 0 📌 1

Update: ResearchGate has investigated the case, and, as far as I can see, all the suspicious papers (~200) have now been removed. Many thanks to the @researchgate.bsky.social team!

24.09.2025 11:35 👍 4 🔁 3 💬 1 📌 0
Anastasios Gerontopoulos (@nasosger.bsky.social) 1/n Multi-token prediction training boosts LLMs (DeepSeek-V3), tackling key limitations of the next-token prediction objective: • Short-term focus • Struggles with long-range decisions • Weaker supervision. Prior methods add complexity (extra layers). 🔑 Our fix? Register tokens: elegant and powerful

3. MuToR with @nasosger.bsky.social & Nikos Komodakis
Multi-token prediction with registers
🔗 Paper: arxiv.org/abs/2505.10518
🦋 Post: bsky.app/profile/nas...

23.09.2025 08:11 ๐Ÿ‘ 1 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0
Thodoris Kouzelis (@nicolabourbaki.bsky.social) 1/n Introducing ReDi (Representation Diffusion): a new generative approach that leverages a diffusion model to jointly capture – Low-level image details (via VAE latents) – High-level semantic features (via DINOv2) 🧵

2. ReDi (spotlight) with @nicolabourbaki.bsky.social, @ikakogeorgiou.bsky.social & Nikos Komodakis
Boosting generative image modeling via joint image-feature synthesis
🔗 Paper: arxiv.org/abs/2504.16064
🦋 Post: bsky.app/profile/nic...

23.09.2025 08:11 ๐Ÿ‘ 1 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
@sta8is.bsky.social 1/n 🚀 Excited to share our latest work: DINO-Foresight, a new framework for predicting the future states of scenes using Vision Foundation Model features! Links to the arXiv and GitHub 👇

1. DINO-Foresight with @sta8is.bsky.social, @ikakogeorgiou.bsky.social & Nikos Komodakis
Future state prediction using vision foundation model features
🔗 Paper: arxiv.org/abs/2412.11673
🦋 Post: bsky.app/profile/sta...

23.09.2025 08:11 ๐Ÿ‘ 4 ๐Ÿ” 0 ๐Ÿ’ฌ 2 ๐Ÿ“Œ 0

Three papers accepted to #NeurIPS2025 (one spotlight)! 🎉

Awesome work on generative modeling, multi-token prediction, and future prediction.

Congratulations to all collaborators!
@nasosger.bsky.social, @sta8is.bsky.social, @nicolabourbaki.bsky.social, @ikakogeorgiou.bsky.social & N. Komodakis!

23.09.2025 08:11 ๐Ÿ‘ 10 ๐Ÿ” 1 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

Discovered that our RangeViT paper keeps being cited in what might be LLM-generated papers. The number of citations has increased rapidly in recent weeks. Too good to be true.

The papers popped up on different platforms, but mainly on ResearchGate, with ~80 papers in just 3 weeks.
[1/]

16.09.2025 10:20 👍 6 🔁 5 💬 1 📌 2

1/ Can open-data models beat DINOv2? Today we release Franca, a fully open-source vision foundation model. Franca with a ViT-G backbone matches (and often beats) proprietary models like SigLIPv2, CLIP, and DINOv2 on various benchmarks, setting a new standard for open-source research.

21.07.2025 14:47 👍 83 🔁 21 💬 2 📌 3

1/ New & old work on self-supervised representation learning (SSL) with ViTs:
MOCA ☕ - Predicting Masked Online Codebook Assignments w/ @spyrosgidaris.bsky.social, O. Simeoni, A. Vobecky, @matthieucord.bsky.social, N. Komodakis, @ptrkprz.bsky.social #TMLR #ICLR2025
Grab a ☕ & brace for a story & a 🧵

27.06.2025 07:33 👍 22 🔁 3 💬 1 📌 2

1/n 🚀 New paper out - accepted at #ICCV2025!

Introducing DIP: unsupervised post-training that enhances dense features in pretrained ViTs for dense in-context scene understanding

Below: Low-shot in-context semantic segmentation examples. DIP features outperform DINOv2!

25.06.2025 19:21 👍 21 🔁 6 💬 1 📌 4

🚀 Thrilled to introduce JAFAR: a lightweight, flexible, plug-and-play module that upsamples features from any Foundation Vision Encoder to any desired output resolution (1/n)

Paper: arxiv.org/abs/2506.11136
Project Page: jafar-upsampler.github.io
GitHub: github.com/PaulCouairon...

16.06.2025 13:58 👍 26 🔁 6 💬 1 📌 0
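For context on what "upsampling features to any desired output resolution" means, here is the naive nearest-neighbor baseline that learned upsamplers like JAFAR aim to improve upon. This is just a reference point under simple assumptions, not JAFAR's actual mechanism:

```python
import numpy as np

def upsample_nearest(feat, out_h, out_w):
    """Nearest-neighbor upsampling of a (C, H, W) feature map to an
    arbitrary output resolution: each output cell copies its closest
    source cell. No learning involved."""
    c, h, w = feat.shape
    rows = np.arange(out_h) * h // out_h   # map each output row to a source row
    cols = np.arange(out_w) * w // out_w   # map each output col to a source col
    return feat[:, rows[:, None], cols[None, :]]

# e.g. a 4x4 grid of ViT patch features upsampled to 16x16
feat = np.arange(2 * 4 * 4, dtype=np.float64).reshape(2, 4, 4)
up = upsample_nearest(feat, 16, 16)
```

Learned modules replace the fixed copy rule with content-dependent weighting; the interface (low-resolution features in, high-resolution features out) stays the same.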

Are you at @cvprconference.bsky.social? Come by our poster!
📅 Sat 14/6, 10:30-12:30
📍 Poster #395, ExHall D

13.06.2025 05:09 👍 16 🔁 9 💬 0 📌 0

I am at #CVPR2025 this week in Nashville!

Presenting "Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers" on multi-modal semantic future prediction.

Come discuss!

Fri 13 Jun 10:30-12:30, poster #345
bsky.app/profile/sta8...

13.06.2025 12:39 👍 6 🔁 2 💬 0 📌 0

1/n Introducing ReDi (Representation Diffusion): a new generative approach that leverages a diffusion model to jointly capture
– Low-level image details (via VAE latents)
– High-level semantic features (via DINOv2) 🧵

25.04.2025 07:23 👍 21 🔁 3 💬 1 📌 1

The @valeoai.bsky.social team is presenting a few exciting works at @iclr-conf.bsky.social this year on masked generative transformers, adaptation of VLMs, self-supervised representation learning, and neural solvers. #iclr2025
Check them out 👇

09.04.2025 12:48 👍 8 🔁 1 💬 0 📌 0

Nice research work from @nicolabourbaki.bsky.social et al. It enhances latent generative models by regularizing the VAE's latent space with an equivariance loss. The finetuning process is straightforward and demonstrates improvements in just 5 epochs!

📄 arxiv.org/abs/2502.09509
🐍 github.com/zelaki/eqvae

25.02.2025 19:57 👍 9 🔁 1 💬 0 📌 0

Still mesmerized by this work and its results: a mid-to-end driving agent trained with self-play on just 8 maps, covering 1.6B km of driving (9,500 years of subjective driving experience), smashes all existing benchmarks (nuPlan, CARLA, Waymax) off the shelf 😮

21.02.2025 22:28 👍 6 🔁 4 💬 0 📌 0

EQ-VAE: Such a simple & cool trick to regularize multiple kinds of autoencoders: align reconstruction of transformed latents w/ the corresponding transformed inputs.
🚀 REPA: 4x training speedup
🚀 MaskGIT: 2x training speedup
🚀 DiT-XL/2: 7x faster convergence

Kudos @nicolabourbaki.bsky.social et al.

21.02.2025 22:54 👍 9 🔁 2 💬 0 📌 0
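The EQ-VAE regularizer described above ("align reconstruction of transformed latents with the corresponding transformed inputs") can be sketched as follows. The toy identity autoencoder and the rotation transform are assumptions for illustration; a real setup would use a trained VAE and sampled scalings/rotations:

```python
import numpy as np

def equivariance_loss(x, encode, decode, transform):
    """EQ-VAE-style regularizer (sketch): the decoder's reconstruction of
    a *transformed* latent should match the *transformed* input.
    `encode`, `decode`, and `transform` are hypothetical stand-ins for a
    real VAE and a spatial transformation (rotation, scaling, ...)."""
    z = encode(x)                     # latent code of the clean input
    recon = decode(transform(z))      # reconstruct from the transformed latent
    target = transform(x)             # same transform applied in input space
    return float(np.mean((recon - target) ** 2))

# Toy demo: an identity autoencoder is trivially equivariant to rotation,
# so the loss vanishes.
x = np.arange(16, dtype=np.float64).reshape(4, 4)
loss = equivariance_loss(
    x,
    encode=lambda a: a,               # toy encoder (identity)
    decode=lambda z: z,               # toy decoder (identity)
    transform=lambda a: np.rot90(a),  # 90-degree rotation in both spaces
)
```

Adding this term to the usual reconstruction objective is what pushes the latent space to behave consistently under spatial transformations.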

The things I've found hardest about research have all been non-technical: maintaining confidence and self-esteem, not abandoning the work when it's too hard or stressful, finding time to learn new things. In comparison, the technical parts are much easier.

18.02.2025 18:27 👍 60 🔁 10 💬 5 📌 0

🚨 Just a quick note that, following requests, we trained a 512px version of our Coherence-Aware Diffusion model (CVPR'24) and updated the paper on arXiv: arxiv.org/abs/2405.20324

It has a package and pretrained models!

๐Ÿ–ฅ๏ธ nicolas-dufour.github.io/cad.html
๐Ÿค– github.com/nicolas-dufo...

20.02.2025 06:21 👍 23 🔁 4 💬 2 📌 2

1/n 🚀 If you're working on generative image modeling, check out our latest work! We introduce EQ-VAE, a simple yet powerful regularization approach that makes latent representations equivariant to spatial transformations, leading to smoother latents and better generative models. 👇

18.02.2025 14:26 👍 19 🔁 8 💬 1 📌 1

1/n 🚀 Excited to share our latest work: DINO-Foresight, a new framework for predicting the future states of scenes using Vision Foundation Model features!
Links to the arXiv and GitHub 👇

07.02.2025 17:05 👍 20 🔁 3 💬 2 📌 1

This amazing team ❤️

27.01.2025 17:01 👍 19 🔁 3 💬 1 📌 0

Thrilled to announce our workshop on Embodied Intelligence for Autonomous Systems on the Horizon at @cvprconference.bsky.social, featuring a crazy line-up of speakers and challenges.
Mark it in your agenda and add it to your registration. #cvpr2025
opendrivelab.com/cvpr2025/wor...

23.01.2025 21:17 👍 26 🔁 6 💬 0 📌 0