Rohit Saxena

@rohit-saxena

PhD student at University of Edinburgh. Long Context | Summarization | Vision and Language | Narratives. https://saxenarohit.github.io/

305 Followers · 360 Following · 9 Posts · Joined 20.11.2024

Latest posts by Rohit Saxena @rohit-saxena

Congrats! Looks like time is a big failure case for these models (cc @neuralnoise.com @aryopg.bsky.social @rohit-saxena.bsky.social)
bsky.app/profile/emil...

17.05.2025 07:07 👍 3 🔁 2 💬 1 📌 0

Work done with @neuralnoise.com and Frank Keller.

10.03.2025 14:19 👍 1 🔁 0 💬 0 📌 0

We tested state-of-the-art multimodal LLMs on this challenging task, and they struggled! 🤖📉

We also propose a new method:
🔥 SEGMENT & SUMMARIZE, a training-free approach that outperforms existing models by:
🔹 Segmenting the poster into logical regions
🔹 Performing local & global summarization

10.03.2025 14:19 👍 0 🔁 0 💬 1 📌 0
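
For readers who want a feel for the shape of the idea in the post above, here is a rough sketch of a segment-then-summarize pipeline. It is not the authors' implementation: the function names (`segment_and_summarize`, `split_on_blank_lines`, `first_sentence`, `join_summaries`) are hypothetical, the poster is stood in for by plain text instead of an image, and the toy summarizers are placeholders where a multimodal LLM would be called.

```python
from typing import Callable, List


def segment_and_summarize(
    poster: str,
    segment: Callable[[str], List[str]],
    summarize_local: Callable[[str], str],
    summarize_global: Callable[[List[str]], str],
) -> str:
    """Segment the poster, summarize each region, then combine the results."""
    regions = segment(poster)                                # step 1: logical regions
    local_summaries = [summarize_local(r) for r in regions]  # step 2: local summaries
    return summarize_global(local_summaries)                 # step 3: global summary


# Toy stand-ins (assumptions for illustration only).
def split_on_blank_lines(poster: str) -> List[str]:
    """Treat blank-line-separated blocks as 'regions' of the poster."""
    return [block.strip() for block in poster.split("\n\n") if block.strip()]


def first_sentence(region: str) -> str:
    """Placeholder local summarizer: keep only the first sentence."""
    return region.split(".")[0].strip() + "."


def join_summaries(summaries: List[str]) -> str:
    """Placeholder global summarizer: concatenate the local summaries."""
    return " ".join(summaries)


if __name__ == "__main__":
    toy_poster = "Intro text. More detail.\n\nResults are good. Extra."
    print(segment_and_summarize(
        toy_poster, split_on_blank_lines, first_sentence, join_summaries
    ))
```

Passing the three steps in as callables keeps the sketch training-free in spirit: any off-the-shelf model could be swapped in for the placeholders without changing the pipeline.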

📊 PosterSum features 16,305 poster-abstract pairs from major ML conferences.

Task: Summarize a research poster image into a concise abstract summary.

10.03.2025 14:19 👍 1 🔁 0 💬 1 📌 0

Can multimodal LLMs truly understand research poster images? 📊

🚀 We introduce PosterSum, a new multimodal benchmark for scientific poster summarization!

📂 Dataset: huggingface.co/datasets/rohitsaxena/PosterSum
📜 Paper: arxiv.org/abs/2502.17540

10.03.2025 14:19 👍 8 🔁 4 💬 1 📌 0

🙋‍♂️

20.11.2024 17:19 👍 1 🔁 0 💬 0 📌 0

I'd love to be added!
Thanks

20.11.2024 12:15 👍 1 🔁 0 💬 1 📌 0

Would love to be added!

20.11.2024 12:08 👍 0 🔁 0 💬 0 📌 0

Hello, can you please add me? Thanks

20.11.2024 11:59 👍 1 🔁 0 💬 0 📌 0

I'd love to be added!
Thanks

20.11.2024 11:48 👍 0 🔁 0 💬 1 📌 0