I would also recommend reading this paper (arxiv.org/abs/2302.04763) from 2 years ago. Its benchmark is very comparable, with more intuitions for the multimodal case.
28.12.2024 13:41