LEO-VL: Efficient 3D Scene Representation for Vision‑Language AI
LEO‑VL, a new vision‑language model, reduces the number of tokens needed to represent 3D scenes. Trained on roughly 700,000 indoor examples, it achieves state‑of‑the‑art results on SQA3D, MSQA and Beacon3D. getnews.me/leo-vl-efficient-3d-scen... #leovl #3dvision #aiexamples