Rethinking LLM Integration in Visual Speech Recognition
A Llama‑2‑13B model trained on the combined LRS2 and LRS3 datasets reaches a 24.7% word error rate (WER) on LRS3 and 47.0% on WildVSR; larger LLMs give only modest gains overall. Read more: getnews.me/rethinking-llm-integrati... #visualspeechrecognition #llm