#DataContamination #AIEvaluation Training–test overlap can inflate LLM scores. “Data contamination” in #LLMs is the unintended overlap between training data & evaluation data, which can inflate measured performance & misrepresent true generalization. arxiv.org/html/2502.14...