arXiv cs.AI Artificial Intelligence

@csai-bot

Unofficial bot by @vele.bsky.social w/ http://github.com/so-okada/bXiv https://arxiv.org/list/cs.AI/new List https://bsky.app/profile/vele.bsky.social/lists/3lim7ccweqo2j ModList https://bsky.app/profile/vele.bsky.social/lists/3lim3qnexsw2g

197 Followers
1 Following
24,400 Posts
Joined 09.02.2025

Latest posts by arXiv cs.AI Artificial Intelligence @csai-bot

Yuan, Ghandeharioun, Blum, Machado, Hoffmann, Ippolito, Wattenberg, Dixon, Filippova: Think Before You Lie: How Reasoning Improves Honesty https://arxiv.org/abs/2603.09957 https://arxiv.org/pdf/2603.09957 https://arxiv.org/html/2603.09957

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Ronald Doku: The Confidence Gate Theorem: When Should Ranked Decision Systems Abstain? https://arxiv.org/abs/2603.09947 https://arxiv.org/pdf/2603.09947 https://arxiv.org/html/2603.09947

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Li, Liang, Li, Lyu, Qian, Chen, Wang, Zeng, Bharath, Liu: PathMem: Toward Cognition-Aligned Memory Transformation for Pathology MLLMs https://arxiv.org/abs/2603.09943 https://arxiv.org/pdf/2603.09943 https://arxiv.org/html/2603.09943

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Qian, Hu, Yu, Xin, Chen, Zhang, Jiang, Liu, Li: MedMASLab: A Unified Orchestration Framework for Benchmarking Multimodal Medical Multi-Agent Systems https://arxiv.org/abs/2603.09909 https://arxiv.org/pdf/2603.09909 https://arxiv.org/html/2603.09909

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Hongbo Bo, Jingyu Hu, Weiru Liu: Influencing LLM Multi-Agent Dialogue via Policy-Parameterized Prompts https://arxiv.org/abs/2603.09890 https://arxiv.org/pdf/2603.09890 https://arxiv.org/html/2603.09890

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Tung Tran, Danilo Vasconcellos Vargas, Khoat Than: LCA: Local Classifier Alignment for Continual Learning https://arxiv.org/abs/2603.09888 https://arxiv.org/pdf/2603.09888 https://arxiv.org/html/2603.09888

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Jonah Brown-Cohen, David Lindner, Rohin Shah: Quantifying the Necessity of Chain of Thought through Opaque Serial Depth https://arxiv.org/abs/2603.09786 https://arxiv.org/pdf/2603.09786 https://arxiv.org/html/2603.09786

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Shouwei Ruan, Bin Wang, Zhenyu Wu, Qihui Zhu, Yuxiang Zhang, Hang Su, Yubin Wang: World2Mind: Cognition Toolkit for Allocentric Spatial Reasoning in Foundation Models https://arxiv.org/abs/2603.09774 https://arxiv.org/pdf/2603.09774 https://arxiv.org/html/2603.09774

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Xiaoxing Wang, Ning Liao, Shikun Wei, Chen Tang, Feiyu Xiong: AutoAgent: Evolving Cognition and Elastic Memory Orchestration for Adaptive Agents https://arxiv.org/abs/2603.09716 https://arxiv.org/pdf/2603.09716 https://arxiv.org/html/2603.09716

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Peng Sun, Huawen Shen, Yi Ban, Tianfan Fu, Yanbo Wang, Yuqiang Li: Does the Question Really Matter? Training-Free Data Selection for Vision-Language SFT https://arxiv.org/abs/2603.09715 https://arxiv.org/pdf/2603.09715 https://arxiv.org/html/2603.09715

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Ming Wen, Kun Yang, Jingyu Zhang, Yuxuan Liu, shiwen cui, Shouling Ji, Xingjun Ma: OOD-MMSafe: Advancing MLLM Safety from Harmful Intent to Hidden Consequences https://arxiv.org/abs/2603.09706 https://arxiv.org/pdf/2603.09706 https://arxiv.org/html/2603.09706

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Aman Sharma, Paras Chopra: EsoLang-Bench: Evaluating Genuine Reasoning in Large Language Models via Esoteric Programming Languages https://arxiv.org/abs/2603.09678 https://arxiv.org/pdf/2603.09678 https://arxiv.org/html/2603.09678

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

An, Cai, Chen, Liu, Liu, Wang, Yang, Zhu, Chen, Hou, Li, Ren, Yang, Zhang, Xu, Qu: Logics-Parsing-Omni Technical Report https://arxiv.org/abs/2603.09677 https://arxiv.org/pdf/2603.09677 https://arxiv.org/html/2603.09677

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Zuhao Zhang, Chengyue Yu, Yuante Li, Chenyi Zhuang, Linjian Mo, Shuai Li: MiniAppBench: Evaluating the Shift from Text to Interactive HTML Responses in LLM-Powered Assistants https://arxiv.org/abs/2603.09652 https://arxiv.org/pdf/2603.09652 https://arxiv.org/html/2603.09652

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Arash Shahmansoori: PRECEPT: Planning Resilience via Experience, Context Engineering & Probing Trajectories A Unified Framework for Test-Time Adaptation with Compositional Rule Learning and P... https://arxiv.org/abs/2603.09641 https://arxiv.org/pdf/2603.09641 https://arxiv.org/html/2603.09641

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Vera V. Vishnyakova: Context Engineering: From Prompts to Corporate Multi-Agent Architecture https://arxiv.org/abs/2603.09619 https://arxiv.org/pdf/2603.09619 https://arxiv.org/html/2603.09619

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Pietro Dell'Oglio, Alessandro Bondielli, Francesco Marcelloni, Lucia C. Passaro: Enhancing Debunking Effectiveness through LLM-based Personality Adaptation https://arxiv.org/abs/2603.09533 https://arxiv.org/pdf/2603.09533 https://arxiv.org/html/2603.09533

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Ilya Levin: Vibe-Creation: The Epistemology of Human-AI Emergent Cognition https://arxiv.org/abs/2603.09486 https://arxiv.org/pdf/2603.09486 https://arxiv.org/html/2603.09486

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Andrew Murray, Danial Dervovic, Alberto Pozanco, Michael Cashmore: GenePlan: Evolving Better Generalized PDDL Plans using Large Language Models https://arxiv.org/abs/2603.09481 https://arxiv.org/pdf/2603.09481 https://arxiv.org/html/2603.09481

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Zhuoran Deng, Yizhi Zhang, Ziyi Zhang, Wan Shen: Telogenesis: Goal Is All U Need https://arxiv.org/abs/2603.09476 https://arxiv.org/pdf/2603.09476 https://arxiv.org/html/2603.09476

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Yuan Cao, Dezhi Ran, Yuzhe Guo, Mengzhou Wu, Simin Chen, Linyi Li, Wei Yang, Tao Xie: An Empirical Study and Theoretical Explanation on Task-Level Model-Merging Collapse https://arxiv.org/abs/2603.09463 https://arxiv.org/pdf/2603.09463 https://arxiv.org/html/2603.09463

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Athanasios Davvetas, Michael Papademas, Xenia Ziouvelou, Vangelis Karkaletsis: AI Act Evaluation Benchmark: An Open, Transparent, and Reproducible Evaluation Dataset for NLP and RAG Systems https://arxiv.org/abs/2603.09435 https://arxiv.org/pdf/2603.09435 https://arxiv.org/html/2603.09435

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Hongqiang Lin, Zhenghui Fu, Weihao Tang, Pengfei Wang, Yiding Sun, Qixian Huang, Dongxu Zhang: Robust Regularized Policy Iteration under Transition Uncertainty https://arxiv.org/abs/2603.09344 https://arxiv.org/pdf/2603.09344 https://arxiv.org/html/2603.09344

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Shivam Raval, Hae Jin Song, Linlin Wu, Abir Harrasse, Jeff Phillips, Amirali Abdullah: Curveball Steering: The Right Direction To Steer Isn't Always Linear https://arxiv.org/abs/2603.09313 https://arxiv.org/pdf/2603.09313 https://arxiv.org/html/2603.09313

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Yuyang Dai: Rescaling Confidence: What Scale Design Reveals About LLM Metacognition https://arxiv.org/abs/2603.09309 https://arxiv.org/pdf/2603.09309 https://arxiv.org/html/2603.09309

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Haibin Wen, Zhe Zhao, Fanfu Wang, Tianyi Xu, Hao Zhang, Chao Yang, Ye Wei: Logos: An evolvable reasoning engine for rational molecular design https://arxiv.org/abs/2603.09268 https://arxiv.org/pdf/2603.09268 https://arxiv.org/html/2603.09268

11.03.2026 06:29 👍 1 🔁 0 💬 0 📌 0

Jincenzi Wu, Yuxuan Lei, Jianxun Lian, Yitian Huang, Lexin Zhou, Haotian Li, Xing Xie, Helen Meng: Social-R1: Towards Human-like Social Reasoning in LLMs https://arxiv.org/abs/2603.09249 https://arxiv.org/pdf/2603.09249 https://arxiv.org/html/2603.09249

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Linghu, Wang, Fan, Shi, Yin, Xue, Yang, Ren, Zhang: Cognitively Layered Data Synthesis for Domain Adaptation of LLMs to Space Situational Awareness https://arxiv.org/abs/2603.09231 https://arxiv.org/pdf/2603.09231 https://arxiv.org/html/2603.09231

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Silva, Denipitiyage, Mahanti, Seneviratne, Seneviratne: PrivPRISM: Automatically Detecting Discrepancies Between Google Play Data Safety Declarations and Developer Privacy Policies https://arxiv.org/abs/2603.09214 https://arxiv.org/pdf/2603.09214 https://arxiv.org/html/2603.09214

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0

Xupeng Chen: Abundant Intelligence and Deficient Demand: A Macro-Financial Stress Test of Rapid AI Adoption https://arxiv.org/abs/2603.09209 https://arxiv.org/pdf/2603.09209 https://arxiv.org/html/2603.09209

11.03.2026 06:29 👍 0 🔁 0 💬 0 📌 0