Accepted Papers

ViG-LLM: Enhancing Visual Grounding Capabilities in Closed-Box LLMs for Document Information Extraction without OCR Dependencies

Sudhanshu Bhoi

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Multi-Agent Tool-Integrated Policy Optimization

Zhanfeng Mo, Xingxuan Li, Yuntao Chen, Lidong Bing

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

MI9: An Integrated Runtime Governance Framework for Agentic AI

Charles L. Wang, Trisha Singhal, Ameya Kelkar, Jason Tuo

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Multi-Round Reinforcement Learning with Feedback Reflection for Large Language Models Post-Training

Rundong Wang, Xiaobo Yang, Zhanfeng Mo, Shihong Deng, Shuchang Zhou

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Peering Behind the Shield: Guardrail Identification in Large Language Models

Ziqing Yang, Yixin Wu, Rui Wen, Michael Backes, Yang Zhang

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Provably Reliable Tool-Using LLM Agents: Formal Guarantees on Error Accumulation in the Model Context Protocol (MCP)

Flint Xiaofeng Fan, Cheston Tan, Roger Wattenhofer, Yew-Soon Ong

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Trajectory Guard - A Lightweight, Sequence-Aware Model for Real-Time Anomaly Detection in Agentic AI

Laksh Advani

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Towards Trustworthy Multi-Turn LLM Agents via Behavioral Guidance

Gonca Gürsun

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Teach a Reward Model to Correct Itself: Reward Guided Adversarial Failure Discovery for Robust Reward Modeling

Pankayaraj Pathmanathan, Furong Huang

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Blue Teaming Function-Calling Agents

Greta Dolcetti, Giulio Zizzo, Sergio Maffeis

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

"Are We Done Yet?": A Vision-Based Judge for Autonomous Task Completion of Computer Use Agents

Marta Sumyk, Oleksandr Kosovan

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

A Collaborative Multi-Agent Framework for Jailbreaking with RL-Based Dynamic Prompting

Azka Ikramullah, Kyunghyun Lee, Abdul Majeed, Seong Oun Hwang

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

AEMA: Verifiable Evaluation Framework for Trustworthy and Controlled Agentic LLM Systems

Yen-Ting Lee, Keerthi Koneru, Zahra Moslemi, Sheethal Kumar, Ramesh Radhakrishnan

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Beyond Success Rate: Benchmarking Robustness in Tool-Using Language Agents

Kristina Lewandowska

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Optimizing AI Agent Attacks With Synthetic Data

Chloe R Loughridge, Paul Colognese, Avery Griffin, Tyler Tracy, Jonathan Kutasov, Joe Benton

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Certifying LLM Agent Risks in Diverse Scenarios

Yuhao Zhang, Mintong Kang

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

PSM: Prompt Sensitivity Minimization via LLM-Guided Black-Box Optimization

Hussein Jawad, Nicolas J-B. Brunel

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Evaluating Control Protocols for Untrusted AI Agents

Jonathan Kutasov, Chloe R Loughridge, Yuqi Sun, Henry Sleight, Buck Shlegeris, Tyler Tracy, Joe Benton

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Real-Time Trust Verification for Safe Agentic Actions using TrustBench

Tavishi Sharma, Vinayak Sharma, Pragya Sharma

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Lightweight and Faithful Visual Condition Checking in Behavior Trees via Expert-Regularized Reinforcement Learning

Hyosik Moon, Eldan Cohen

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

The Trust Paradox: When Over-Alignment Reduces Human Trust in Agentic LLMs

Gokul Srinath Seetha Ram

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

ShieldBench: A Comprehensive Benchmark for Evaluating the Persistence of LLM Safety Interventions

Mert Ogul, Rishitha Voleti, Shanduojiao Jiang

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Improving Physics Reasoning in Large Language Models Using Mixture of Refinement Agents

Raj Jaiswal, Dhruv Jain, Harsh Parimal Popat, Abhishek Dharmadhikari, Atharva Marathe, Avinash Anand, Shin'ichi Satoh, Rajiv Ratn Shah

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

The Multi-Agent Off-Switch Game

Akash Agrawal, Soroush Ebadian, Lewis Hammond

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Building Real-time Awareness of Out-of-distribution in Trajectory Prediction for Autonomous Vehicles

Tongfei Guo, Taposh Banerjee, Lili Su

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Agent-SafetyBench: Evaluating the Safety of LLM Agents

Zhexin Zhang, Shiyao Cui, Yida Lu, Jingzhuo Zhou, Junxiao Yang, Hongning Wang, Minlie Huang

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Trustworthy AI in the Agentic Lakehouse: from Concurrency to Governance

Jacopo Tagliabue, Federico Bianchi, Ciro Greco

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Demonstrating specification gaming in reasoning models

Alexander Bondarenko, Denis Volk, Dmitrii Volkov, Jeffrey Ladish

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Proactive Interference Reveals Working Memory Limits in LLMs Beyond Context Length

Chupei Wang, Jiaqiu Vince Sun

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Federated Agent Reinforcement Learning

Canyu Chen, Kangyu Zhu, Zhaorun Chen, Zhanhui Zhou, Shizhe Diao, Yiping Lu, Tian Li, Manling Li, Dawn Song

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Can Linear Attributions Explain Nonlinear LLMs?

Vishal Pramanik, Maisha Maliha, Nathaniel D. Bastian, Sumit Kumar Jha

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Drift No More? Context Equilibria in Multi-Turn LLM Interactions

Vardhan Dongre, Ryan A. Rossi, Viet Dac Lai, Seunghyun Yoon, Dilek Hakkani-Tür, Trung Bui

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

LENS: Learning Architecture Navigator for LLM Agentic Systems

Guancheng Wan, Jiayi Yang, Mengting Li

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

RAAS: Relative Architecture Adaptive Search for Agentic Supernet Optimization

Jiayi Yang, Mengting Li, Guancheng Wan

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

JSPLIT: A Taxonomy-based Solution for Prompt Bloating in Model Context Protocol

Emanuele Antonioni, Stefan Markovic, Anirudha Shankar, Jaime Carreira Bernardo, Lovro Markovic, Silvia Pareti, Benedetto Proietti

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Reflection-Driven Control for Trustworthy Code Agents

Wang Bin, Quan Jiazheng, Xingrui Yu, Hu Hansen, YuHao, Ivor Tsang

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

TrustRAG: Enhancing Robustness and Trustworthiness in Retrieval-Augmented Generation

Huichi Zhou, KinHei Lee, Zhonghao Zhan, Zhenhao Li, Yue Chen, Huaxiu Yao, Hamed Haddadi, Emine Yilmaz

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Empowering Small Language Models with Factual Hallucination-Aware Reasoning for Financial Classification

Han Yuan, Yilin Wu, Li Zhang, Zheng Ma

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

IntentGuard: Safeguard LLM Agents via Intent Alignment

Jingyue Cong, Xinyuan Qiao, Yulin Dong, Yueheng Huang, Yang Yu, Estrid He, Andy Song

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Divergence or Fusion? CN and US LLMs Value Comparison in An AI-Oriented Measurement Framework

Yang Ma, Song Tong, Bo Wang, Kaiping Peng

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Amplification Effects in Test-Time Reinforcement Learning: Safety and Reasoning Vulnerabilities

Vanshaj Khattar, Moumita Choudhury, Md Rafi Ur Rashid, Jing Liu, Toshiaki Koike-Akino, Ming Jin, Ye Wang

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Stochasticity in Agentic Evaluations: Quantifying Inconsistency with Intraclass Correlation

Zairah Mustahsan, Abel Lim, Megna Anand, Saahil Jain, Bryan McCann

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Internal Representations as Indicators of Hallucinations in Agent Tool Selection

Kait Healy, Bharathi Srinivasan, Visakh Madathil, Jing Wu

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

When Refusals Fail: Unstable Safety Mechanisms in Long-Context LLM Agents

Tsimur Hadeliya, Mohammad Ali Jauhar, Nidhi Sakpal, Diogo Cruz

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

DrawingBench: Evaluating Spatial Reasoning and UI Interaction Capabilities of Large Language Models through Mouse-Based Drawing Tasks

Hyunjun Kim, Sooyoung Ryu

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

CertiHealth: Towards Certified, Uncertainty-Aware, and Explainable AI for Medical Decision-Making

Bisiriyu Maqsudhat Kofoworola

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

When Small Models Are Right for Wrong Reasons: Process Verification for Trustworthy Agents

Laksh Advani

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

MermaidFlow-CF: How Agentic Workflow Representation Governs Constraint-Faithful Control

Keyi Xiang, Boyuan Shi, Yueming Lyu, Jianda Chen, Ivor Tsang, Haiyan Yin

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Catching Contamination Before Generation: Spectral Kill Switches for Agents

Valentin Noël

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

COMPASS: Context-Modulated PID Attention Steering System for Hallucination Mitigation

Kenji Sahay, Snigdha Pandya, Rohan Nagale, Anna Lin, Shikhar Shiromani, Kevin Zhu, Sunishchal Dev

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

UpstreamQA: A Modular Framework for Explicit Reasoning on Video Question Answering Tasks

Jason Nguyen, Ameet Rao, Alexander Chang, Ishaan Kumar, Erin Tan

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Intent-Governed Loops for Accountable Agentic AI

Christoforus Yoga Haryanto

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

PrivAgentFlow: Agentic Workflow for Distributed Privacy Control in Web Agents

Tianyi Ma, Tianyi Tang, Yueming Lyu, Haiyan Yin, Yew-Soon Ong, Ivor Tsang

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

DASH: Dialogue-Aware Similarity and Handshake Recognition for Topic Segmentation in Public-Channel Conversations

Sijin Sun, Liangbin Zhao, Ming Deng, Xiuju Fu

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Towards Design of an Automated Judge for Multi-Agent Systems

Alina Borisovna Zhidkovskaya, Kirill Rapatskikh, Jerzy Kamiński, Barakhsin Grigorii Mikhailovich, Anna Kalyuzhnaya, Druzhinin Alexey Alekseevich, Andrey Savchenko, Julia Belikova, Konstantin Polev, Nikolay Nikitin

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

A Database-Independent LLM Framework for Real-Time Authorization in Retrieval-Augmented Generation

Halil Yesil, Sümeyye Gültekin, Fatma Bozyigit, Baris Tekin Tezel, Moharram Challenger

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Verifiability-First Agents: Provable Observability and Lightweight Audit Agents for Controlling Autonomous LLM Systems

Abhivansh Gupta

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Emergent Collusion in LLM-Powered Multi-Agent Markets: A Comprehensive Survey of Risks, Mechanisms, Governance, and Regulatory Challenges

Mohammad Sajjad Ghaemi

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Verifiable Control and Calibrated Trust in Embodied Neuromorphic Agents for Safety-Critical Applications

Sylvester Kaczmarek

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Lattice: Generative Guardrails for Conversational Agents

Emily Broadhurst, Tawab Safi, Joseph Edell, Vashisht Anand Ganesh, Karime Maamari

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

AgentTrace: A Structured Logging Framework for Agent System Observability

Adam AlSayyad, Kelvin Yuxiang Huang, Richik Pal

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Spectral Guarantees for Policy Drift in Self-Refining LLM Agents

Murari Ambati

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

SCOUT-RAG: Scalable and Cost-Efficient Unifying Traversal for Agentic Graph-RAG over Distributed Domains

Longkun Li, Zou Yuanben, Jinghan Wu, Yuqing Wen, Jing Li, Hangwei Qian, Ivor Tsang

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Optimizing Importance Sampling Methods for Rare Output Estimation in Language Models

Amanda Cao, Ivan Betancourt, Manish Rangan, Yuqi Sun

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

From Biased Chatbots to Biased Agents: Examining Role Assignment Effects on LLM Agent Robustness

Linbo Cao, Lihao Sun, Yang Yue

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

SALT: Steering Activations towards Leakage-free Thinking in Chain of Thought

Shourya Batra, Pierce Tillman, Samarth Gaggar, Shashank Kesineni, Sunishchal Dev, Kevin Zhu, Ashwinee Panda, Maheep Chaudhary

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Factor($U$,$T$): Controlling Untrusted AI by Monitoring their Plans

Edward Lue Chee Lip, Anthony Channg, Diana Kim, Aaron Sandoval, Kevin Zhu

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Mind the Gap to Trustworthy LLM Agents: A Systematic Evaluation on Constraint Satisfaction for Real-World Travel Planning

Bo-Wen Zhang, Jin Ye, Jie-Jing Shao, Yu-Feng Li, Lan-Zhe Guo

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026

Tri-Bench: Stress-Testing VLM Reliability on Spatial Reasoning under Camera Tilt and Object Interference

Amit Bendkhale

Published: 27 Jan 2026 · Trustworthy Agentic AI @ AAAI 2026
