| Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration (Co-Gym) | 2025/01 | Link | Corrective | Refinement | Segment | During Task |
| Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task | 2024/09 | – | Corrective | Refinement | Segment | During Task |
| FinArena: A Human-Agent Collaboration Framework for Financial Market Analysis and Forecasting | 2025/03 | – | Guidance | Demonstration | Segment | During Task |
| Experimental Exploration: Investigating Cooperative Interaction Behavior Between Humans and Large Language Model Agents | 2025/03 | – | Implicit | User Action | Segment | During Task |
| InteractGen: Enhancing Human-Involved Embodied Task Reasoning through LLM-Based Multi-Agent Collaboration | 2024/12 | – | Guidance | Demonstration | Segment | During Task |
| AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts | 2022/04 | – | Corrective | Refinement | Segment | During Task |
| Drive As You Speak: Enabling Human-Like Interaction With Large Language Models in Autonomous Vehicles | 2023/09 | – | Guidance | Demonstration | Segment | During Task |
| AgentCoord: Visually Exploring Coordination Strategy for LLM-based Multi-Agent Collaboration | 2024/04 | Link | Guidance, Corrective | Demonstration, Refinement | Segment | Initial Setup |
| CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation | 2025/04 | Link | Corrective, Implicit | User Action, Refinement | Segment | During Task |
| A Human-Computer Collaborative Tool for Training a Single Large Language Model Agent into a Network through Few Examples | 2024/04 | – | Corrective, Guidance | Demonstration, Refinement | Segment | During Task |
| LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination | 2024/01 | – | Guidance, Corrective | Demonstration, Refinement | Segment | During Task |
| SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks | 2025/03 | Link | Corrective, Implicit | Refinement, User Action | Segment | During Task |
| An LLM-based approach for enabling seamless Human-Robot collaboration in assembly | 2024/04 | – | Guidance | Demonstration | Segment | During Task |
| REVECA: Adaptive Planning and Trajectory-based Validation in Cooperative Language Agents using Information Relevance and Relative Proximity | 2024/05 | – | Implicit | Human Control | Segment | During Task |
| AssistantX: An LLM-Powered Proactive Assistant in Collaborative Human-Populated Environment | 2024/09 | Link | Implicit | Human Control | Holistic | Initial Setup |
| MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback | 2023/09 | Link | Evaluative | Binary Assessment | Holistic | During Task |
| Improving grounded language understanding in a collaborative environment by interacting with agents through help feedback | 2023/04 | – | Corrective, Guidance | Demonstration, Refinement | Holistic | During Task |
| ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments | 2025/02 | Link | Guidance | Demonstration, Critique | Segment | During Task |
| Large Language Model-based Human-Agent Collaboration for Complex Task Solving | 2024/02 | Link | Implicit | Human Control | Segment | During Task |
| Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration | 2025/02 | Link | Guidance | Critique | Holistic | During Task |
| LLM-Based Human-Robot Collaboration Framework for Manipulation Tasks | 2023/08 | – | Corrective, Guidance | Demonstration, Refinement | Segment | During Task |
| Enhancing the LLM-Based Robot Manipulation Through Human-Robot Collaboration | 2024/06 | – | Corrective | Refinement | Holistic | During Task |
| PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks | 2025/01 | Link | Corrective, Guidance | Refinement, Critique | Holistic, Segment | During Task, Post Task |
| Embodied LLM Agents Learn to Cooperate in Organized Teams | 2024/03 | Link | Guidance | Critique | Holistic | During Task |
| Building cooperative embodied agents modularly with large language models | 2023/07 | – | Evaluative | Scalar Rating | Holistic | Post Task |
| Investigating Agency of LLMs in Human-AI Collaboration Tasks | 2023/05 | Link | Guidance | Demonstration, Critique | Segment | During Task |
| Human-LLM collaboration in generative design for customization | 2025/06 | – | Guidance, Evaluative | Demonstration, Binary Assessment, Preference Ranking | Holistic, Segment | Initial Setup, Post Task |
| PDFChatAnnotator: A Human-LLM Collaborative Multi-Modal Data Annotation Tool for PDF-Format Catalogs | 2024/04 | – | Corrective, Guidance | Demonstration, Refinement | Segment | During Task |
| To Help or Not to Help: LLM-based Attentive Support for Human-Robot Group Interactions | 2024/10 | Link | Implicit, Guidance | Demonstration, User Action | Holistic | Initial Setup, During Task |
| Improved Trust in Human-Robot Collaboration With ChatGPT | 2023/06 | – | Guidance | Demonstration, Critique | Segment | During Task |
| Challenges in Human-Agent Communication | 2024/11 | – | Guidance | Demonstration, Critique | Holistic, Segment | Initial Setup, During Task, Post Task |
| Towards Modeling Human-Agentic Collaborative Workflows: A BPMN Extension | 2024/12 | Link | Guidance | Demonstration | Holistic | Initial Setup, During Task, Post Task |
| Into the Unknown Unknowns: Engaged Human Learning through Participation in Language Model Agent Conversations | 2024/08 | Link | Guidance | Demonstration | Segment | During Task |
| Enhancing Human-Robot Collaborative Assembly in Manufacturing Systems Using Large Language Models | 2024/06 | – | Corrective, Guidance | Demonstration, Refinement, Critique | Segment | Initial Setup, During Task |
| A2C: A Modular Multi-stage Collaborative Decision Framework for Human-AI Teams | 2024/01 | Link | Guidance, Implicit, Corrective, Evaluative | Refinement, Binary Assessment, Critique, Human Control | Holistic, Segment | During Task |
| MindAgent: Emergent Gaming Interaction | 2023/09 | Link | Corrective | Refinement | Segment | During Task |
| Ask-before-Plan: Proactive Language Agents for Real-World Planning | 2024/01 | Link | Guidance | Demonstration, Critique | Segment | During Task |
| SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents | 2023/10 | – | Evaluative, Implicit | Scalar Rating, User Action | Holistic, Segment | During Task, Post Task |
| PaLM-E: An Embodied Multimodal Language Model | 2023/04 | Link | Guidance, Implicit | Demonstration, User Action | Segment | During Task |
| Embodied Task Planning with Large Language Models | 2023/07 | Link | Guidance, Evaluative | Demonstration, Binary Assessment | Holistic, Segment | Initial Setup, Post Task |
| MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework | 2023/08 | Link | Evaluative, Guidance | Binary Assessment | Holistic | Initial Setup, Post Task |
| DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning | 2024/06 | Link | Evaluative | Binary Assessment | Holistic | During Task, Post Task |
| WebLINX: Real-World Website Navigation with Multi-Turn Dialogue | 2024/02 | Link | Guidance | Demonstration | Holistic, Segment | During Task |
| Autonomous Evaluation and Refinement of Digital Agents | 2024/04 | Link | Evaluative | Binary Assessment | Holistic | Post Task |
| WebCanvas: Benchmarking Web Agents in Online Environments | 2024/06 | Link | Evaluative | Scalar Rating | Holistic | Post Task |
| MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft | 2025/04 | Link | Evaluative | Scalar Rating | Holistic | Post Task |