Awesome LLM-Based Human-Agent Systems

⭐ Star and stay tuned! Our survey paper has been published on arXiv after about 1.5 weeks on hold. We'll keep adding new papers and resources here. A major update is planned for this Thursday to include more papers and further refine the survey.

😊 Feel free to star and fork this repository to follow the latest updates, and let us know if you have any suggestions, comments, or recommended papers!

🔥 News

🌟 Introduction

Welcome to the repository associated with our survey paper, “A Survey on Large Language Model based Human-Agent Systems”. This repository contains resources and updates related to our ongoing Human-Agent-System research. For a detailed introduction, please refer to our survey paper.

📄 Contents

🤝 Human Feedback

| Title | Date | Code | Feedback Type | Feedback Subtype | Feedback Granularity | Feedback Phase |
|---|---|---|---|---|---|---|
| Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration (Co-Gym) | 2025/01 | Link | Corrective | Refinement | Segment | During Task |
| Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task | 2024/09 | - | Corrective | Refinement | Segment | During Task |
| FinArena: A Human-Agent Collaboration Framework for Financial Market Analysis and Forecasting | 2025/03 | - | Guidance | Demonstration | Segment | During Task |
| Experimental Exploration: Investigating Cooperative Interaction Behavior Between Humans and Large Language Model Agents | 2025/03 | - | Implicit | User Action | Segment | During Task |
| InteractGen: Enhancing Human-Involved Embodied Task Reasoning through LLM-Based Multi-Agent Collaboration | 2024/12 | - | Guidance | Demonstration | Segment | During Task |
| AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts | 2022/04 | - | Corrective | Refinement | Segment | During Task |
| Drive As You Speak: Enabling Human-Like Interaction With Large Language Models in Autonomous Vehicles | 2023/09 | - | Guidance | Demonstration | Segment | During Task |
| AgentCoord: Visually Exploring Coordination Strategy for LLM-based Multi-Agent Collaboration | 2024/04 | Link | Guidance, Corrective | Demonstration, Refinement | Segment | Initial Setup |
| CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation | 2025/04 | Link | Corrective, Implicit | User Action, Refinement | Segment | During Task |
| A Human-Computer Collaborative Tool for Training a Single Large Language Model Agent into a Network through Few Examples | 2024/04 | - | Corrective, Guidance | Demonstration, Refinement | Segment | During Task |
| LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination | 2024/01 | - | Guidance, Corrective | Demonstration, Refinement | Segment | During Task |
| SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks | 2025/03 | Link | Corrective, Implicit | Refinement, User Action | Segment | During Task |
| An LLM-based Approach for Enabling Seamless Human-Robot Collaboration in Assembly | 2024/04 | - | Guidance | Demonstration | Segment | During Task |
| REVECA: Adaptive Planning and Trajectory-based Validation in Cooperative Language Agents Using Information Relevance and Relative Proximity | 2024/05 | - | Implicit | Human Control | Segment | During Task |
| AssistantX: An LLM-Powered Proactive Assistant in Collaborative Human-Populated Environment | 2024/09 | Link | Implicit | Human Control | Holistic | Initial Setup |
| MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback | 2023/09 | Link | Evaluative | Binary Assessment | Holistic | During Task |
| Improving Grounded Language Understanding in a Collaborative Environment by Interacting with Agents through Help Feedback | 2023/04 | - | Corrective, Guidance | Demonstration, Refinement | Holistic | During Task |
| ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments | 2025/02 | Link | Guidance | Demonstration, Critique | Segment | During Task |
| Large Language Model-based Human-Agent Collaboration for Complex Task Solving | 2024/02 | Link | Implicit | Human Control | Segment | During Task |
| Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration | 2025/02 | Link | Guidance | Critique | Holistic | During Task |
| LLM-Based Human-Robot Collaboration Framework for Manipulation Tasks | 2023/08 | - | Corrective, Guidance | Demonstration, Refinement | Segment | During Task |
| Enhancing the LLM-Based Robot Manipulation Through Human-Robot Collaboration | 2024/06 | - | Corrective | Refinement | Holistic | During Task |
| PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-Agent Tasks | 2025/01 | Link | Corrective, Guidance | Refinement, Critique | Holistic, Segment | During Task, Post Task |
| Embodied LLM Agents Learn to Cooperate in Organized Teams | 2024/03 | Link | Guidance | Critique | Holistic | During Task |
| Building Cooperative Embodied Agents Modularly with Large Language Models | 2023/07 | - | Evaluative | Scalar Rating | Holistic | Post Task |
| Investigating Agency of LLMs in Human-AI Collaboration Tasks | 2023/05 | Link | Guidance | Demonstration, Critique | Segment | During Task |
| Human-LLM Collaboration in Generative Design for Customization | 2025/06 | - | Guidance, Evaluative | Demonstration, Binary Assessment, Preference Ranking | Holistic, Segment | Initial Setup, Post Task |
| PDFChatAnnotator: A Human-LLM Collaborative Multi-Modal Data Annotation Tool for PDF-Format Catalogs | 2024/04 | - | Corrective, Guidance | Demonstration, Refinement | Segment | During Task |
| To Help or Not to Help: LLM-based Attentive Support for Human-Robot Group Interactions | 2024/10 | Link | Implicit, Guidance | Demonstration, User Action | Holistic | Initial Setup, During Task |
| Improved Trust in Human-Robot Collaboration With ChatGPT | 2023/06 | - | Guidance | Demonstration, Critique | Segment | During Task |
| Challenges in Human-Agent Communication | 2024/11 | - | Guidance | Demonstration, Critique | Holistic, Segment | Initial Setup, During Task, Post Task |
| Towards Modeling Human-Agentic Collaborative Workflows: A BPMN Extension | 2024/12 | Link | Guidance | Demonstration | Holistic | Initial Setup, During Task, Post Task |
| Into the Unknown Unknowns: Engaged Human Learning through Participation in Language Model Agent Conversations | 2024/08 | Link | Guidance | Demonstration | Segment | During Task |
| Enhancing Human-Robot Collaborative Assembly in Manufacturing Systems Using Large Language Models | 2024/06 | - | Corrective, Guidance | Demonstration, Refinement, Critique | Segment | Initial Setup, During Task |
| A2C: A Modular Multi-stage Collaborative Decision Framework for Human-AI Teams | 2024/01 | Link | Guidance, Implicit, Corrective, Evaluative | Refinement, Binary Assessment, Critique, Human Control | Holistic, Segment | During Task |
| MindAgent: Emergent Gaming Interaction | 2023/09 | Link | Corrective | Refinement | Segment | During Task |
| Ask-before-Plan: Proactive Language Agents for Real-World Planning | 2024/01 | Link | Guidance | Demonstration, Critique | Segment | During Task |
| SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents | 2023/10 | - | Evaluative, Implicit | Scalar Rating, User Action | Holistic, Segment | During Task, Post Task |
| PaLM-E: An Embodied Multimodal Language Model | 2023/04 | Link | Guidance, Implicit | Demonstration, User Action | Segment | During Task |
| Embodied Task Planning with Large Language Models | 2023/07 | Link | Guidance, Evaluative | Demonstration, Binary Assessment | Holistic, Segment | Initial Setup, Post Task |
| MetaGPT: Meta Programming for a Multi-Agent Collaborative Framework | 2023/08 | Link | Evaluative, Guidance | Binary Assessment | Holistic | Initial Setup, Post Task |
| DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning | 2024/06 | Link | Evaluative | Binary Assessment | Holistic | During Task, Post Task |
| WebLINX: Real-World Website Navigation with Multi-Turn Dialogue | 2024/02 | Link | Guidance | Demonstration | Holistic, Segment | During Task |
| Autonomous Evaluation and Refinement of Digital Agents | 2024/04 | Link | Evaluative | Binary Assessment | Holistic | Post Task |
| WebCanvas: Benchmarking Web Agents in Online Environments | 2024/06 | Link | Evaluative | Scalar Rating | Holistic | Post Task |
| MineWorld: A Real-Time and Open-Source Interactive World Model on Minecraft | 2025/04 | Link | Evaluative | Scalar Rating | Holistic | Post Task |

🔄 Interaction

| Title | Date | Code | Interaction Types | Interaction Variant |
|---|---|---|---|---|
| Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration (Co-Gym) | 2025/01 | Link | Collaboration | Supervision, Delegation |
| Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task | 2024/09 | - | Collaboration | Coordination |
| FinArena: A Human-Agent Collaboration Framework for Financial Market Analysis and Forecasting | 2025/03 | - | Collaboration | Delegation |
| Experimental Exploration: Investigating Cooperative Interaction Behavior Between Humans and Large Language Model Agents | 2025/03 | - | Coopetition | - |
| InteractGen: Enhancing Human-Involved Embodied Task Reasoning through LLM-Based Multi-Agent Collaboration | 2024/12 | - | Collaboration | Cooperation, Delegation, Coordination |
| AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts | 2022/04 | - | Collaboration | Delegation |
| Drive As You Speak: Enabling Human-Like Interaction With Large Language Models in Autonomous Vehicles | 2023/09 | - | Collaboration | Delegation |
| AgentCoord: Visually Exploring Coordination Strategy for LLM-based Multi-Agent Collaboration | 2024/04 | Link | Collaboration | Coordination |
| CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation | 2025/04 | Link | Collaboration | Supervision, Delegation, Coordination |
| A Human-Computer Collaborative Tool for Training a Single Large Language Model Agent into a Network through Few Examples | 2024/04 | - | Collaboration | Delegation, Supervision |
| LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination | 2024/01 | - | Collaboration | Supervision, Delegation |
| SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks | 2025/03 | Link | Collaboration | Delegation |
| An LLM-based Approach for Enabling Seamless Human-Robot Collaboration in Assembly | 2024/04 | - | Collaboration | Delegation |
| REVECA: Adaptive Planning and Trajectory-based Validation in Cooperative Language Agents Using Information Relevance and Relative Proximity | 2024/05 | - | Collaboration | Delegation |
| AssistantX: An LLM-Powered Proactive Assistant in Collaborative Human-Populated Environment | 2024/09 | Link | Collaboration | Delegation |
| MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback | 2023/09 | Link | Collaboration | Delegation |
| Improving Grounded Language Understanding in a Collaborative Environment by Interacting with Agents through Help Feedback | 2023/04 | - | Collaboration | Delegation |
| ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments | 2025/02 | Link | Collaboration | Supervision, Delegation |
| Large Language Model-based Human-Agent Collaboration for Complex Task Solving | 2024/02 | Link | Collaboration | Delegation |
| Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration | 2025/02 | Link | Collaboration | - |
| LLM-Based Human-Robot Collaboration Framework for Manipulation Tasks | 2023/08 | - | Collaboration | Supervision, Delegation |
| Enhancing the LLM-Based Robot Manipulation Through Human-Robot Collaboration | 2024/06 | - | Collaboration | Delegation |
| PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-Agent Tasks | 2025/01 | Link | Collaboration | Coordination, Cooperation |
| Embodied LLM Agents Learn to Cooperate in Organized Teams | 2024/03 | Link | Collaboration | Delegation |
| Building Cooperative Embodied Agents Modularly with Large Language Models | 2023/07 | - | Collaboration | Cooperation |
| Investigating Agency of LLMs in Human-AI Collaboration Tasks | 2023/05 | Link | Collaboration | Cooperation, Delegation |
| Human-LLM Collaboration in Generative Design for Customization | 2025/06 | - | Collaboration | Delegation |
| PDFChatAnnotator: A Human-LLM Collaborative Multi-Modal Data Annotation Tool for PDF-Format Catalogs | 2024/04 | - | Collaboration | Delegation |
| To Help or Not to Help: LLM-based Attentive Support for Human-Robot Group Interactions | 2024/10 | Link | Collaboration | Coordination |
| Improved Trust in Human-Robot Collaboration With ChatGPT | 2023/06 | - | Collaboration | Delegation |
| Challenges in Human-Agent Communication | 2024/11 | - | Collaboration | Delegation |
| Towards Modeling Human-Agentic Collaborative Workflows: A BPMN Extension | 2024/12 | Link | Collaboration | Coordination |
| Into the Unknown Unknowns: Engaged Human Learning through Participation in Language Model Agent Conversations | 2024/08 | Link | Collaboration | Delegation |
| Enhancing Human-Robot Collaborative Assembly in Manufacturing Systems Using Large Language Models | 2024/06 | - | Collaboration | Delegation |
| A2C: A Modular Multi-stage Collaborative Decision Framework for Human-AI Teams | 2024/01 | Link | Collaboration | Coordination |
| MindAgent: Emergent Gaming Interaction | 2023/09 | Link | Collaboration | Coordination |
| Ask-before-Plan: Proactive Language Agents for Real-World Planning | 2024/01 | Link | Collaboration | Coordination |
| SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents | 2023/10 | - | Collaboration, Competition, Coopetition | Coordination |
| PaLM-E: An Embodied Multimodal Language Model | 2023/04 | Link | Collaboration | Delegation |
| Embodied Task Planning with Large Language Models | 2023/07 | Link | Collaboration | Delegation |
| MetaGPT: Meta Programming for a Multi-Agent Collaborative Framework | 2023/08 | Link | Collaboration | Coordination |
| DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning | 2024/06 | Link | Collaboration | Delegation |
| WebLINX: Real-World Website Navigation with Multi-Turn Dialogue | 2024/02 | Link | Collaboration | Delegation |
| Autonomous Evaluation and Refinement of Digital Agents | 2024/04 | Link | Collaboration | Delegation |
| WebCanvas: Benchmarking Web Agents in Online Environments | 2024/06 | Link | Collaboration | Delegation |
| MineWorld: A Real-Time and Open-Source Interactive World Model on Minecraft | 2025/04 | Link | Collaboration | Delegation |

🎛️ Orchestration

| Title | Date | Code | Orchestration Strategy | Orchestration Synchronization |
|---|---|---|---|---|
| Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration (Co-Gym) | 2025/01 | Link | One-by-One | Asynchronous |
| Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task | 2024/09 | - | Simultaneous | Synchronous |
| FinArena: A Human-Agent Collaboration Framework for Financial Market Analysis and Forecasting | 2025/03 | - | One-by-One | Synchronous |
| Experimental Exploration: Investigating Cooperative Interaction Behavior Between Humans and Large Language Model Agents | 2025/03 | - | One-by-One | Synchronous |
| InteractGen: Enhancing Human-Involved Embodied Task Reasoning through LLM-Based Multi-Agent Collaboration | 2024/12 | - | One-by-One | Asynchronous |
| AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts | 2022/04 | - | One-by-One | Synchronous |
| Drive As You Speak: Enabling Human-Like Interaction With Large Language Models in Autonomous Vehicles | 2023/09 | - | One-by-One | Synchronous |
| AgentCoord: Visually Exploring Coordination Strategy for LLM-based Multi-Agent Collaboration | 2024/04 | Link | One-by-One | Synchronous |
| CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation | 2025/04 | Link | One-by-One | Synchronous |
| A Human-Computer Collaborative Tool for Training a Single Large Language Model Agent into a Network through Few Examples | 2024/04 | - | One-by-One | Synchronous |
| LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination | 2024/01 | - | Simultaneous | Synchronous |
| SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks | 2025/03 | Link | One-by-One | Synchronous |
| An LLM-based Approach for Enabling Seamless Human-Robot Collaboration in Assembly | 2024/04 | - | One-by-One | Synchronous |
| REVECA: Adaptive Planning and Trajectory-based Validation in Cooperative Language Agents Using Information Relevance and Relative Proximity | 2024/05 | - | One-by-One | Asynchronous |
| AssistantX: An LLM-Powered Proactive Assistant in Collaborative Human-Populated Environment | 2024/09 | Link | One-by-One | Asynchronous |
| MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback | 2023/09 | Link | One-by-One | Synchronous |
| Improving Grounded Language Understanding in a Collaborative Environment by Interacting with Agents through Help Feedback | 2023/04 | - | One-by-One | Synchronous |
| ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments | 2025/02 | Link | One-by-One | Asynchronous |
| Large Language Model-based Human-Agent Collaboration for Complex Task Solving | 2024/02 | Link | One-by-One | Synchronous |
| Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration | 2025/02 | Link | Simultaneous | Asynchronous |
| LLM-Based Human-Robot Collaboration Framework for Manipulation Tasks | 2023/08 | - | One-by-One | Synchronous |
| Enhancing the LLM-Based Robot Manipulation Through Human-Robot Collaboration | 2024/06 | - | One-by-One | Synchronous |
| PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-Agent Tasks | 2025/01 | Link | One-by-One | Asynchronous |
| Embodied LLM Agents Learn to Cooperate in Organized Teams | 2024/03 | Link | One-by-One | Synchronous |
| Building Cooperative Embodied Agents Modularly with Large Language Models | 2023/07 | - | Simultaneous | Synchronous |
| Investigating Agency of LLMs in Human-AI Collaboration Tasks | 2023/05 | Link | One-by-One | Synchronous |
| Human-LLM Collaboration in Generative Design for Customization | 2025/06 | - | One-by-One | Synchronous |
| PDFChatAnnotator: A Human-LLM Collaborative Multi-Modal Data Annotation Tool for PDF-Format Catalogs | 2024/04 | - | One-by-One | Synchronous |
| To Help or Not to Help: LLM-based Attentive Support for Human-Robot Group Interactions | 2024/10 | Link | One-by-One | Synchronous |
| Improved Trust in Human-Robot Collaboration With ChatGPT | 2023/06 | - | One-by-One | Synchronous |
| Challenges in Human-Agent Communication | 2024/11 | - | One-by-One | Synchronous |
| Towards Modeling Human-Agentic Collaborative Workflows: A BPMN Extension | 2024/12 | Link | One-by-One | Asynchronous |
| Into the Unknown Unknowns: Engaged Human Learning through Participation in Language Model Agent Conversations | 2024/08 | Link | One-by-One | Synchronous |
| Enhancing Human-Robot Collaborative Assembly in Manufacturing Systems Using Large Language Models | 2024/06 | - | One-by-One | Synchronous |
| A2C: A Modular Multi-stage Collaborative Decision Framework for Human-AI Teams | 2024/01 | Link | One-by-One | Asynchronous |
| MindAgent: Emergent Gaming Interaction | 2023/09 | Link | One-by-One | Synchronous |
| Ask-before-Plan: Proactive Language Agents for Real-World Planning | 2024/01 | Link | One-by-One | Synchronous |
| SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents | 2023/10 | - | One-by-One | Synchronous |
| PaLM-E: An Embodied Multimodal Language Model | 2023/04 | Link | One-by-One | Synchronous |
| Embodied Task Planning with Large Language Models | 2023/07 | Link | One-by-One | Asynchronous |
| MetaGPT: Meta Programming for a Multi-Agent Collaborative Framework | 2023/08 | Link | One-by-One | Asynchronous |
| DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning | 2024/06 | Link | One-by-One | Asynchronous |
| WebLINX: Real-World Website Navigation with Multi-Turn Dialogue | 2024/02 | Link | One-by-One | Synchronous |
| Autonomous Evaluation and Refinement of Digital Agents | 2024/04 | Link | One-by-One | Asynchronous |
| WebCanvas: Benchmarking Web Agents in Online Environments | 2024/06 | Link | One-by-One | Synchronous |
| MineWorld: A Real-Time and Open-Source Interactive World Model on Minecraft | 2025/04 | Link | One-by-One | Synchronous |

💬 Communication

| Title | Date | Code | Communication Structure | Communication Mode |
|---|---|---|---|---|
| Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration (Co-Gym) | 2025/01 | Link | Decentralized | Conversation |
| Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task | 2024/09 | - | Decentralized | Conversation |
| FinArena: A Human-Agent Collaboration Framework for Financial Market Analysis and Forecasting | 2025/03 | - | Hierarchical | Conversation |
| Experimental Exploration: Investigating Cooperative Interaction Behavior Between Humans and Large Language Model Agents | 2025/03 | - | Decentralized | Conversation |
| InteractGen: Enhancing Human-Involved Embodied Task Reasoning through LLM-Based Multi-Agent Collaboration | 2024/12 | - | Decentralized | Message Pool |
| AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts | 2022/04 | - | Hierarchical | Conversation |
| Drive As You Speak: Enabling Human-Like Interaction With Large Language Models in Autonomous Vehicles | 2023/09 | - | Centralized | Conversation |
| AgentCoord: Visually Exploring Coordination Strategy for LLM-based Multi-Agent Collaboration | 2024/04 | Link | Decentralized | Conversation |
| CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation | 2025/04 | Link | Decentralized | Conversation |
| A Human-Computer Collaborative Tool for Training a Single Large Language Model Agent into a Network through Few Examples | 2024/04 | - | Decentralized | Conversation |
| LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination | 2024/01 | - | Hierarchical | Conversation |
| SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks | 2025/03 | Link | Decentralized | Conversation |
| An LLM-based Approach for Enabling Seamless Human-Robot Collaboration in Assembly | 2024/04 | - | Centralized | Conversation |
| REVECA: Adaptive Planning and Trajectory-based Validation in Cooperative Language Agents Using Information Relevance and Relative Proximity | 2024/05 | - | Hierarchical | Observation |
| AssistantX: An LLM-Powered Proactive Assistant in Collaborative Human-Populated Environment | 2024/09 | Link | Decentralized | Message Pool |
| MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback | 2023/09 | Link | Decentralized | Conversation |
| Improving Grounded Language Understanding in a Collaborative Environment by Interacting with Agents through Help Feedback | 2023/04 | - | Decentralized | Conversation |
| ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments | 2025/02 | Link | Decentralized | Conversation |
| Large Language Model-based Human-Agent Collaboration for Complex Task Solving | 2024/02 | Link | Decentralized | Conversation |
| Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration | 2025/02 | Link | Decentralized | Observation |
| LLM-Based Human-Robot Collaboration Framework for Manipulation Tasks | 2023/08 | - | Decentralized | Conversation |
| Enhancing the LLM-Based Robot Manipulation Through Human-Robot Collaboration | 2024/06 | - | Decentralized | Conversation |
| PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-Agent Tasks | 2025/01 | Link | Decentralized | Conversation |
| Embodied LLM Agents Learn to Cooperate in Organized Teams | 2024/03 | Link | Decentralized | Conversation |
| Building Cooperative Embodied Agents Modularly with Large Language Models | 2023/07 | - | Decentralized | Conversation |
| Investigating Agency of LLMs in Human-AI Collaboration Tasks | 2023/05 | Link | Decentralized | Conversation |
| Human-LLM Collaboration in Generative Design for Customization | 2025/06 | - | Decentralized | Conversation |
| PDFChatAnnotator: A Human-LLM Collaborative Multi-Modal Data Annotation Tool for PDF-Format Catalogs | 2024/04 | - | Decentralized | Conversation |
| To Help or Not to Help: LLM-based Attentive Support for Human-Robot Group Interactions | 2024/10 | Link | Decentralized | Observation |
| Improved Trust in Human-Robot Collaboration With ChatGPT | 2023/06 | - | Decentralized | Conversation |
| Challenges in Human-Agent Communication | 2024/11 | - | Decentralized | Conversation |
| Towards Modeling Human-Agentic Collaborative Workflows: A BPMN Extension | 2024/12 | Link | Decentralized | Conversation |
| Into the Unknown Unknowns: Engaged Human Learning through Participation in Language Model Agent Conversations | 2024/08 | Link | Decentralized | Conversation |
| Enhancing Human-Robot Collaborative Assembly in Manufacturing Systems Using Large Language Models | 2024/06 | - | Decentralized | Conversation |
| A2C: A Modular Multi-stage Collaborative Decision Framework for Human-AI Teams | 2024/01 | Link | Decentralized | Conversation |
| MindAgent: Emergent Gaming Interaction | 2023/09 | Link | Centralized | Conversation |
| Ask-before-Plan: Proactive Language Agents for Real-World Planning | 2024/01 | Link | Centralized | Conversation |
| SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents | 2023/10 | - | Centralized | Conversation |
| PaLM-E: An Embodied Multimodal Language Model | 2023/04 | Link | Decentralized | Conversation |
| Embodied Task Planning with Large Language Models | 2023/07 | Link | Decentralized | Conversation |
| MetaGPT: Meta Programming for a Multi-Agent Collaborative Framework | 2023/08 | Link | Decentralized | Message Pool |
| DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning | 2024/06 | Link | Decentralized | Conversation |
| WebLINX: Real-World Website Navigation with Multi-Turn Dialogue | 2024/02 | Link | Decentralized | Conversation |
| Autonomous Evaluation and Refinement of Digital Agents | 2024/04 | Link | Decentralized | Conversation |
| WebCanvas: Benchmarking Web Agents in Online Environments | 2024/06 | Link | Decentralized | Conversation |
| MineWorld: A Real-Time and Open-Source Interactive World Model on Minecraft | 2025/04 | Link | Decentralized | Conversation |

📚 Applications, Datasets & Benchmarks

| Domain | Datasets & Benchmarks | Proposed or Used by | Data Link |
|---|---|---|---|
| Embodied AI | TaPA | Wu et al., 2023 | Link |
| Embodied AI | EmboInteract | Sun et al., 2024b | - |
| Embodied AI | AssistantX | Sun et al., 2024 | - |
| Embodied AI | IGLU Multi-Turn | Mehta et al., 2024 | Link |
| Embodied AI | PARTNR | Chang et al., 2024 | Link |
| Embodied AI | MINT | Wang et al., 2024 | Link |
| Embodied AI | C-WAH | Seo et al., 2025 | Link |
| Conversational Systems | WEBLINX | Lu et al., 2024 | - |
| Conversational Systems | Ask-before-Plan | Zhang et al., 2024 | Link |
| Conversational Systems | Agency Dialogue | Sharma et al., 2024 | - |
| Conversational Systems | WildSeek | Jiang et al., 2024 | Link |
| Conversational Systems | MINT | Wang et al., 2024 | Link |
| Conversational Systems | HOTPOTQA | Feng et al., 2024 | Link |
| Conversational Systems | StrategyQA | Feng et al., 2024 | Link |
| Software Development | MINT | Wang et al., 2024 | Link |
| Software Development | InterCode | Feng et al., 2024 | Link |
| Software Development | ColBench | Zhou et al., 2025 | Link |
| Software Development | ConvCodeWorld | Han et al., 2025 | Link |
| Software Development | ConvCodeBench | Han et al., 2025 | Link |
| Gaming | CuisineWorld | Gong et al., 2023 | Link |
| Gaming | MineWorld | Guo et al., 2025 | Link |
| Finance | FinArena-Low-Cost | Xu et al., 2025 | Link |

📌 Contributing

Contributions are welcome! If you have relevant papers, code, or insights, feel free to submit a request 🤗.

📝 Citation

If you find our survey helpful, please consider citing our work 💕:

@misc{zou2025surveylargelanguagemodel,
      title={A Survey on Large Language Model based Human-Agent Systems}, 
      author={Henry Peng Zou and Wei-Chieh Huang and Yaozu Wu and Yankai Chen and Chunyu Miao and Hoang Nguyen and Yue Zhou and Weizhi Zhang and Liancheng Fang and Langzhou He and Yangning Li and Yuwei Cao and Dongyuan Li and Renhe Jiang and Philip S. Yu},
      year={2025},
      eprint={2505.00753},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.00753}, 
}
