Agent Design Pattern Catalogue
Motivation
As the technical backbone of highly disruptive generative artificial intelligence (GenAI) technologies, foundation models (FMs) have received a vast amount of attention from academia and industry [1]. In particular, the emergence of large language models (LLMs), with their remarkable capabilities to understand and generate human-like reasoning and content, has sparked the growth of a diverse range of downstream tasks using language models. Consequently, there is rapidly growing interest in the development of FM-based autonomous agents, e.g., AutoGPT and BabyAGI, which can take a proactive, autonomous role in pursuing users’ goals. A goal given by a human may be broad, so agents derive their autonomy from the capabilities of FMs, which enable them to decompose the goal into a set of executable tasks and orchestrate task execution to fulfill the goal. During the reasoning process, humans can also provide feedback on instrumental goals, revise a multi-step plan derived by the agent, correct intermediate results, or even refine a plan/goal during execution based on early outcomes.
While huge efforts have been put into this emerging field, practitioners face a steep learning curve when building and implementing FM-based agents. We noticed that a series of reusable solutions can be grouped into patterns to address the diverse challenges in designing FM-based agents. However, the architecture design and the collection of architectural patterns for such agents have not been systematically explored and formulated. Furthermore, designing systems that integrate agents is complex, especially when selecting appropriate design decisions to fulfill different software quality requirements and design constraints. In addition, multi-agent systems may require further consideration of the coordination and interactions of agents, for instance, collusion between agents and correlated failures [2]. We list several challenges in developing and implementing FM-based agents as follows:
- Agents often struggle to fully comprehend and execute complex tasks, which can lead to inaccurate responses. This challenge may be intensified by the inherent reasoning uncertainties during plan generation and action procedures. For instance, in long-term planning, the steps may depend on each other, so even a slight deviation in a few steps can significantly impact the overall success rate.
- Agents should not be entirely blamed for inaccurate responses, since users may provide limited context, ambiguous goals, or unclear instructions when interacting with agents, which results in underspecification [3, 4] in the agents’ reasoning process and response generation.
- The sophisticated internal architecture of agents and foundation models results in limited explainability, making them “black boxes” to stakeholders. Consequently, agents often struggle to interpret their reasoning steps, which can affect the reliability, robustness, and overall trustworthiness of agent systems.
- The accountability process is complicated due to the interactions between various stakeholders, FM-based agents, non-agent AI models, and non-AI software applications within the overall ecosystem. Highly autonomous agents may delegate or even create other agents or tools for certain tasks. In this circumstance, responsibility and accountability may be intertwined among multiple entities.
In this regard, we present a catalogue of patterns for the design of foundation model-based agents in this paper. Please note that “agent” can refer to i) AI acting on behalf of another entity, or ii) AI that can take an active role or produce effects to achieve users’ goals. The former requires thorough analysis from a governance perspective, whereas this study focuses on the second concept: “agents” that are capable of goal-seeking and plan generation. In software engineering, an architectural pattern is a reusable solution to a problem that occurs commonly within a given context in software design. Our pattern catalogue includes 18 patterns that were identified based on the study conducted by Lu et al. [5]. The intended audience of the collected patterns is software architects and developers who are interested in FM-based agent design and implementation.
Agent Design Pattern Catalogue
Fig. 1 illustrates the ecosystem of foundation model-based agents; the agent components and the interactions between different entities are annotated with the relevant patterns. When users interact with the agent, passive goal creator and proactive goal creator can help comprehend users’ intentions and environmental information and formalise the eventual goals in context engineering, while prompt/response optimiser refines the prompts or instructions sent to other agents/tools based on predefined templates for certain format or content requirements. Given users’ input, the agent fetches additional context information from the knowledge base via retrieval augmented generation. Then, it constructs plans that decompose the ultimate goals into actionable tasks through single-path plan generator or multi-path plan generator. In this process, one-shot model querying and incremental model querying may be carried out.
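The flow described above (goal creation, retrieval, then plan generation) can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the catalogue's reference implementation: the `Agent` class, its method names, the word-overlap retrieval, and `stub_model` are all hypothetical, and a real agent would call an actual foundation model and a proper retriever.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for a foundation model call: prompt text in, reply text out.
Model = Callable[[str], str]


@dataclass
class Agent:
    model: Model
    knowledge_base: List[str] = field(default_factory=list)

    def create_goal(self, user_input: str) -> str:
        # Passive goal creator (simplified): take the goal from the user's explicit input.
        return user_input.strip()

    def retrieve(self, goal: str, top_k: int = 2) -> List[str]:
        # Toy retrieval step: rank documents by the number of words shared with the goal.
        goal_words = set(goal.lower().split())
        scored = [
            (len(goal_words & set(doc.lower().split())), doc)
            for doc in self.knowledge_base
        ]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [doc for score, doc in ranked[:top_k] if score > 0]

    def plan(self, goal: str) -> List[str]:
        # Single-path plan generator using one-shot model querying:
        # one prompt, one reply, parsed into an ordered list of steps.
        context = "\n".join(self.retrieve(goal))
        prompt = f"Goal: {goal}\nContext:\n{context}\nList the steps, one per line."
        reply = self.model(prompt)
        return [line.strip() for line in reply.splitlines() if line.strip()]


def stub_model(prompt: str) -> str:
    # Assumption: a canned reply replaces a real foundation model for this sketch.
    return "1. search flights\n2. compare prices\n3. book ticket"


if __name__ == "__main__":
    agent = Agent(
        model=stub_model,
        knowledge_base=[
            "flights to Sydney depart daily",
            "cooking pasta takes ten minutes",
        ],
    )
    goal = agent.create_goal("  book flights to Sydney ")
    for step in agent.plan(goal):
        print(step)
```

A multi-path plan generator would differ mainly in `plan`, which would return several candidate next steps at each point instead of a single list, and incremental model querying would replace the single model call with one call per step.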
A generated plan should be reviewed to ensure its accuracy, usability, completeness, etc. Self-reflection, cross-reflection, and human reflection can help the agent collect feedback from different reflective entities and refine the plan and reasoning steps accordingly. Afterwards, the agent can assign tasks to other narrow AI-based or non-AI systems, invoke external tools, and employ a set of agents for goal achievement via tool/agent registry. In particular, agents can work on the same task and finalise the results with voting-based, role-based, or debate-based cooperation. For instance, agents can act in different roles such as coordinator and worker. Agent adapter keeps learning the interfaces of different tools and converts them into an FM-friendly environment. Multimodal guardrails can be applied to manage and control the inputs/outputs of foundation models. Please note that, to keep the diagram simple, we omit the detailed architecture of agent-as-a-worker and the application of patterns in several interactions. For instance, the invocation of non-agent AI/non-AI systems and external tools can apply tool/agent registry, and each agent-as-a-worker can have its own datastore or knowledge base. Finally, developers can evaluate the performance of agents at both design-time and runtime via agent evaluator.
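As a rough illustration of two of the ideas in this overview, the sketch below shows a generic reflection loop and a simple majority vote in Python. The function names `refine` and `vote`, the strict-majority rule, and the example critic and reviser are assumptions made for this sketch; the catalogue itself does not prescribe these interfaces.

```python
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical interfaces: a critic returns feedback text, or None when satisfied;
# a reviser returns a new plan that takes the feedback into account.
Critic = Callable[[List[str]], Optional[str]]
Reviser = Callable[[List[str], str], List[str]]


def refine(plan: List[str], critic: Critic, reviser: Reviser, max_rounds: int = 3) -> List[str]:
    """Reflection loop: collect feedback and revise until the critic has none.

    The critic may be the agent itself (self-reflection), another agent
    (cross-reflection), or a person (human reflection).
    """
    for _ in range(max_rounds):
        feedback = critic(plan)
        if feedback is None:
            break
        plan = reviser(plan, feedback)
    return plan


def vote(answers: List[str]) -> Optional[str]:
    """Voting-based cooperation: return the answer held by a strict majority, else None."""
    if not answers:
        return None
    winner, count = Counter(answers).most_common(1)[0]
    return winner if count > len(answers) / 2 else None


def example_critic(plan: List[str]) -> Optional[str]:
    # Assumed review rule for this sketch: every plan must end with a verification step.
    return None if "verify result" in plan else "missing verification step"


def example_reviser(plan: List[str], feedback: str) -> List[str]:
    # Assumed revision rule: append the step the critic asked for.
    return plan + ["verify result"]


if __name__ == "__main__":
    print(refine(["do task"], example_critic, example_reviser))
    print(vote(["approve", "reject", "approve"]))
```

Returning `None` when no strict majority exists leaves room for a fallback, such as escalating to debate-based cooperation or to a human reviewer; other voting rules (plurality, weighted votes) would be equally valid choices.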
- Passive Goal Creator
- Proactive Goal Creator
- Prompt/Response Optimiser
- Retrieval Augmented Generation
- One-Shot Model Querying
- Incremental Model Querying
- Single-Path Plan Generator
- Multi-Path Plan Generator
- Self-Reflection
- Cross-Reflection
- Human Reflection
- Voting-based Cooperation
- Role-based Cooperation
- Debate-based Cooperation
- Multimodal Guardrails
- Tool/Agent Registry
- Agent Adapter
- Agent Evaluator
References
[1] R. Bommasani, D. A. Hudson, E. Adeli, R. Altman, S. Arora, S. von Arx, M. S. Bernstein, J. Bohg, A. Bosselut, E. Brunskill et al., “On the opportunities and risks of foundation models,” arXiv preprint arXiv:2108.07258, 2021.
[2] U. Anwar, A. Saparov, J. Rando, D. Paleka, M. Turpin, P. Hase, E. S. Lubana, E. Jenner, S. Casper, O. Sourbut et al., “Foundational challenges in assuring alignment and safety of large language models,” arXiv preprint arXiv:2404.09932, 2024.
[3] A. Chan, R. Salganik, A. Markelius, C. Pang, N. Rajkumar, D. Krasheninnikov, L. Langosco, Z. He, Y. Duan, M. Carroll, M. Lin, A. Mayhew, K. Collins, M. Molamohammadi, J. Burden, W. Zhao, S. Rismani, K. Voudouris, U. Bhatt, A. Weller, D. Krueger, and T. Maharaj, “Harms from increasingly agentic algorithmic systems,” in Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, ser. FAccT ’23. New York, NY, USA: Association for Computing Machinery, 2023, pp. 651–666. [Online]. Available: https://doi.org/10.1145/3593013.3594033
[4] A. D’Amour, K. Heller, D. Moldovan, B. Adlam, B. Alipanahi, A. Beutel, C. Chen, J. Deaton, J. Eisenstein, M. D. Hoffman, F. Hormozdiari, N. Houlsby, S. Hou, G. Jerfel, A. Karthikesalingam, M. Lucic, Y. Ma, C. McLean, D. Mincu, A. Mitani, A. Montanari, Z. Nado, V. Natarajan, C. Nielson, T. F. Osborne, R. Raman, K. Ramasamy, R. Sayres, J. Schrouff, M. Seneviratne, S. Sequeira, H. Suresh, V. Veitch, M. Vladymyrov, X. Wang, K. Webster, S. Yadlowsky, T. Yun, X. Zhai, and D. Sculley, “Underspecification presents challenges for credibility in modern machine learning,” Journal of Machine Learning Research, vol. 23, no. 226, pp. 1–61, 2022. [Online]. Available: http://jmlr.org/papers/v23/20-1335.html
[5] Q. Lu, L. Zhu, X. Xu, Z. Xing, S. Harrer, and J. Whittle, “Towards responsible generative AI: A reference architecture for designing foundation model based agents,” ICSA’23, 2023.
Related Projects
Our Papers
- Responsible AI Pattern Catalogue and Question Bank:
- Qinghua Lu, Liming Zhu, Xiwei Xu, Jon Whittle, Didar Zowghi, Aurelie Jacquet. Responsible AI Pattern Catalogue: A Multivocal Literature Review. ACM Computing Surveys, 2023.
- Qinghua Lu, Liming Zhu, Xiwei Xu, Jon Whittle. Responsible-AI-by-Design: a Pattern Collection for Designing Responsible AI Systems. IEEE Software, 2023.
- Qinghua Lu, Yuxiu Luo, Liming Zhu, Mingjian Tang, Xiwei Xu, Jon Whittle. Developing Responsible Chatbots for Financial Services: A Pattern-Oriented Responsible AI Engineering Approach. IEEE Intelligent Systems, 2023.
- Boming Xia, Tingting Bi, Zhenchang Xing, Qinghua Lu, Liming Zhu. An Empirical Study on Software Bill of Materials: Where We Stand and the Road Ahead. 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE’2023).
- Boming Xia, Qinghua Lu, Harsha Perera, Liming Zhu, Zhenchang Xing, Yue Liu, Jon Whittle. Towards Concrete and Connected AI Risk Assessment (C2AIRA): A Systematic Mapping Study. 2023 ACM/IEEE 2nd International Conference on AI Engineering (CAIN’2023).
- Sung Une Lee, Harsha Perera, Boming Xia, Yue Liu, Qinghua Lu, Liming Zhu, Olivier Salvado, Jon Whittle. QB4AIRA: A Question Bank for AI Risk Assessment. arXiv preprint arXiv:2305.09300, 2023.
- Boming Xia, Qinghua Lu, Liming Zhu, Zhenchang Xing. Towards AI Safety: A Taxonomy for AI System Evaluation. arXiv preprint arXiv:2404.05388, 2024.
- Software Engineering for Responsible AI:
- Qinghua Lu, Liming Zhu, Xiwei Xu, Jon Whittle, Zhenchang Xing, Towards a Roadmap on Software Engineering for Responsible AI. 2022 ACM/IEEE 1st International Conference on AI Engineering (CAIN’2022). ACM SIGSOFT Distinguished Paper Award.
- Qinghua Lu, Liming Zhu, Xiwei Xu, Jon Whittle, David Douglas, Conrad Sanderson. Software engineering for responsible AI: An empirical study and operationalised patterns. 2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP).
- Architecture Design for Foundation Models based AI Systems:
- Qinghua Lu, Liming Zhu, Xiwei Xu, Zhenchang Xing, Jon Whittle. Towards Responsible AI in the Era of Generative AI: A Reference Architecture for Designing Foundation Model based Systems. arXiv preprint arXiv:2304.11090, 2023.
- Qinghua Lu, Liming Zhu, Xiwei Xu, Zhenchang Xing, Jon Whittle. A Taxonomy of Foundation Model based Systems through the Lens of Software Architecture. arXiv preprint arXiv:2305.05352, 2023.
- Qinghua Lu, Liming Zhu, Xiwei Xu, Zhenchang Xing, Stefan Harrer, Jon Whittle. Towards Responsible Generative AI: A Reference Architecture for Designing Foundation Model based Agents. arXiv preprint arXiv:2311.13148, 2024.
- Yue Liu, Sin Kit Lo, Qinghua Lu, Liming Zhu, Dehai Zhao, Xiwei Xu, Stefan Harrer, Jon Whittle. Agent Design Pattern Catalogue: A Collection of Architectural Patterns for Foundation Model based Agents. arXiv preprint arXiv:2405.10467, 2024.
Contact
Qinghua Lu: qinghua.lu@data61.csiro.au
Copyright (c) 2024 Commonwealth Scientific and Industrial Research Organisation (CSIRO) ABN 41 687 119 230.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License.