Agent Design Pattern Catalogue

Figure 1. Ecosystem of FM-based agent systems annotated with architectural patterns in gray boxes.

Motivation

As the technical backbone of highly disruptive generative artificial intelligence (GenAI) technologies, foundation models (FMs) have received a vast amount of attention from academia and industry [1]. In particular, the emergence of large language models (LLMs), with their remarkable capabilities to understand and generate human-like reasoning and content, has sparked the growth of a diverse range of downstream tasks built on language models. Subsequently, there is rapidly growing interest in the development of FM-based autonomous agents, e.g., AutoGPT and BabyAGI, which can take a proactive, autonomous role to pursue users’ goals. Such a goal may be broadly stated by a human, requiring the agents to derive their autonomy from the capabilities of FMs: they must decompose the goal into a set of executable tasks and orchestrate task execution to fulfill the goal. During the reasoning process, humans can also provide feedback on instrumental goals, revise a multi-step plan derived by the agent, correct intermediate results, or even refine a plan/goal during execution based on early outcomes.

While considerable effort has been put into this emerging field, practitioners still face a steep learning curve when building and implementing FM-based agents. We noticed a series of reusable solutions that can be grouped into patterns to address the diverse challenges in designing FM-based agents; however, the architecture design of such agents and a collection of their architectural patterns have not been systematically explored and formulated. Furthermore, the design of systems that integrate agents is non-trivial and complex, especially the selection of appropriate design decisions to fulfill different software quality requirements and design constraints. Multi-agent systems may require additional considerations regarding the coordination and interactions of agents, for instance, collusion between agents and correlated failures [2]. We list several challenges in developing and implementing FM-based agents as follows:

  • Agents often struggle to fully comprehend and execute complex tasks, which can lead to inaccurate responses. This challenge may be intensified by the inherent reasoning uncertainties during plan generation and action execution. For instance, in a long-term plan the steps may depend on one another, so even slight deviations in a few steps can significantly impact the overall success rate.
  • Agents should not bear all the blame for inaccurate responses, since users may provide limited context, ambiguous goals, or unclear instructions when interacting with agents, which results in underspecification [3, 4] in the agents’ reasoning process and response generation.
  • The sophisticated internal architecture of agents and foundation models results in limited explainability, making them “black boxes” to stakeholders. Consequently, agents often struggle to explain their reasoning steps, which can affect the reliability, robustness, and overall trustworthiness of agent systems.
  • The accountability process is complicated by the interactions between various stakeholders, FM-based agents, non-agent AI models, and non-AI software applications within the overall ecosystem. Highly autonomous agents may delegate tasks to, or even create, other agents or tools. In such circumstances, responsibility and accountability may be intertwined among multiple entities.

In this regard, we present a catalogue of patterns for the design of foundation-model-based agents in this paper. Please note that “agent” can refer to either i) AI acting on behalf of another entity, or ii) AI that can take active roles or produce effects to achieve users’ goals. The former requires thorough analysis from a governance perspective; in this study, we focus on the second concept of “agents”, which are capable of goal-seeking and plan generation. In software engineering, an architectural pattern is a reusable solution to a problem that occurs commonly within a given context in software design. Our pattern catalogue includes 18 patterns identified based on the study conducted by Lu et al. [5]. The intended audience of the collected patterns is software architects and developers interested in FM-based agent design and implementation.

Agent Design Pattern Catalogue

Fig. 1 illustrates the ecosystem of foundation-model-based agents; the agent components and the interactions between different entities are annotated with the relevant patterns. When users interact with the agent, passive goal creator and proactive goal creator help comprehend users’ intentions and environmental information and formalise the eventual goals in context engineering, while prompt/response optimiser refines the prompts or instructions sent to other agents/tools based on predefined templates for certain format or content requirements. Given users’ input, the agent fetches additional context information from the knowledge base via retrieval augmented generation. It then constructs plans that decompose the ultimate goals into actionable tasks through single-path plan generator and multi-path plan generator. In this process, one-shot model querying and incremental model querying may be carried out.
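To make the plan-generation patterns concrete, the following is a minimal sketch of single-path and multi-path plan generation. All names here (`query_model`, `single_path_plan`, `multi_path_plan`) are hypothetical illustrations, not part of any framework; a real agent would replace `query_model` with an actual foundation-model call.

```python
def query_model(prompt: str) -> str:
    """Hypothetical stand-in for a foundation-model call."""
    return f"step derived from: {prompt}"

def single_path_plan(goal: str, n_steps: int = 3) -> list[str]:
    """Single-path plan generator: decompose a goal into one linear
    chain of tasks, each step conditioning the next (incremental
    model querying)."""
    steps: list[str] = []
    context = goal
    for i in range(n_steps):
        step = query_model(f"Given '{context}', what is step {i + 1}?")
        steps.append(step)
        context = step  # the next query builds on this step
    return steps

def multi_path_plan(goal: str, n_paths: int = 2) -> list[list[str]]:
    """Multi-path plan generator: produce several candidate plans so
    the agent (or the user) can select among alternatives."""
    return [single_path_plan(f"{goal} (variant {p})") for p in range(n_paths)]
```

The design choice between the two patterns trades cost for robustness: multi-path generation issues more model queries but gives the reflection stage alternative plans to compare.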

A generated plan should be reviewed to ensure its accuracy, usability, completeness, etc. Self-reflection, cross-reflection, and human reflection help the agent collect feedback from different reflective entities and refine the plan and reasoning steps accordingly. Afterwards, the agent can assign tasks to other narrow-AI-based or non-AI systems, invoke external tools, and employ a set of agents for goal achievement via tool/agent registry. In particular, agents can work on the same task and finalise the results through voting-based, role-based, or debate-based cooperation. For instance, agents can act in different roles such as coordinator and worker. Agent adapter continuously learns the interfaces of different tools and converts them into an FM-friendly environment. Multimodal guardrails can be applied to manage and control the inputs/outputs of foundation models. Please note that, for the simplicity of the diagram, we omit the detailed architecture of agent-as-a-worker and the application of patterns in several interactions; for instance, the invocation of non-agent-AI/non-AI systems and external tool services can all apply tool/agent registry, and each agent-as-a-worker can have its own datastore or knowledge base. Finally, developers can evaluate the performance of agents at both design time and runtime via agent evaluator.
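The review loop described above can be sketched as a simple self-reflection cycle: the agent critiques its own plan and revises it until no issues remain or a retry budget is exhausted. The helpers below (`critique`, `revise`, `self_reflect`) are hypothetical; in a real system both would be backed by further foundation-model queries (or, for cross-reflection, by a different model), and the toy `"TODO"` heuristic merely stands in for a model-produced critique.

```python
def critique(plan: list[str]) -> list[str]:
    """Return the plan steps judged problematic (empty list = acceptable).
    Toy heuristic for illustration; a real agent would query a model."""
    return [s for s in plan if "TODO" in s]

def revise(plan: list[str], issues: list[str]) -> list[str]:
    """Rewrite only the flagged steps, keeping the rest of the plan."""
    return [s.replace("TODO", "resolved") if s in issues else s for s in plan]

def self_reflect(plan: list[str], max_rounds: int = 3) -> list[str]:
    """Self-reflection pattern: critique-and-revise until the critique
    is clean or the retry budget runs out."""
    for _ in range(max_rounds):
        issues = critique(plan)
        if not issues:
            break
        plan = revise(plan, issues)
    return plan
```

Cross-reflection and human reflection fit the same loop shape, with `critique` delegated to another agent or to a human reviewer instead of the agent itself.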

References

[1] R. Bommasani, D. A. Hudson, E. Adeli, R. Altman, S. Arora, S. von Arx, M. S. Bernstein, J. Bohg, A. Bosselut, E. Brunskill et al., “On the opportunities and risks of foundation models,” arXiv preprint arXiv:2108.07258, 2021.

[2] U. Anwar, A. Saparov, J. Rando, D. Paleka, M. Turpin, P. Hase, E. S. Lubana, E. Jenner, S. Casper, O. Sourbut et al., “Foundational challenges in assuring alignment and safety of large language models,” arXiv preprint arXiv:2404.09932, 2024.

[3] A. Chan, R. Salganik, A. Markelius, C. Pang, N. Rajkumar, D. Krasheninnikov, L. Langosco, Z. He, Y. Duan, M. Carroll, M. Lin, A. Mayhew, K. Collins, M. Molamohammadi, J. Burden, W. Zhao, S. Rismani, K. Voudouris, U. Bhatt, A. Weller, D. Krueger, and T. Maharaj, “Harms from increasingly agentic algorithmic systems,” in Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, ser. FAccT ’23. New York, NY, USA: Association for Computing Machinery, 2023, p. 651–666. [Online]. Available: https://doi.org/10.1145/3593013.3594033

[4] A. D’Amour, K. Heller, D. Moldovan, B. Adlam, B. Alipanahi, A. Beutel, C. Chen, J. Deaton, J. Eisenstein, M. D. Hoffman, F. Hormozdiari, N. Houlsby, S. Hou, G. Jerfel, A. Karthikesalingam, M. Lucic, Y. Ma, C. McLean, D. Mincu, A. Mitani, A. Montanari, Z. Nado, V. Natarajan, C. Nielson, T. F. Osborne, R. Raman, K. Ramasamy, R. Sayres, J. Schrouff, M. Seneviratne, S. Sequeira, H. Suresh, V. Veitch, M. Vladymyrov, X. Wang, K. Webster, S. Yadlowsky, T. Yun, X. Zhai, and D. Sculley, “Underspecification presents challenges for credibility in modern machine learning,” Journal of Machine Learning Research, vol. 23, no. 226, pp. 1–61, 2022. [Online]. Available: http://jmlr.org/papers/v23/20-1335.html

[5] Q. Lu, L. Zhu, X. Xu, Z. Xing, S. Harrer, and J. Whittle, “Towards responsible generative ai: A reference architecture for designing foundation model based agents,” ICSA’23, 2023.

Contact

Qinghua Lu: qinghua.lu@data61.csiro.au


Copyright (c) 2024 Commonwealth Scientific and Industrial Research Organisation (CSIRO) ABN 41 687 119 230.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International.