Agent Adapter

Summary: An agent adapter provides an interface that connects the agent to external tools for task completion.

Context: An agent may leverage external tools to extend its capabilities and complete certain tasks.

Problem: The agent needs to deal with the different interfaces of diverse tools, and some of these interfaces may be incompatible with the agent or inefficient for it to interact with. How can the agent assign tasks to external tools and process the results?

Forces:

Interoperability. Certain tasks require external tools to complete, and the tools may need agents to process particular information during intermediate steps.
Adaptability. Agents may employ new tools considering task complexity, tool capability, cost, etc.
Overhead. Manual development of compatible interfaces for agents and external tools can be labour-intensive and inefficient.

Solution: Fig. 1 shows a simplified graphical representation of the agent adapter. Given a user’s instructions, the agent generates a plan consisting of a set of tasks to achieve the user’s goals, and it may employ diverse external tools to complete different tasks. However, each tool has its own interface, which may sit at a different abstraction level from the one the agent works at, or impose specific format requirements. The agent adapter helps invoke and manage these interfaces by converting agent messages into the required format or content, and vice versa. In particular, the adapter can retrieve a tool’s manual or tutorial from a datastore to learn the available interfaces. It then transforms the agent’s outputs according to the interface requirements and invokes the service [68]. Note that a fine-grained interface description helps the agent understand the tool and hence improves performance. The adapter also receives execution results from the tools and sends them to the underlying foundation model for further analysis (e.g. task assignment to other tools, self-reflection on tool employment). For instance, the adapter can translate tasks into system messages when interacting with a local file system, or capture and operate a graphical user interface when playing a video game.
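The conversion loop described above can be illustrated with a minimal sketch. It is not taken from any particular framework: the names ToolSpec, AgentAdapter, and repeat_text, as well as the message layout (a dict with "tool" and "arguments" keys), are assumptions made for illustration. The sketch shows an adapter that looks up an interface description, coerces the agent's arguments into the types the tool requires, invokes the tool, and wraps the result in a message for the foundation model.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class ToolSpec:
    """Hypothetical interface description, e.g. derived from a tool manual."""
    name: str
    description: str
    params: Dict[str, type]      # parameter name -> type the tool requires
    call: Callable[..., Any]     # the tool's native entry point


class AgentAdapter:
    """Converts agent messages into tool calls and tool results into messages."""

    def __init__(self) -> None:
        self._specs: Dict[str, ToolSpec] = {}

    def register(self, spec: ToolSpec) -> None:
        self._specs[spec.name] = spec

    def invoke(self, agent_message: Dict[str, Any]) -> Dict[str, Any]:
        tool_name = agent_message.get("tool")
        spec = self._specs.get(tool_name)
        if spec is None:
            return {"role": "tool", "name": tool_name, "status": "error",
                    "content": f"unknown tool: {tool_name}"}
        raw_args = agent_message.get("arguments", {})
        try:
            # Agent output is often plain text, so cast each argument
            # to the type declared in the interface description.
            args = {key: cast(raw_args[key]) for key, cast in spec.params.items()}
            result = spec.call(**args)
        except (KeyError, TypeError, ValueError) as exc:
            # Report the failure so the model can reflect and retry.
            return {"role": "tool", "name": spec.name, "status": "error",
                    "content": f"{type(exc).__name__}: {exc}"}
        return {"role": "tool", "name": spec.name, "status": "ok",
                "content": str(result)}


def repeat_text(text: str, times: int) -> str:
    """Toy stand-in for an external tool."""
    return text * times


adapter = AgentAdapter()
adapter.register(ToolSpec(
    name="repeat_text",
    description="Repeat a string a number of times.",
    params={"text": str, "times": int},
    call=repeat_text,
))

# The agent emits "3" as a string; the adapter converts it to an int.
reply = adapter.invoke(
    {"tool": "repeat_text", "arguments": {"text": "ab", "times": "3"}}
)
print(reply["content"])  # prints "ababab"
```

Errors are returned as messages rather than raised, on the assumption that the foundation model should see them and decide whether to retry, correct the arguments, or assign the task to another tool.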

Benefits:

Interoperability. The agent adapter facilitates interoperation between an agent and external tools.
Adaptability. Agents can employ new tools via the agent adapter, which can acquire and convert a tool’s API from the corresponding manual or tutorial.
Reduced development cost. Because the agent adapter converts interfaces autonomously, compatible interfaces do not need to be developed manually for each tool, which reduces the development cost.

Drawbacks:

Maintenance overhead. i) The agent adapter itself requires proper maintenance and evaluation to ensure the correctness of its outputs. ii) The agent adapter may need additional memory or an external data store to record historical tool interfaces.
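As a rough illustration of the second point, the sketch below keeps a per-tool history of interface versions in memory. The class name InterfaceHistory, its methods, and the choice to represent an interface as a tuple of parameter names are all assumptions made for this example; a real adapter might persist richer descriptions in an external store.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple


class InterfaceHistory:
    """Hypothetical in-memory record of the interfaces seen for each tool."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Tuple[str, ...]]] = defaultdict(list)

    def record(self, tool: str, params: Tuple[str, ...]) -> bool:
        """Store params if they differ from the latest version; return True if stored."""
        versions = self._versions[tool]
        if versions and versions[-1] == params:
            return False
        versions.append(params)
        return True

    def latest(self, tool: str) -> Optional[Tuple[str, ...]]:
        versions = self._versions.get(tool)
        return versions[-1] if versions else None

    def removed_params(self, tool: str) -> Set[str]:
        """Parameters present in the previous version but absent from the latest."""
        versions = self._versions.get(tool, [])
        if len(versions) < 2:
            return set()
        return set(versions[-2]) - set(versions[-1])


history = InterfaceHistory()
history.record("search", ("query", "limit"))
history.record("search", ("query", "limit"))   # unchanged, not stored again
history.record("search", ("query",))           # the tool dropped "limit"
print(history.removed_params("search"))        # prints {'limit'}
```

Comparing consecutive versions in this way is one possible input to the evaluation mentioned in the first point, since a removed parameter signals that previously valid conversions may now fail.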

Known uses:

AutoGen. Users can register different tools with the agent, specifying a usage description for each. The agent then leverages the registered tools during a conversation with the user.
Apple Intelligence. Apple Intelligence can support writing, image generation, and schedule management across different products and applications. For instance, it can capture the entities in users’ photo libraries and create emoji.
Semantic Kernel. Semantic Kernel can orchestrate agents and plugins to extend agents’ skills. Plugins need to provide a semantic description (e.g. input, output, side effects) for agents to understand.
SWE-agent. Yang et al. [1] devise SWE-agent, which provides agent-computer interfaces that enable foundation model-based agents to process code commands and resolve software engineering tasks.

Related patterns:

Prompt/response optimiser. The prompt/response optimiser can improve users’ inputs, and the optimised prompts can be sent to other agents for goal achievement, whereas the agent adapter focuses on the utilisation of external tools.
Tool/agent registry. The tool/agent registry records the available external tools, while the agent adapter converts the interfaces of selected tools into an agent-friendly format.
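The division of labour between a registry and an adapter can be sketched as follows. This is an illustrative example only: the registry is modelled as a plain dict, and describe_tool and read_file are hypothetical names. The adapter side uses Python's standard inspect module to turn a registered callable's signature and docstring into a plain-text description that could be placed in a prompt.

```python
import inspect
from typing import Callable, Dict


def describe_tool(name: str, func: Callable) -> str:
    """Render a callable's signature and docstring as text for a prompt."""
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or "No description available."
    return f"{name}{signature}: {doc}"


def read_file(path: str, max_bytes: int = 1024) -> str:
    """Return up to max_bytes of the file at path."""
    with open(path, "rb") as handle:
        return handle.read(max_bytes).decode("utf-8", errors="replace")


# Registry role: it only records which tools are available.
registry: Dict[str, Callable] = {"read_file": read_file}

# Adapter role: convert the selected tool's interface for the agent.
prompt_line = describe_tool("read_file", registry["read_file"])
print(prompt_line)
```

Deriving the description from the callable itself keeps it in step with the tool's actual interface, at the cost of relying on the tool author to write a useful docstring.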

References:

[1] J. Yang, C. E. Jimenez, A. Wettig, K. Lieret, S. Yao, K. Narasimhan, and O. Press, “SWE-agent: Agent-computer interfaces enable automated software engineering,” arXiv preprint arXiv:2405.15793, 2024.