Cross-Reflection

Summary: Cross-reflection uses different agents or foundation models to provide feedback on, and refine, the generated plan and its corresponding reasoning process.

Context: An agent generates a plan to achieve the user's goals, and the quality of that plan needs to be assessed.

Problem: When an agent has limited capability and cannot reflect on its own output with satisfactory performance, how can the output and reasoning steps of this agent be evaluated?

Forces:

  • Reasoning uncertainty. The inconsistencies and errors in the agent’s reasoning process may reduce response accuracy and affect the overall trustworthiness.
  • Lack of explainability. Limited transparency and explainability about how the plan is generated can undermine the trustworthiness of the agent.
  • Limited capability. An agent may not be able to perform reflection well due to its limited capability and the complexity of self-reflection.

Solution: Fig. 1 gives a high-level graphical representation of cross-reflection. If an agent cannot generate accurate results or precise planning steps by reflecting on its own outputs, users can prompt the agent to query another agent that specialises in reflection. This reflective agent reviews and evaluates the outputs and relevant reasoning steps of the original agent, and provides refinement suggestions. The process can be repeated until the reflective agent confirms the plan. In addition, multiple agents can be queried for reflection to produce more comprehensive feedback.

Figure 1. Plan reflection pattern.
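
The iterative loop described in the Solution can be sketched as follows. This is a minimal illustration under stated assumptions, not an implementation from any of the cited systems: the `Review` and `Outcome` types, the `cross_reflect` function, the `max_rounds` cap, and the toy planner and reflector are all hypothetical names introduced here. In practice, the planner and each reflector would wrap calls to separate agents or foundation models.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Review:
    approved: bool          # True when the reflective agent confirms the plan
    suggestions: List[str]  # refinement suggestions when it does not


@dataclass
class Outcome:
    plan: str
    rounds: int
    confirmed: bool


# Hypothetical interfaces: a planner maps (goal, feedback) to a plan,
# and a reflector maps (goal, plan) to a Review.
Planner = Callable[[str, List[str]], str]
Reflector = Callable[[str, str], Review]


def cross_reflect(goal: str, planner: Planner,
                  reflectors: List[Reflector], max_rounds: int = 3) -> Outcome:
    """Revise a plan until every reflector approves or max_rounds is reached."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    feedback: List[str] = []
    plan = ""
    for round_no in range(1, max_rounds + 1):
        plan = planner(goal, feedback)
        reviews = [reflect(goal, plan) for reflect in reflectors]
        if all(review.approved for review in reviews):
            return Outcome(plan, round_no, True)
        # Merge suggestions from all reflectors, dropping duplicates in order.
        merged = [s for review in reviews for s in review.suggestions]
        feedback = list(dict.fromkeys(merged))
    return Outcome(plan, max_rounds, False)  # last plan, still unconfirmed


# Toy stand-ins for real agents (assumptions for illustration only).
def toy_planner(goal: str, feedback: List[str]) -> str:
    steps = ["build", "deploy"] + feedback
    return f"Plan for {goal}: " + ", ".join(steps)


def toy_reflector(goal: str, plan: str) -> Review:
    if "rollback" in plan:
        return Review(True, [])
    return Review(False, ["add rollback step"])


result = cross_reflect("release v2", toy_planner, [toy_reflector])
print(result)
```

The `max_rounds` cap is a design choice added in this sketch so that the loop terminates even when the reflective agents never confirm the plan; the caller can inspect `confirmed` to decide how to proceed.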

Benefits:

  • Reasoning certainty. The agent’s outputs and respective methodology are assessed and refined by other agents, which improves reasoning certainty and response accuracy.
  • Explainability. Multiple agents can be employed to review the reasoning process of the original agent, providing thorough explanations to the user.
  • Inclusiveness. When multiple agents are queried, the reflective feedback draws on different reasoning outputs, which can help formulate a comprehensive refinement suggestion.
  • Scalability. Cross-reflection supports scalable agent-based systems as the reflective agents can be flexibly updated without disrupting the system operation.

Drawbacks:

  • Reasoning uncertainty. The overall response quality and reliability are dependent on the performance of other reflective agents.
  • Fairness preservation. When various agents participate in the reflection process, a critical issue is how to preserve fairness across all the feedback provided.
  • Complex accountability. If the cross-reflection feedback causes serious or harmful results, the accountability process may be complex when multiple agents are employed.
  • Overhead. i) There will be communication overhead for the interactions between agents. ii) Users may need to pay for utilising the reflective agents.
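
To make the overhead concrete, the number of model calls can be estimated with a rough count. The sketch below assumes, purely for illustration, that each reflection round costs one call to the planning agent plus one call per reflective agent; the function name `reflection_calls` and this cost model are assumptions, and real systems may batch, cache, or parallelise calls differently.

```python
def reflection_calls(rounds: int, reviewers: int) -> int:
    """Estimated calls: one planner call plus one call per reviewer, each round."""
    if rounds < 0 or reviewers < 0:
        raise ValueError("rounds and reviewers must be non-negative")
    return rounds * (1 + reviewers)


# Under this assumption, calls grow linearly with the number of reviewers.
for reviewers in (1, 3, 5):
    print(reviewers, reflection_calls(rounds=3, reviewers=reviewers))
```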

Known uses:

  • XAgent. In XAgent, the tool agent can send feedback and reflection to the plan agent to indicate whether a task is completed, or to pinpoint the refinements needed.
  • Yao et al. [1] explore agents’ capability of learning through communicating with each other. A thinker agent can provide suggestions to an actor agent, who is responsible for decision-making.
  • Qian et al. [2] develop a virtual software development company based on agents, where the tester agents can detect bugs and report to programmer agents.
  • Talebirad and Nadiri [3] analyse inter-agent feedback in which agents critique each other, which can help them adapt their strategies.

Related patterns:

References:

[1] W. Yao, S. Heinecke, J. C. Niebles, Z. Liu, Y. Feng, L. Xue, R. Murthy, Z. Chen, J. Zhang, D. Arpit et al., “Retroformer: Retrospective large language agents with policy gradient optimization,” arXiv preprint arXiv:2308.02151, 2023.

[2] C. Qian, X. Cong, C. Yang, W. Chen, Y. Su, J. Xu, Z. Liu, and M. Sun, “Communicative agents for software development,” arXiv preprint arXiv:2307.07924, 2023.

[3] Y. Talebirad and A. Nadiri, “Multi-agent collaboration: Harnessing the power of intelligent LLM agents,” arXiv preprint arXiv:2306.03314, 2023.