Incremental Model Querying

Summary: Incremental model querying involves accessing the foundation model at each step of the plan generation process.

Context: When users interact with the agent to pursue specific goals, the agent's underlying foundation model is queried for plan generation.

Problem: The foundation model may struggle to generate the correct response on the first attempt. How can the agent conduct an accurate reasoning process?

Forces:

  • Size of the context window. The context window of a foundation model may be limited, so users may be unable to provide a complete and comprehensive prompt in a single query.
  • Oversimplification. With only a single model query, the reasoning process may be oversimplified and leave uncertainties unresolved.
  • Lack of explainability. The responses generated by foundation models require a detailed reasoning process to preserve explainability and, ultimately, trustworthiness.

Solution: Fig. 1 illustrates the interactions between the users and the agent in incremental model querying. The agent can engage in a step-by-step reasoning process, developing the plan for goal achievement through multiple queries to the foundation model. Meanwhile, human feedback can be provided at any time on both the reasoning process and the generated plan, and adjustments can be made accordingly during model querying. Note that incremental model querying can rely on a reusable template, which guides the process through context injection or through an explicit workflow/plan repository and management system.

Figure 1. Incremental model querying.
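The step-by-step querying loop described above can be sketched as follows. This is a minimal illustration, not part of any specific framework: `query_model`, `incremental_plan`, and the feedback hook are hypothetical names, and the model call is stubbed out (in practice it would wrap an LLM API).

```python
def query_model(prompt: str) -> str:
    """Stand-in for a foundation-model call; returns a canned step."""
    return f"step for: {prompt}"

def incremental_plan(goal: str, n_steps: int, feedback=None) -> list[str]:
    """Build a plan step by step, querying the model once per step.

    `feedback` is an optional callable that lets a human revise each
    step before it is appended to the plan (human-in-the-loop).
    """
    plan: list[str] = []
    for i in range(n_steps):
        # Each query carries only the goal plus the plan so far, keeping
        # individual prompts within the context window.
        context = f"Goal: {goal}\nPlan so far: {plan}\nNext step {i + 1}:"
        step = query_model(context)
        if feedback is not None:
            step = feedback(step)  # human adjustment at any step
        plan.append(step)
    return plan

plan = incremental_plan("book a flight", n_steps=3)
```

Because each iteration sees the accumulated plan, users can also inject supplementary context between queries rather than front-loading everything into one prompt.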

Benefits:

  • Supplementary context. Incremental model querying allows users to split the context across multiple prompts, addressing the issue of a limited context window.
  • Reasoning certainty. The foundation model iteratively refines its reasoning steps through self-checking or feedback from users.
  • Explainability. Users can query the foundation model to provide detailed reasoning steps through incremental model querying.

Drawbacks:

  • Overhead. i) Incremental model querying requires multiple interactions with the foundation model, which may increase the time needed to determine a plan. ii) The high volume of user queries may be cost-intensive when commercial foundation models are used.

Known uses:

  • HuggingGPT. The underlying foundation model of HuggingGPT is queried multiple times to decompose users’ requests into fine-grained tasks and then determine the dependencies and execution order of the tasks [1].
  • EcoAssistant [2]. EcoAssistant employs a code executor that interacts with the foundation model to iteratively refine code.
  • ReWOO [3]. ReWOO queries the foundation model to i) generate a list of interdependent plans, and ii) combine the observation evidence fetched from tools with the corresponding tasks.
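The two-stage querying pattern that ReWOO exemplifies — one query to produce plan steps, follow-up queries to combine tool evidence with each task — can be sketched roughly as below. All functions here are illustrative stubs, not ReWOO's actual API.

```python
def query_model(prompt: str) -> str:
    """Stand-in for a foundation-model call."""
    if prompt.startswith("Plan:"):
        return "search; summarise"  # canned two-step plan
    return f"combined({prompt})"

def fetch_evidence(step: str) -> str:
    """Stand-in for a tool invocation gathering evidence for a step."""
    return f"evidence for {step}"

def plan_then_combine(request: str) -> list[str]:
    # First query: generate a list of interdependent plan steps.
    steps = [s.strip() for s in query_model(f"Plan: {request}").split(";")]
    # Follow-up queries: combine tool evidence with each task.
    return [query_model(f"Task: {s}\nEvidence: {fetch_evidence(s)}")
            for s in steps]

results = plan_then_combine("answer a research question")
```

Decoupling plan generation from evidence gathering keeps each prompt small and lets tool calls run between queries.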

Related patterns:

  • One-shot model querying. Incremental model querying can be regarded as an iterative alternative to one-shot model querying.
  • Multi-path plan generator. The agent can capture users’ preferences at each step and generate multi-path plans by iteratively querying the foundation model.
  • Self-reflection. Self-reflection requires agents to query their incorporated foundation model multiple times for response review and evaluation.
  • Human-reflection. Human-reflection is enabled by incremental model querying for iterative communication between users/experts and the agent.
  • Multimodal guardrails. Multimodal guardrails serve as an intermediate layer, managing the inputs and outputs of model querying.

References:

[1] Y. Shen, K. Song, X. Tan, D. Li, W. Lu, and Y. Zhuang, “HuggingGPT: Solving AI tasks with ChatGPT and its friends in Hugging Face,” in Advances in Neural Information Processing Systems, A. Oh, T. Neumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine, Eds., vol. 36. Curran Associates, Inc., 2023, pp. 38154–38180. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2023/file/77c33e6a367922d003ff102ffb92b658-Paper-Conference.pdf
[2] J. Zhang, R. Krishna, A. H. Awadallah, and C. Wang, “EcoAssistant: Using LLM assistant more affordably and accurately,” arXiv preprint arXiv:2310.03046, 2023.
[3] B. Xu, Z. Peng, B. Lei, S. Mukherjee, Y. Liu, and D. Xu, “ReWOO: Decoupling reasoning from observations for efficient augmented language models,” arXiv preprint arXiv:2305.18323, 2023.