Project 6

August 7th, 2023

The Unification of Structured Multi-Modal Knowledge Base and Multi-Modal Foundation Models

Project location:

This project can be worked on from any site in any state.

Desirable skills:

  • Good programming skills in Python or other languages
  • Solid understanding of databases, foundation models and knowledge graphs

Supervisory project team:

Qing Liu, Zhenchang Xing and Xin Yuan (Data61) in conjunction with Qian Fu, Jieshan Chen, Wenjie Zhang (UNSW) and Liang Zheng, Yuan-Sen Ting, Zongyou Yin (ANU)

Contact person:

Senior Research Scientist, Data61

Project description:

The progress of large multi-modal models has opened unprecedented opportunities to revolutionize many fields of science. However, these black-box models have been criticized for lacking factual knowledge, so users often encounter hallucinations. On the other hand, structured knowledge bases (KBs, including knowledge graphs and databases) not only offer facts but can also provide interpretability, which is critical for scientific domains. However, KBs are difficult to construct and to keep up to date with the dynamic nature of real-world facts and complex relations. Furthermore, most existing knowledge graphs are represented with pure symbols denoted in the form of text, which weakens the capability of machines to describe and understand the real world. For example, many scientific domains (e.g., Astronomy, Chemistry) need to ground symbols in the corresponding image, sound and video data in order to form hypotheses, theories and experiments. Existing LLMs have similar problems.

This project will recruit 3 to 4 PhD students to investigate how to unify multi-modal foundation models with multi-modal KBs so as to leverage their respective strengths and overcome their limitations. The four main activities are:


(1) Foundation Model-Augmented Knowledge Base: How to bridge the gap between natural language questions and structured knowledge bases for complex fact-based QA reasoning?
a. Develop LLM-augmented multi-modal knowledge graphs.
b. Develop a prompting framework using LLM-augmented KGs and large multi-modal foundation models for multi-hop QA reasoning.
c. Develop a prototype for the chosen science digital domain to demonstrate the ideas.
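
As a rough illustration of what multi-hop QA over a knowledge graph means in activity (1), the sketch below follows a chain of relations across a toy set of triples. It is not the project's method. The triples, the relation names and the functions build_index and multi_hop are all hypothetical, and a real system would use an LLM to map a natural language question onto the relation chain.

```python
# Toy knowledge graph as (subject, relation, object) triples.
# All entities and relation names here are illustrative assumptions.
TRIPLES = [
    ("Betelgeuse", "located_in", "Orion"),
    ("Orion", "type", "constellation"),
    ("Betelgeuse", "has_image", "img/betelgeuse.png"),  # link to a non-text modality
]


def build_index(triples):
    """Index triples by subject so each hop is a dictionary lookup."""
    index = {}
    for subject, relation, obj in triples:
        index.setdefault(subject, []).append((relation, obj))
    return index


def multi_hop(index, start, relations):
    """Follow a fixed chain of relations from `start`, one hop per relation.

    Returns the set of entities reached after the last hop.
    """
    frontier = {start}
    for wanted in relations:
        frontier = {
            obj
            for node in frontier
            for relation, obj in index.get(node, [])
            if relation == wanted
        }
    return frontier


index = build_index(TRIPLES)
# "What kind of object is the thing Betelgeuse is located in?" needs two hops.
answer = multi_hop(index, "Betelgeuse", ["located_in", "type"])
```

Here the question requires two hops (star to constellation, constellation to type), which is the kind of reasoning a single fact lookup cannot answer.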

(2) Knowledge Base-enhanced Foundation Models: How to effectively and efficiently manage a multi-modal knowledge base to augment the capabilities of large multi-modal foundation models?
a. Develop a multi-modal embedding method for a vector store for the chosen scientific domain.
b. Investigate efficient knowledge handling using a vector store and a graph database.
c. Integrate the above methods into the science digital framework.
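
One possible way to picture the combination of a vector store and a graph database in activity (2) is sketched below: embeddings are searched by cosine similarity, and each hit is then enriched with facts from a graph. This is only a minimal sketch under stated assumptions. TinyVectorStore, retrieve_with_facts, the two-dimensional vectors and the graph contents are all hypothetical, and a real system would use learned multi-modal embeddings and dedicated database engines.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """In-memory stand-in for a vector store (hypothetical, brute-force search)."""

    def __init__(self):
        self.items = []  # list of (entity_id, modality, vector)

    def add(self, entity_id, modality, vector):
        self.items.append((entity_id, modality, list(vector)))

    def search(self, query, k=1):
        """Return the k stored items most similar to `query`."""
        ranked = sorted(self.items, key=lambda item: cosine(query, item[2]), reverse=True)
        return [(entity_id, modality) for entity_id, modality, _ in ranked[:k]]


# Stand-in for a graph database: entity -> list of (relation, object) facts.
# Entities and relations are illustrative assumptions.
GRAPH = {
    "Betelgeuse": [("located_in", "Orion")],
    "H2": [("produced_by", "electrolysis")],
}


def retrieve_with_facts(store, graph, query, k=1):
    """Vector search first, then attach structured facts for each hit."""
    return [(eid, modality, graph.get(eid, [])) for eid, modality in store.search(query, k)]


store = TinyVectorStore()
store.add("Betelgeuse", "image", [1.0, 0.0])    # made-up embedding of an image
store.add("H2", "time_series", [0.0, 1.0])      # made-up embedding of a sensor trace
hits = retrieve_with_facts(store, GRAPH, [0.9, 0.1], k=1)
```

The retrieved facts could then be placed in the prompt of a foundation model, so that its answer is grounded in the knowledge base rather than in its parameters alone.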

(3) How to apply a unified structured Multi-Modal Knowledge Base with a Multi-Modal Foundation Model in the Astronomy domain?
a. Use the output from the first two activities for real-time data analysis and interpretation of astronomical observations. These algorithms will enable the swift identification of celestial phenomena, supporting astronomers in uncovering hidden patterns and accelerating the process of discovery.

(4) How to apply a unified structured Multi-Modal Knowledge Base with a Multi-Modal Foundation Model in the Chemistry/Energy domain?
a. Build high-resolution image and time sequence analysis tools based on our multi-modal knowledge graph and a multi-modal foundation model for electrolysis processes, optimizing the efficiency of hydrogen production, which plays a pivotal role in sustainable energy conversion and storage. We anticipate ground-breaking outcomes that will allow for rapid responses to transient events, deepen our understanding of the cosmos, and facilitate the transition to a sustainable energy ecosystem.

Outcome:
(1) The output of the project will combine the merits of both worlds and contribute to the Sigma Eight framework. Through effective knowledge-based and data-driven inference, scientific discovery can be accelerated by both generative and reasoning capabilities in the future.
(2) We will collaborate with Prof. Wenjie Zhang (UNSW), who is a leading researcher in data management, and with Associate Professor Yuan-Sen Ting, Dr Zongyou Yin and Professor Liang Zheng (ANU, School of Astronomy, School of Chemistry, School of Computing), who will provide domain requirements and data in Astronomy and Chemistry to test the idea. We will invite Energy BU collaborators after the project is approved.
(3) In addition to the prototype and paper publications, another key expected outcome of this project is meaningful collaboration with scientific domain experts. By working closely with the team members, the students will develop problem-solving, time management and communication skills in a dynamic and diverse working environment.