Project 8

August 6th, 2023

ExperimentGPT: An AI Assistant that Can Read the Scientist’s Mind

Project location: Pullenvale (QLD)

Desirable skills:

  • A background in computer science or electrical engineering.
  • Knowledge of and experience with machine learning and deep learning models, with expertise in generative modelling.
  • Experience in building deep learning models, programming in Python, and working with HPC systems.

Supervisory project team:

Abdelwahed Khamis, Yang Li, Sara Khalifa and Norman Zhang

Contact person:

Research Scientist, Data61

Project description:

In one sentence: leveraging recent advances in AI (generative and foundation models) and sensing technologies to enable a ChatGPT-like experience for conducting scientific experiments.

Scientific discovery hinges on proper data collection and powerful data analysis, and we are witnessing advances on both fronts: a surge in new data recording and sensing technologies, combined with powerful machine learning tools for data analysis. Yet the actual process of conducting data-driven scientific experiments is decades old and lacks the automation that has revolutionized other fields, an issue that cuts across many scientific disciplines. We aim to leverage recent advances in AI (foundation and diffusion models) to automate the conduct of scientific experiments. The vision is that a scientist can ask an AI assistant, ExperimentGPT, to automate their experiment by simply writing down the experimental setup, the sensors needed, and, potentially, the physical principles that govern the data. ExperimentGPT will then produce deployable ML models matching the experiment description.

To this end, we envision a zero-effort framework for end-to-end ML model synthesis from textual experiment descriptions, going all the way from sensory data generation to ML model training and optimization for deployment. The ultimate goal is broken into two sub-projects that can run in parallel, be tested continually, and be integrated into the final demonstration.
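To make the envisioned pipeline concrete, the following Python skeleton sketches how its stages might fit together. Every name here (ExperimentSpec, parse_description, generate_sensor_data, synthesize_model) is a hypothetical placeholder for a component to be built during the project, not an existing API.

    # Hypothetical end-to-end pipeline skeleton for ExperimentGPT.
    # All components below are placeholders for the envisioned framework,
    # not an existing implementation.

    from dataclasses import dataclass
    from typing import List, Optional


    @dataclass
    class ExperimentSpec:
        """Structured form of a free-text experiment description."""
        setup: str             # e.g. "cattle grazing a 2 ha paddock"
        sensors: List[str]     # e.g. ["accelerometer", "GPS"]
        physics: Optional[str] # optional governing principles


    def parse_description(text: str) -> ExperimentSpec:
        """Placeholder: an LLM would extract setup, sensors and physics here."""
        raise NotImplementedError


    def generate_sensor_data(spec: ExperimentSpec):
        """Placeholder for sub-project #1: text-conditioned generative model
        producing labelled synthetic sensor streams."""
        raise NotImplementedError


    def synthesize_model(data, spec: ExperimentSpec):
        """Placeholder for sub-project #2: train a physics-informed model on
        the generated data and optimise it for edge deployment."""
        raise NotImplementedError


    def experiment_gpt(description: str):
        """The envisioned zero-effort flow: text in, deployable model out."""
        spec = parse_description(description)
        data = generate_sensor_data(spec)
        return synthesize_model(data, spec)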

Sub-project #1: In the first sub-project, we translate the scientific experiment specification into sensory data, from which we can acquire data streams (along with proper labels) to be used later for synthesizing ML models. We can leverage recent advances in generative modelling [4] that account for the natural variability of animal motions [1], and their integration with multimodal foundation models [3] for powerful natural-language text embeddings.
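As a rough, self-contained illustration of this direction, the sketch below conditions a toy generator on a text embedding, in the spirit of text-conditioned diffusion models [1, 4] driven by CLIP-style text encoders [3]. The hash-based embed_text function is a stand-in for a pretrained multimodal encoder, and the generator is untrained; all names and shapes are illustrative assumptions, not project code.

    # Toy sketch of text-conditioned sensor-stream generation (sub-project #1).
    # The "text encoder" is a trivial stand-in for a CLIP-style model [3],
    # and the untrained generator stands in for a diffusion model [1, 4].

    import hashlib
    import torch
    import torch.nn as nn

    EMB_DIM, SIGNAL_LEN = 64, 128


    def embed_text(description: str) -> torch.Tensor:
        """Deterministic stand-in for a pretrained text encoder."""
        seed = int.from_bytes(hashlib.sha256(description.encode()).digest()[:8], "big")
        gen = torch.Generator().manual_seed(seed)
        return torch.randn(EMB_DIM, generator=gen)


    class ConditionalGenerator(nn.Module):
        """Maps (noise, text embedding) -> synthetic 1-D sensor stream."""

        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(EMB_DIM + SIGNAL_LEN, 256),
                nn.ReLU(),
                nn.Linear(256, SIGNAL_LEN),
            )

        def forward(self, noise: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
            return self.net(torch.cat([noise, cond], dim=-1))


    if __name__ == "__main__":
        cond = embed_text("accelerometer on a grazing cow, 50 Hz")
        noise = torch.randn(SIGNAL_LEN)
        stream = ConditionalGenerator()(noise, cond)
        print(stream.shape)  # torch.Size([128])

In a real system, the generator would be a trained diffusion model sampling labelled streams from the same conditioning embedding, so that varied descriptions yield correspondingly varied synthetic data.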

Sub-project #2: In the second sub-project, ML models will be synthesized from the multimodal data provided by sub-project #1. The synthesized models should rectify potential inaccuracies in the generated sensory data and should also be amenable to deployment on edge devices in the field. For the former, physics-informed machine learning [2] can be utilized to ensure that the translation honours physical first principles.
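To make the physics-informed component concrete, below is a minimal sketch of the standard PINN-style objective from [2]: a data-fit term on the (possibly imperfect) generated sensor data plus a residual term penalising violations of a governing equation. The damped-oscillator ODE x'' + C·x' + K·x = 0 and all coefficients here are illustrative placeholders, not the project's actual physics.

    # Minimal physics-informed training sketch (sub-project #2), after [2].
    # A network x(t) is fit to generated sensor samples while also being
    # penalised for violating an assumed governing ODE.

    import torch
    import torch.nn as nn

    C, K = 0.1, 4.0  # assumed damping and stiffness coefficients

    net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)

    # Stand-ins for the (noisy) generated sensor data from sub-project #1.
    t_data = torch.linspace(0, 5, 50).unsqueeze(1)
    x_data = torch.exp(-C * t_data / 2) * torch.cos(2.0 * t_data)

    for step in range(2000):
        opt.zero_grad()

        # Data-fit term: match the generated observations.
        loss_data = ((net(t_data) - x_data) ** 2).mean()

        # Physics residual: x'' + C*x' + K*x = 0 at random collocation points.
        t = (torch.rand(100, 1) * 5).requires_grad_(True)
        x = net(t)
        dx = torch.autograd.grad(x.sum(), t, create_graph=True)[0]
        ddx = torch.autograd.grad(dx.sum(), t, create_graph=True)[0]
        loss_phys = ((ddx + C * dx + K * x) ** 2).mean()

        (loss_data + loss_phys).backward()
        opt.step()

The same pattern extends to any differentiable governing equation supplied in the experiment description: only the residual line changes.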

Outcomes

Science Innovation: Automating the conduct of scientific experiments in the way we outline here is a niche problem, which opens the door to many innovation opportunities. We expect high-quality publishable outcomes from the PhD students, in collaboration with the supervisory team, at top vision and sensing venues (SenSys, MobiCom, ICCV, etc.). We also expect this work to draw attention to overlooked research areas and inspire follow-up work.

References
[1] Florinel-Alin Croitoru, Vlad Hondru, Radu Tudor Ionescu, and Mubarak Shah. Diffusion models in vision: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023.
[2] George Em Karniadakis, Ioannis G Kevrekidis, Lu Lu, Paris Perdikaris, Sifan Wang, and Liu Yang. Physics-informed machine learning. Nature Reviews Physics, 3(6):422–440, 2021.
[3] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, pages 8748–8763. PMLR, 2021.
[4] Yaqi Zhang, Di Huang, Bin Liu, Shixiang Tang, Yan Lu, Lu Chen, Lei Bai, Qi Chu, Nenghai Yu, and Wanli Ouyang. MotionGPT: Finetuned LLMs are general-purpose motion generators. arXiv preprint, 2023.