Collaborative Crystal Conditions

Protein crystallography is used extensively in drug development, because understanding the structure-activity relationship between a protein and its ligand enables a rational approach to drug design. Obtaining a crystal of the macromolecule that diffracts X-rays well is a major bottleneck in the process: it requires exhaustive sampling, and subsequent optimisation, of the physicochemical space to find the specific crystallisation conditions for each protein. This project aims to develop a CINTEL approach to predict the conditions under which a given protein will crystallise.


Crystallising a protein is a physical process driven mostly by favourable protein-protein interactions at the crystal contacts, modulated by additives, salts and other environmental factors. Previous attempts using ML/AI and other statistical methods have been unsuccessful, indicating that ML/AI cannot, by itself, capture the complexity and diversity of the problem. These previous works highlight the information gap between the amino acid sequence and the crystal, a gap that can only be bridged with knowledge-based guidance from a researcher. Instead of leaping directly from sequence to condition space, we propose a new approach that introduces several intermediate steps at which the researcher can inform the ML/AI and critically analyse the decisions it has taken. Because the researcher will have finer-grained information at each step, they can contribute physicochemical insight to the ML/AI strategy and so improve the predictions.


Our group is in the privileged position of having an extensive dataset of both positive and negative crystallisation attempts, extensive expertise in protein crystallography and an exceptional high-throughput experimental setup. We have the data to train the ML/AI at each step, the scientific expertise to guide and inform the ML/AI models, and the experimental setup to test the predictions. The successful application of this CINTEL-FSP project will have an impact beyond the BU and CSIRO, expanding the current limits of protein crystallography and demonstrating the potential of protein surface modelling.