Deep Learning of Reaction Barriers for High-Throughput Retrosynthetic Drug Design at University of Warwick

Job Description

Supervisors: Prof. Reinhard J. Maurer (Chemistry/Physics), Prof. Scott Habershon (Chemistry)

Summary:

The drug discovery pipeline involves screening many molecules before viable leads are identified. Candidates are screened not only for their pharmacological properties but also for their synthetic viability. Typical drug molecules can contain up to 100 non-hydrogen atoms, which makes developing cost-effective and efficient synthetic pathways very challenging. The aim of this project is to develop a deep learning and generative design toolchain that accurately predicts chemical reaction barriers and thereby advances chemical retrosynthetic design workflows.

In the exploratory phase of drug discovery, millions of molecules are screened for their viability as drugs, covering both their pharmacological properties and their synthetic viability. Because typical drug molecules can contain up to 100 non-hydrogen atoms, developing cost-effective and efficient synthetic pathways is very challenging. Effective retrosynthetic design requires accurate predictions of reaction enthalpies and activation free energies for the relevant intermediates. Quantum chemical calculations can typically provide sufficient accuracy (~1 kcal/mol error), but they are not feasible at the scale of millions of predictions per day. The need to predict the transition state structure as input for quantum chemical barrier calculations adds a further complication. Machine learning (ML) models of quantum chemistry can deliver fast and accurate predictions, but comprehensive data sets of reaction barriers for large molecules do not exist.

Several recent works have tried to tackle the scarcity of data on reaction barriers by creating new curated data sets, but data for large molecules remains scarce. Furthermore, entropic and solvent effects will play a crucial role in reactions of large drug molecules and need to be considered. Graph-based reaction discovery and generative machine learning provide a path to new synthetic data that can form the basis for a large-scale database of reaction enthalpies and activation free energies for realistic molecules.

In this project, the student will develop a deep learning and generative design toolchain to accurately predict chemical reaction barriers without recourse to transition state structures and quantum chemical calculations at the point of prediction. This will enable the development of more accurate and advanced synthesis planning.

https://warwick.ac.uk/fac/sci/hetsys/themes/projectopportunities

Additional Funding Information

Awards for both UK residents and international applicants pay a stipend to cover maintenance, as well as university fees and research training support. The stipend is at the standard UKRI rate.

For more details visit: https://warwick.ac.uk/fac/sci/hetsys/apply/funding/


Location