Combining machine learning and metabolic modelling to optimise metabolite production in Escherichia coli

Project: Research, Individual grant


A widespread challenge in the production of natural chemicals is the conversion of living organisms into efficient cell factories. One of the main issues is that natural products of interest often result from underground or weakly activated pathways. Genetic and cell culture bioengineering can be used to induce over-production of the desired metabolites but, given the large number of variables in play, identifying the optimal biosynthetic conditions is generally costly and time-consuming. In this context, building advanced bioinformatics methods to guide the design of bioengineering interventions can dramatically accelerate the availability of new molecules. In particular, data-driven computational analysis can tackle the complexity of biomolecular systems and has the potential to address this problem effectively.
We will develop and validate a bioinformatics pipeline for the computational prediction of metabolite production in Escherichia coli, the most widely used cellular factory at FUJIFILM Diosynth Biotechnologies. To this end, we will combine genome-scale metabolic modelling (GSMM) with a data-driven prediction algorithm based on machine learning methods. GSMM will allow us to capture the metabolic flux configuration of E. coli associated with each biosynthetic condition by incorporating experimental data on multiple levels into integrative poly-omics models. The machine learning approach will then integrate the GSMM-based information and the associated experimentally generated data to accurately estimate the yield of target metabolites.
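As a rough illustration of how the two components described above could fit together, the sketch below runs a flux balance analysis on a four-reaction toy network and passes the resulting flux, together with a condition-specific measurement, to a small ridge regression. Everything in it is an assumption made for illustration: the network, the bounds, the function names (`fba_fluxes`, `fit_ridge`, `predict`), the choice of ridge regression and the yield values, which are synthetic placeholders rather than experimental data. It is not the project's pipeline, which would rely on a genome-scale E. coli model and real poly-omics measurements.

```python
import numpy as np
from scipy.optimize import linprog

# Toy stoichiometric matrix (rows: metabolites A, B; columns: reactions).
# v0: uptake -> A, v1: A -> B, v2: A -> biomass, v3: B -> product export
S = np.array([
    [1.0, -1.0, -1.0,  0.0],  # metabolite A
    [0.0,  1.0,  0.0, -1.0],  # metabolite B
])

def fba_fluxes(uptake_max, enzyme_cap, min_growth=1.0):
    """Maximise product export (v3) subject to the steady state S v = 0."""
    c = np.array([0.0, 0.0, 0.0, -1.0])  # linprog minimises, so negate v3
    bounds = [
        (0.0, uptake_max),   # substrate uptake limited by the culture condition
        (0.0, enzyme_cap),   # A -> B capped by a hypothetical expression level
        (min_growth, None),  # require a minimum biomass flux
        (0.0, None),         # product export
    ]
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
    if not res.success:
        raise ValueError(res.message)
    return res.x

def fit_ridge(X, y, alpha=1e-6):
    """Closed-form ridge regression with an unpenalised intercept."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[-1, -1] = 0.0
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)

def predict(w, X):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w

# Hypothetical conditions: (maximum uptake, enzyme capacity).
conditions = [(10.0, 4.0), (10.0, 8.0), (6.0, 8.0), (12.0, 3.0), (8.0, 2.0)]

# Features: the FBA-predicted product flux plus the condition-specific measurement.
X = np.array([[fba_fluxes(u, e)[3], e] for u, e in conditions])

# Synthetic placeholder yields, NOT experimental data.
y = 0.5 * X[:, 0] + 0.1

w = fit_ridge(X, y)
print(predict(w, X))
```

Only the product flux is used as a model-derived feature here because the other fluxes are not uniquely determined by this toy optimisation. A realistic setup would need a principled way to choose among alternative optimal flux distributions, and would evaluate the regression on held-out conditions rather than on its own training data.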
The pipeline will be applied to the exploration of putative bioengineering steps for the activation of unproductive biosynthetic pathways in E. coli. In particular, we will focus on identifying the optimal conditions for the production of molecules for therapeutic use. This work will thus generate a novel tool to assist the optimisation of cell factories. Furthermore, it will show how complementary computational approaches can be integrated and applied to the production of other natural compounds in other organisms.
Effective start/end date: 22/06/18 to 28/02/19