Machine learning in medical diagnostics and drug discovery
Dr. Minh Duc Cao is a principal machine learning scientist at Gritstone Oncology Inc., where he leads a team developing deep learning models for designing personalized cancer therapy. He was previously a senior bioinformatics scientist at 4Catalyzer, a startup founded by technology pioneer Professor Jonathan Rothberg that aims to transform medicine with devices, deep learning and cloud computing. Dr. Cao completed his PhD in Bioinformatics at Monash University, where he applied advanced statistics and information theory to modelling genomic data. During this time, he developed state-of-the-art genomic analysis methods, including methods for data compression, sequence alignment and phylogenetic analysis. He then carried out postdoctoral research and worked as a research scientist at the University of Queensland, applying machine learning and sequencing technologies to investigate the genomic variation that underlies human diseases. He has contributed to the development of third-generation sequencing technology by developing novel methodologies to improve the quality of sequencing data and to translate sequencing technologies into practice. Dr. Cao has also consulted on the deployment of high-performance computing facilities for genomics research.
Advances in biotechnology over the last 15 years have profoundly transformed science and medicine and are helping to save human lives in ways never seen before. These advances have generated enormous amounts of data, and the life sciences are projected to be the biggest data source on the planet by 2025. The sheer volume of life science data, together with its noisiness, presents a big challenge to making sense of it. This era also coincides with the age of big data, in which advanced analytics are widely developed and adopted in industry but are hardly utilized for biological data. In this seminar, I will share my own experience in applying machine learning to genomics and proteomics data for diagnostics and drug discovery. I will show that these techniques are not only more effective than classical approaches but also help to reduce the cost and time of sequencing projects, and hence contribute to translating sequencing technologies into practice.