Abstract
By a News Reporter-Staff News Editor at Robotics & Machine Learning Daily News - According to news reporting based on a preprint abstract, our journalists obtained the following quote sourced from biorxiv.org:

"Understanding the linkage between protein sequence and phenotypic expression level is crucial in biotechnology. Machine learning algorithms trained with deep mutational scanning (DMS) data have significant potential to improve this understanding and accelerate protein engineering campaigns.

"However, most machine learning (ML) approaches in this domain do not directly address effects of synonymous codons or positional epistasis on predicted expression levels.

"Here we used yeast surface display, deep mutational scanning, and next-generation DNA sequencing to quantify the expression fitness landscape of human myoglobin and train ML models to predict epistasis of double codon mutants. When fed with near comprehensive single mutant DMS data, our algorithm computed expression fitness values for double codon mutants using ML-predicted epistasis as an intermediate parameter. We next deployed this predictive model to screen > 3 · 10^6 unseen double codon mutants in silico and experimentally tested highly ranked candidate sequences, finding 14 of 16 with significantly enhanced expression levels.
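For readers who want a concrete picture of the step in which single-mutant fitness values and predicted epistasis are combined, the following is a minimal Python sketch, not the preprint's actual method. It assumes the common additive convention in which a double mutant's fitness equals the sum of the two single-mutant fitness values plus an epistasis term. The abstract does not state which formulation the authors used. The function names, the `(position, codon)` mutation encoding, the toy fitness values, and the `toy_epistasis` stand-in for a trained model are all hypothetical.

```python
from itertools import combinations


def predict_double_mutant_fitness(single_fitness, predict_epistasis):
    """Estimate fitness for every double codon mutant.

    single_fitness: dict mapping (position, codon) -> measured fitness score.
    predict_epistasis: callable taking two mutations and returning a
        predicted epistasis term (a stand-in for a trained ML model).

    Assumption: fitness(a, b) = fitness(a) + fitness(b) + epistasis(a, b).
    """
    doubles = {}
    for a, b in combinations(sorted(single_fitness), 2):
        if a[0] == b[0]:
            # Two codons at the same position cannot form a double mutant.
            continue
        eps = predict_epistasis(a, b)
        doubles[(a, b)] = single_fitness[a] + single_fitness[b] + eps
    return doubles


def rank_candidates(doubles, top_n):
    """Return the top_n double mutants, ordered by predicted fitness."""
    return sorted(doubles, key=doubles.get, reverse=True)[:top_n]


# Toy single-mutant data (hypothetical values, not from the preprint).
single_fitness = {
    (10, "GCT"): 0.4,
    (10, "GCC"): 0.1,
    (25, "AAA"): 0.3,
    (40, "TTG"): -0.2,
}


def toy_epistasis(a, b):
    """Hypothetical placeholder: small positive epistasis for nearby sites."""
    return 0.05 if abs(a[0] - b[0]) < 20 else 0.0


doubles = predict_double_mutant_fitness(single_fitness, toy_epistasis)
top = rank_candidates(doubles, top_n=2)
```

In a real screen, `toy_epistasis` would be replaced by the trained model, and the ranked list would supply the candidates selected for experimental testing.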