Catalyst and reaction rate constant prediction methods for coupling reactions based on convolutional neural networks
Cross-coupling reactions are among the most effective methods of forming carbon-carbon bonds in modern organic synthesis. Effective screening and optimization of reaction conditions, such as the choice of catalyst, play an important role in improving the efficiency of drug and fine chemical development. In this work, convolutional neural network models and methods based on an organic reaction database are developed for Suzuki-Miyaura and Buchwald-Hartwig cross-coupling reactions to predict suitable reaction catalysts (with ligands) and rate constants. A comparative model based on the random forest algorithm is also established. The results show that the catalyst prediction model based on the convolutional neural network can accurately recommend reaction catalysts, with a top-3 accuracy of 85% on the Suzuki-Miyaura cross-coupling reaction dataset and 92% on the Buchwald-Hartwig cross-coupling reaction dataset. After the catalyst recommended by the model is obtained, the ECFP4 molecular fingerprint and the K-Means algorithm are used to cluster the reactions according to the structural characteristics of the catalyst, and the reaction rate constant is predicted on this basis. To create a reaction fingerprint that describes the entire reaction, random number labels are generated from the catalyst text and concatenated with the ECFP4 molecular fingerprints of the reactants and products. Rate constant prediction models are established on each dataset and compared. The results show that the performance of the rate constant prediction model using the clustering method is significantly improved on both types of cross-coupling reaction datasets, which indicates that clustering reactions by the structural characteristics of the catalyst significantly improves the prediction of cross-coupling reaction rate constants. These catalyst and rate constant prediction methods based on the convolutional neural network are expected to be applicable to other organic synthesis reactions, and the resulting models may be further used for reaction condition control and optimization.
cross-coupling reaction; catalyst and ligand prediction; prediction of reaction rate constant; convolutional neural network
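
The reaction fingerprint described in the abstract (random number labels generated from the catalyst text, concatenated with the fingerprints of the reactants and products) can be illustrated with a minimal Python sketch. The abstract does not state how the random labels are generated, so the hash-seeded generator, the label length of 16, and the function names `catalyst_label` and `reaction_fingerprint` are all assumptions made for illustration. The short bit lists stand in for real ECFP4 vectors, which would normally be computed with a cheminformatics toolkit.

```python
import hashlib
import random


def catalyst_label(catalyst_text, n_labels=16):
    """Map a catalyst string to a reproducible vector of pseudo-random labels.

    Assumption: the generator is seeded from a hash of the text, so the same
    catalyst always receives the same labels. The source does not specify this.
    """
    digest = hashlib.sha256(catalyst_text.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.random() for _ in range(n_labels)]


def reaction_fingerprint(reactant_fp, product_fp, catalyst_text, n_labels=16):
    """Concatenate reactant bits, product bits, and the catalyst labels."""
    return list(reactant_fp) + list(product_fp) + catalyst_label(catalyst_text, n_labels)


# Placeholder bit vectors standing in for ECFP4 fingerprints (hypothetical values).
reactant_bits = [0, 1, 0, 0, 1, 0, 1, 0]
product_bits = [1, 0, 0, 1, 0, 0, 1, 1]

fp_a = reaction_fingerprint(reactant_bits, product_bits, "Pd(PPh3)4")
fp_b = reaction_fingerprint(reactant_bits, product_bits, "Pd(OAc)2 with XPhos")
print(len(fp_a))  # 8 reactant bits + 8 product bits + 16 catalyst labels
```

Seeding from a hash keeps the mapping deterministic across runs, so two reactions that share a catalyst also share the catalyst part of the fingerprint. This is only one way to realize the idea and is not necessarily the scheme used in the work.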