A fundamental problem in computational chemistry is retrosynthesis prediction: finding a set of reactants from which a target molecule can be synthesized. This technique is widely applied in drug discovery. The search space of all possible transformations is very large, and synthetic route planning still relies heavily on the knowledge of experienced chemists. Researchers have therefore explored a range of computer-aided approaches to retrosynthetic route analysis, and advances in modern computing have allowed machine learning to be applied successfully to retrosynthetic route prediction. Computer-aided retrosynthetic route planning can be divided into two categories: template-based methods and template-free methods. In addition to template-based techniques, Alfa Chemistry applies a template-free automatic retrosynthetic route planning strategy to support your computer-aided chemical drug synthesis.
Figure 1. The workflow of AutoSynRoute. (Kang, Li.; et al. 2020)
At Alfa Chemistry, we propose a new template-free strategy for automatic retrosynthetic route planning. The template-free route design process can be described as follows:
1. First, a Transformer model is trained end to end for single-step retrosynthesis prediction on reactions from the USPTO database.
We have built the Transformer architecture entirely on the self-attention mechanism, which is good at capturing internal correlations within data or features.
Unlike earlier sequence-to-sequence architectures that use bidirectional long short-term memory (LSTM) units with an attention mechanism, the Transformer does not rely on recurrent units.
The Transformer architecture supports parallel computing, which can significantly reduce training time.
It can generate valid SMILES strings by efficiently capturing long-range dependencies in sequences.
The reactants and products of each reaction are extracted with RDKit, converted into SMILES strings, and tokenized as model inputs.
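To illustrate how a SMILES string can be split into tokens before it is fed to a sequence model, here is a minimal sketch. It is not the pipeline described above: it does not call RDKit, it skips canonicalization, and the regular expression is an assumption (a pattern commonly used for SMILES in sequence models). The names `SMILES_TOKEN_PATTERN`, `tokenize_smiles`, and `to_model_input` are hypothetical.

```python
import re
from typing import List

# Assumed token pattern (not confirmed by the source): bracket atoms first,
# then two-letter halogens, single atoms, bonds, branches, and ring closures.
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+\]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)


def tokenize_smiles(smiles: str) -> List[str]:
    """Split a SMILES string into tokens; fail loudly if any character is lost."""
    tokens = SMILES_TOKEN_PATTERN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"Unrecognized characters in SMILES: {smiles!r}")
    return tokens


def to_model_input(smiles: str) -> str:
    """Join tokens with spaces, a common text format for sequence models."""
    return " ".join(tokenize_smiles(smiles))


if __name__ == "__main__":
    print(to_model_input("CC(=O)O"))   # -> "C C ( = O ) O"
    print(tokenize_smiles("ClCCBr"))   # -> ['Cl', 'C', 'C', 'Br']
```

The round-trip check in `tokenize_smiles` guards against silently dropping characters the pattern does not cover. In a real workflow the SMILES would typically be parsed and canonicalized with a cheminformatics toolkit before tokenization.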
2. Then, the Monte Carlo Tree Search (MCTS) method is applied to search for intermediate molecules.
Selection step: Traverse the search tree from the root node to a leaf node by repeatedly selecting the child node with the largest upper confidence bound (UCB) score.
Expansion step: Create child nodes by sampling from the Transformer model.
Simulation step: Evaluate the candidate positions in the state space to identify the most promising one, then continue the search from that position until a terminal node is reached, creating a path to it.
Backpropagation step: Calculate the reward of the terminal node and update the UCB scores of the nodes along the path back to the root.
3. Finally, the reaction pathway leading to the target molecule at the root node is selected according to the resulting heuristic scores.
Figure 2. Monte Carlo tree search for retrosynthetic pathway search. (Kang, Li.; et al. 2020)
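The four MCTS steps listed above can be sketched on a toy problem. This is an illustration under stated assumptions, not the production implementation: `TOY_MODEL` is a hypothetical lookup table standing in for the single-step Transformer, `BUILDING_BLOCKS` is a made-up set of purchasable molecules, the reward is simply 1 when every molecule is a building block and 0 otherwise, and the UCB score uses the standard UCB1 form (mean reward plus `c * sqrt(ln(parent visits) / visits)`) with an assumed exploration constant `c = 1.4`.

```python
import math
import random

# Hypothetical stand-in for the single-step model:
# product -> list of (reactants, probability).
TOY_MODEL = {
    "D": [(("B", "C"), 0.7), (("X",), 0.3)],
    "C": [(("A", "B"), 0.9)],
    "X": [],  # dead end: no known disconnection
}
BUILDING_BLOCKS = {"A", "B"}  # assumed purchasable molecules


def unsolved(molecules):
    return tuple(m for m in molecules if m not in BUILDING_BLOCKS)


class Node:
    def __init__(self, molecules, parent=None, reaction=None):
        self.molecules = unsolved(molecules)  # molecules still to be made
        self.parent = parent
        self.reaction = reaction              # (product, reactants) that led here
        self.children = []
        self.visits = 0
        self.value = 0.0

    def is_solved(self):
        return not self.molecules

    def ucb(self, c=1.4):
        if self.visits == 0:
            return float("inf")               # always try unvisited children first
        explore = math.sqrt(math.log(self.parent.visits) / self.visits)
        return self.value / self.visits + c * explore


def expand(node):
    """Expansion: one child per reaction proposed for the first open molecule."""
    if node.is_solved() or node.children:
        return
    target, rest = node.molecules[0], node.molecules[1:]
    for reactants, _prob in TOY_MODEL.get(target, []):
        node.children.append(Node(rest + reactants, node, (target, reactants)))


def rollout(node, rng, max_depth=10):
    """Simulation: sample reactions until solved, stuck, or too deep."""
    molecules = node.molecules
    for _step in range(max_depth):
        if not molecules:
            return 1.0
        options = TOY_MODEL.get(molecules[0], [])
        if not options:
            return 0.0
        reactants, _prob = rng.choices(options, weights=[p for _, p in options])[0]
        molecules = unsolved(molecules[1:] + reactants)
    return 1.0 if not molecules else 0.0


def search(target, iterations=200, seed=0):
    rng = random.Random(seed)
    root = Node((target,))
    for _ in range(iterations):
        node = root
        while node.children:                  # selection
            node = max(node.children, key=Node.ucb)
        expand(node)                          # expansion
        if node.children:
            node = rng.choice(node.children)
        reward = rollout(node, rng)           # simulation
        while node is not None:               # backpropagation
            node.visits += 1
            node.value += reward
            node = node.parent
    return root


def best_route(root):
    """Follow the most-visited child at each level and collect its reaction."""
    route, node = [], root
    while node.children:
        node = max(node.children, key=lambda n: n.visits)
        route.append(node.reaction)
    return route, node.is_solved()


if __name__ == "__main__":
    print(best_route(search("D")))
```

In this toy setup the route is read off by following the most-visited child at each level, which is one common choice. A heuristic score of a different form could be substituted in `best_route` without changing the search loop.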
Our template-free automatic retrosynthetic route planning services help reduce costs, guide further experiments, and accelerate drug design for customers worldwide. Our personalized, comprehensive services are designed to meet your research needs. If you are interested in our services, please don't hesitate to contact us. We look forward to working with you and supporting your success!