What Is De novo Design?
As an important part of drug discovery, de novo design refers to the automated generation of new chemical structures that best satisfy a required set of molecular characteristics. For this reason, de novo design is also called generative chemistry. In drug discovery, a promising candidate must achieve the desired biological effect while maintaining acceptable pharmacokinetic properties. With the advancement of machine learning (ML) and artificial intelligence (AI), various computational methods for de novo design have been developed. Because ML and AI continue to open new possibilities for searching chemical space, de novo design has received wide attention recently.
Figure 1. Input and Output contents of the de novo drug design module. (Dominique, D. 2010)
Methodologies in De novo Design
Atom-based de novo design
Mutation and crossover operations are applied to a population of candidate molecules, while ensuring that the best-scoring molecules remain in the population.
Fragment-based de novo design
Simple rules are used to deconstruct molecules into fragments, each containing one or more atoms, and the resulting fragment libraries are then used to construct new molecules.
Reaction-based de novo design
An algorithm and a reaction library are used to carry out forward reactions in silico, so that new molecules are built through known reaction steps.
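The mutation and crossover loop described for atom-based design can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: candidates are toy strings over the alphabet `CNOS` rather than real molecular graphs, and `score` is a hypothetical fitness function that counts matches to a made-up ideal string. The point is only to show how keeping an elite subset guarantees that the best candidate is never lost between generations.

```python
import random

ATOMS = "CNOS"  # toy alphabet; real tools edit molecular graphs, not strings


def score(candidate: str) -> int:
    """Hypothetical fitness: positions that match a made-up ideal string."""
    ideal = "CCNCOCCS"
    return sum(a == b for a, b in zip(candidate, ideal))


def mutate(candidate: str, rng: random.Random) -> str:
    """Replace one randomly chosen position with a random atom symbol."""
    i = rng.randrange(len(candidate))
    return candidate[:i] + rng.choice(ATOMS) + candidate[i + 1:]


def crossover(a: str, b: str, rng: random.Random) -> str:
    """One-point crossover: head of parent a joined to tail of parent b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(population, generations=30, n_elite=2, seed=0):
    """Return the final population and the best score seen in each generation."""
    rng = random.Random(seed)
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        history.append(score(ranked[0]))
        elite = ranked[:n_elite]  # elitism: best candidates survive unchanged
        parents = ranked[: max(2, len(ranked) // 2)]
        children = []
        while len(elite) + len(children) < len(population):
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = elite + children
    return population, history


start = ["CCCCCCCC", "NNNNNNNN", "OOOOOOOO", "SSSSSSSS", "CNOSCNOS", "SONCSONC"]
final, history = evolve(start)
```

Because the elite candidates are copied forward untouched, the best score recorded in `history` can only stay the same or improve from one generation to the next.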
Our De novo Design Process
At Alfa Chemistry, we mainly apply fragment-based de novo design to support your drug discovery. Our process is as follows:
Find all matching chemical bonds
Structural fragments in the reference molecule library that can be used for transplantation are found by chemical bond matching, which is based on the overlap of the spatial positions of chemical bonds.
We provide two methods for judging whether chemical bonds match: one based on atom pairs and one based on bond centers.
Our experts use the Retrosynthetic Combinatorial Analysis Procedure (RECAP) to filter the matching chemical bonds and select the easily cleaved bonds for structural tailoring.
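The two bond-matching criteria can be sketched as simple geometric tests. This is an illustrative interpretation, not our production algorithm: the function names, the distance and angle tolerances, and the convention that each bond is an ordered pair of 3D atom coordinates are all assumptions. The atom-pair test requires both atoms to overlap, while the bond-center test requires the midpoints to overlap and the bond directions to be roughly parallel.

```python
import math

Vec = tuple[float, float, float]
Bond = tuple[Vec, Vec]  # ordered 3D coordinates of the two bonded atoms


def match_by_atom_pair(b1: Bond, b2: Bond, max_dist: float = 0.3) -> bool:
    """Bonds match if both corresponding atoms lie within max_dist (assumed, in angstroms)."""
    return math.dist(b1[0], b2[0]) <= max_dist and math.dist(b1[1], b2[1]) <= max_dist


def _midpoint(b: Bond) -> Vec:
    return tuple((p + q) / 2 for p, q in zip(b[0], b[1]))


def _direction(b: Bond) -> Vec:
    return tuple(q - p for p, q in zip(b[0], b[1]))


def match_by_bond_center(b1: Bond, b2: Bond,
                         max_center_dist: float = 0.5,
                         max_angle_deg: float = 15.0) -> bool:
    """Bonds match if their midpoints are close and their directions nearly parallel."""
    if math.dist(_midpoint(b1), _midpoint(b2)) > max_center_dist:
        return False
    d1, d2 = _direction(b1), _direction(b2)
    n1, n2 = math.hypot(*d1), math.hypot(*d2)
    if n1 == 0 or n2 == 0:
        return False  # degenerate bond with coincident atoms
    cos_angle = sum(x * y for x, y in zip(d1, d2)) / (n1 * n2)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return math.degrees(math.acos(cos_angle)) <= max_angle_deg


lead_bond = ((0.0, 0.0, 0.0), (1.5, 0.0, 0.0))
reference_bond = ((0.1, 0.05, 0.0), (1.6, 0.1, 0.0))
crossing_bond = ((0.75, -0.75, 0.0), (0.75, 0.75, 0.0))
```

In this toy data, `reference_bond` nearly overlays `lead_bond`, whereas `crossing_bond` shares the same midpoint but is perpendicular, so the direction check is what rejects it.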
Evaluate the affinity of each fragment
Based on an analysis of the target protein structure, the empirical scoring function piecewise linear potential (PLP) is used to predict the affinity of the structural fragments obtained after tailoring.
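A PLP-style score sums a piecewise linear function of distance over fragment-protein atom pairs. The sketch below shows the general shape of such a function under stated assumptions: the breakpoints, well depth, and repulsion height in `STERIC` are illustrative placeholders rather than the parameters used in our service, and a single interaction type is applied to every atom pair, whereas practical implementations distinguish steric and hydrogen-bonding pairs.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PLPParams:
    a: float          # below a: repulsive region
    b: float          # a..b: descent into the well
    c: float          # b..c: flat bottom of the well
    d: float          # c..d: rise back to zero; beyond d: no interaction
    well: float       # well depth (negative = favourable)
    repulsion: float  # energy at zero distance (positive = unfavourable)


# Illustrative placeholder values, not a validated parameter set.
STERIC = PLPParams(a=3.4, b=3.6, c=4.5, d=5.5, well=-0.4, repulsion=20.0)


def plp_pair(r: float, p: PLPParams) -> float:
    """Piecewise linear energy for one atom pair at distance r."""
    if r < p.a:
        return p.repulsion * (p.a - r) / p.a
    if r < p.b:
        return p.well * (r - p.a) / (p.b - p.a)
    if r < p.c:
        return p.well
    if r < p.d:
        return p.well * (p.d - r) / (p.d - p.c)
    return 0.0


def plp_score(fragment_atoms, protein_atoms, params: PLPParams = STERIC) -> float:
    """Sum pair energies over all fragment-protein atom pairs (lower is better)."""
    return sum(
        plp_pair(math.dist(f, q), params)
        for f in fragment_atoms
        for q in protein_atoms
    )


fragment = [(0.0, 0.0, 0.0)]
pocket = [(4.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
total = plp_score(fragment, pocket)
```

The function is continuous at every breakpoint, which keeps the score well behaved when a fragment position is adjusted slightly.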
Transplant fragments with high-affinity
We compare the predicted affinity of each candidate fragment with that of the corresponding structural fragment on the lead compound, and transplant the higher-affinity fragments onto the lead compound. The new molecules generated in this way are the result of the structural optimization of the lead compound.
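The comparison step can be sketched as a simple selection rule. The data layout is hypothetical: `lead_scores` maps each matched bond site to the predicted score of the lead compound's own fragment, `candidates` lists `(site, fragment_id, score)` tuples for reference fragments, and lower scores are assumed to mean higher predicted affinity.

```python
def select_transplants(lead_scores, candidates):
    """For each site, pick the best candidate that beats the lead's own fragment."""
    best = {}
    for site, fragment_id, candidate_score in candidates:
        if site not in lead_scores:
            continue  # no matching bond on the lead compound
        if candidate_score >= lead_scores[site]:
            continue  # not better than the fragment already on the lead
        if site not in best or candidate_score < best[site][1]:
            best[site] = (fragment_id, candidate_score)
    return {site: fragment_id for site, (fragment_id, _) in best.items()}


lead_scores = {"bond_1": -3.0, "bond_2": -5.0}
candidates = [
    ("bond_1", "frag_A", -4.2),
    ("bond_1", "frag_B", -6.1),
    ("bond_2", "frag_C", -4.0),
]
chosen = select_transplants(lead_scores, candidates)
```

Here only `bond_1` receives a transplant, because no candidate at `bond_2` scores better than the lead compound's existing fragment.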
New molecule generation
Following structural optimization, the resulting molecules are placed in the active pocket of the target protein for energy optimization, prediction of their affinity with the target protein, cluster analysis, and drug-likeness filtering.
1) Energy optimization
For each newly generated ligand structure, the Tripos or AMBER molecular force field is used to minimize the energy of the tailored and transplanted molecule within the active pocket of the target protein, so as to obtain a more reasonable binding mode between the new molecule and the target protein.
2) Prediction of affinity with the target protein
We use the empirical scoring function PLP to predict the affinity of the new molecules.
3) Cluster analysis
For new molecules produced after tailoring and transplantation, cluster analysis is performed according to structural similarity. Some typical structures can be quickly found during this process.
4) Drug-like filtering
We use various parameters, including molecular weight, the number of heavy atoms, the number of hydrogen bond donors, the number of hydrogen bond acceptors, the number of rotatable bonds, the number of rings, and the logP value, to filter the new molecules produced by tailoring and transplantation.
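Steps 3 and 4 can be sketched together. The clustering below uses Tanimoto similarity on fingerprints with a simple leader algorithm, and the filter checks descriptor ranges. Several things are assumptions for illustration: fingerprints are represented as sets of on-bit indices, descriptors are supplied precomputed (in practice they would come from a cheminformatics toolkit), and the ranges in `LIMITS` are common rule-of-thumb values rather than the thresholds applied in our service.

```python
from dataclasses import dataclass


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Shared on-bits divided by total distinct on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def leader_cluster(fingerprints: dict, threshold: float = 0.6) -> list:
    """Assign each molecule to the first cluster whose leader is similar enough."""
    clusters = []  # each cluster is a list of names; the first name is the leader
    for name, fp in fingerprints.items():
        for cluster in clusters:
            if tanimoto(fp, fingerprints[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])  # no similar leader found: start a new cluster
    return clusters


@dataclass(frozen=True)
class Descriptors:
    mol_weight: float
    heavy_atoms: int
    h_donors: int
    h_acceptors: int
    rotatable_bonds: int
    rings: int
    logp: float


# Assumed (min, max) ranges for illustration only.
LIMITS = {
    "mol_weight": (150.0, 500.0),
    "heavy_atoms": (10, 50),
    "h_donors": (0, 5),
    "h_acceptors": (0, 10),
    "rotatable_bonds": (0, 10),
    "rings": (0, 6),
    "logp": (-1.0, 5.0),
}


def is_drug_like(d: Descriptors, limits: dict = LIMITS) -> bool:
    """True if every descriptor falls inside its allowed range."""
    return all(lo <= getattr(d, name) <= hi for name, (lo, hi) in limits.items())


fingerprints = {
    "mol_1": frozenset({1, 2, 3, 4}),
    "mol_2": frozenset({1, 2, 3, 4, 5}),
    "mol_3": frozenset({7, 8, 9}),
}
clusters = leader_cluster(fingerprints)
small = Descriptors(350.4, 25, 2, 5, 4, 3, 2.1)
large = Descriptors(720.9, 52, 6, 12, 14, 5, 6.3)
```

The first member of each cluster serves as a representative structure, which is one simple way to surface typical structures quickly. Note that the leader algorithm depends on input order, so a different ordering can give different clusters.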
These newly generated molecules can also serve as starting structures for a new round of tailoring and transplantation, and can be optimized further to obtain molecules with higher affinity.
Our Capabilities for De novo Design
Deep generative model
Alfa Chemistry supports multiple models based on recurrent neural networks (RNN), autoencoders (AE), generative adversarial networks (GAN), transformer models, and hybrid models combining deep generative models and reinforcement learning. The molecular representation in the generative model can take various forms, including chemical fingerprints, simplified molecular-input line-entry system (SMILES) strings, molecular graphs, and three-dimensional structures.
Reinforcement learning (RL) method
We use RL to generate libraries of molecules with specified attributes. Our scientists are able to design compound libraries with specific properties, such as key physicochemical and pharmacological characteristics, specific biological activities, and a desired level of chemical complexity.
Visualization of chemical molecule library
At Alfa Chemistry, we visualize how compounds are distributed in chemical space, so that chemical diversity and the corresponding changes in physical and chemical properties can be observed clearly.
Alfa Chemistry's Advantages
- A flexible and convenient reference molecule library to construct highly druggable molecules
The chemical fragments we use to build molecules are all taken from an external 'reference molecule library' designated by our clients. Since the fragments all come from real molecules, the generated molecules have natural advantages in terms of chemical plausibility and synthetic feasibility.
- Customized de novo drug design
Our clients can also construct a special reference molecule library according to the characteristics of the target protein to realize customized drug design. The molecular design scheme will fully reflect the characteristics of these molecules and is expected to increase the success rate of the synthesis.
- Super-high execution efficiency
Alfa Chemistry's algorithm design avoids the 'combinatorial explosion' problem in the process of assembling fragments because we do not use pre-made fragment libraries. In addition, all the reference molecules have been placed in the binding site of the target protein through molecular docking in advance, so there is no need to search the dihedral angles of newly formed chemical bonds while assembling the molecules.
Our de novo design services reduce costs, support further experiments, and accelerate the drug design process for customers worldwide. Our personalized and comprehensive services are designed to meet your research needs. If you are interested in our services, please don't hesitate to contact us. We are glad to cooperate with you and witness your success!
- Dominique, D. e-LEA3D: a computational-aided drug design web server. Nucleic Acids Research. 2010, 38: 615-621.