With the development of combinatorial chemistry, large screening libraries of compounds can be generated by computer-based virtual synthesis. Handling data at this scale requires tools that integrate chemistry, mathematics, and computer science across the entire chemical information workflow. Large-scale molecular dynamics (MD) simulations likewise produce an immense quantity of data, and principal component analysis (PCA) is one of the most widely used multidimensional data analysis techniques for MD trajectories. PCA is also used to reduce the dimensionality of compound descriptors, which expresses molecular information more simply and reduces the complexity of subsequent calculations.
Figure 1. The molecular dynamics and simulation analysis of the MARK4 kinase-UBA domain. (D) The Principal Component Analysis (PCA) of the 10 ns trajectory: The projection of the first, fifth, tenth and twentieth Eigenvectors obtained from the protein coordinate matrix reveals the total collective motions of the protein and the graph denotes the stability of the predicted model over a 10 ns timescale of the MD simulation. (Jenardhanan, P.; et al. 2014)
Principles of PCA
PCA transforms a set of potentially correlated variables into a set of linearly uncorrelated variables through an orthogonal transformation. The transformed variables are called principal components. Because the first few components usually capture most of the variance, many variables can be summarized in two or three dimensions, reducing the dimensionality of the data while retaining most of its information.
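As a minimal sketch of this principle, the following Python example performs PCA by eigendecomposition of the covariance matrix. It assumes NumPy is available; the function name `pca` and the synthetic five-variable data set are illustrative only and are not part of any specific workflow described here.

```python
import numpy as np

def pca(data, n_components=2):
    """PCA of row-wise samples via eigendecomposition of the covariance matrix."""
    centered = data - data.mean(axis=0)             # center each variable
    cov = np.cov(centered, rowvar=False)            # covariance between variables
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]               # sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]          # orthonormal directions
    scores = centered @ components                  # data in principal-component space
    ratio = eigvals[:n_components] / eigvals.sum()  # fraction of variance explained
    return scores, components, ratio

# Synthetic example (assumption): five correlated variables driven by two hidden factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
data = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

scores, components, ratio = pca(data, n_components=2)
print("explained variance ratio:", ratio)
```

The columns of `scores` are uncorrelated with each other, and the columns of `components` are orthonormal, which is what the orthogonal transformation guarantees.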
- We first remove the translation and overall rotation from the trajectory using Cartesian coordinates.
- Then, long-timescale (millisecond) molecular dynamics simulations, such as the folding of the villin headpiece and the functional dynamics of BPTI, are adopted as input trajectories.
- The molecular dynamics trajectories of the corresponding atoms are extracted, and the analysis is carried out on the last 5 ns (10-15 ns) using 500 frames.
- Finally, the conformational distribution can be obtained from a Cartesian PCA to reflect the dominant overall motion rather than the much smaller internal motion of the protein.
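The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the production workflow: it uses only NumPy, removes translation and rotation with a Kabsch superposition onto the first frame, and runs on a synthetic 500-frame trajectory rather than real simulation data. The helper names `kabsch_fit`, `cartesian_pca`, and `random_rotation` are hypothetical.

```python
import numpy as np

def kabsch_fit(mobile, reference):
    """Center both (n_atoms, 3) arrays and rotate `mobile` onto `reference`."""
    mob = mobile - mobile.mean(axis=0)        # remove translation
    ref = reference - reference.mean(axis=0)
    u, _, vt = np.linalg.svd(mob.T @ ref)
    if np.linalg.det(u @ vt) < 0:             # avoid an improper rotation (reflection)
        u[:, -1] *= -1
    return mob @ (u @ vt)                     # remove overall rotation

def cartesian_pca(traj, n_components=2):
    """PCA of a (n_frames, n_atoms, 3) trajectory after fitting to frame 0."""
    fitted = np.array([kabsch_fit(frame, traj[0]) for frame in traj])
    flat = fitted.reshape(len(traj), -1)      # one row of 3N coordinates per frame
    centered = flat - flat.mean(axis=0)
    cov = centered.T @ centered / (len(traj) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return centered @ eigvecs[:, order], eigvals[order], eigvecs[:, order]

def random_rotation(rng):
    """Random proper rotation matrix, used only to build the test trajectory."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1
    return q

# Synthetic trajectory (assumption): one collective mode plus noise,
# with a random rigid-body rotation and translation applied to every frame.
rng = np.random.default_rng(1)
n_frames, n_atoms = 500, 10
reference = 3.0 * rng.normal(size=(n_atoms, 3))
mode = rng.normal(size=(n_atoms, 3))
mode /= np.linalg.norm(mode)
amplitude = np.sin(np.linspace(0.0, 6.0 * np.pi, n_frames))
traj = np.empty((n_frames, n_atoms, 3))
for i in range(n_frames):
    frame = reference + amplitude[i] * mode + 0.01 * rng.normal(size=(n_atoms, 3))
    traj[i] = frame @ random_rotation(rng) + rng.normal(size=3)

projections, eigvals, eigvecs = cartesian_pca(traj)
print("largest eigenvalues:", eigvals)
```

In this toy case the projection onto the first eigenvector follows the imposed collective motion, while the rigid-body tumbling added to each frame is removed by the fit. For real trajectories, an MD analysis package would normally handle file reading and atom selection.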
Applications of Our PCA
- Analyze the motions of flexible regions in proteins.
- Detect the ill-equilibrated regions of a protein.
- Detect the important motions in biomolecules ranging from proteins to nucleic acids.
- Discriminate relevant conformational changes in a protein from the background of atomic fluctuations by combining PCA with physical models of protein motion.
- Compare the motions of two MD trajectories using their principal components and identify systematic displacements.
- Detect correlations in large data sets and automatically extract information from a molecular dynamics simulation using PCA.
- To identify the overall patterns of motion in the models, we have developed a dedicated PCA process. We calculate the eigenvectors of the covariance matrix of a simulation and filter the trajectory along each eigenvector. The dominant motions observed during the simulation can then be identified by visual inspection.
- In addition to Cartesian coordinates, our experts also use backbone dihedral angles as input for principal component analysis of molecular dynamics simulations. Alfa Chemistry has applied this methodology to the construction of molecular free energy landscapes.
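The dihedral-angle PCA and free energy landscape mentioned in the last item can be illustrated with a short Python sketch. It rests on several assumptions: the angles are mapped to cosine/sine pairs so that periodicity is handled (a common choice for dihedral PCA), the free energy is estimated as -kT ln P from a two-dimensional histogram of the first two components, and the input is a synthetic two-state data set. The function names `dihedral_pca` and `free_energy_surface` are hypothetical and do not describe the actual service implementation.

```python
import numpy as np

KB = 0.0083144626  # gas constant in kJ/(mol K)

def dihedral_pca(angles, n_components=2):
    """PCA on cos/sin-transformed dihedrals; angles is (n_frames, n_dihedrals) in radians."""
    features = np.hstack([np.cos(angles), np.sin(angles)])  # removes the 2*pi periodicity jump
    centered = features - features.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]
    return centered @ eigvecs[:, order], eigvals[order]

def free_energy_surface(pc1, pc2, temperature=300.0, bins=30):
    """Estimate G = -kT ln P on a 2D histogram; unvisited bins are set to +inf."""
    hist, xedges, yedges = np.histogram2d(pc1, pc2, bins=bins)
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):        # log(0) for empty bins is expected
        surface = -KB * temperature * np.log(prob)
    surface -= surface[np.isfinite(surface)].min()  # shift the global minimum to zero
    return surface, xedges, yedges

# Synthetic data (assumption): five dihedrals, two of which switch between
# a state near -1 rad and a state near +1 rad.
rng = np.random.default_rng(2)
n_frames = 2000
state = rng.integers(0, 2, size=n_frames)
angles = 0.2 * rng.normal(size=(n_frames, 5))
angles[:, :2] += np.where(state[:, None] == 1, 1.0, -1.0)

scores, eigvals = dihedral_pca(angles)
surface, xedges, yedges = free_energy_surface(scores[:, 0], scores[:, 1])
print("number of visited bins:", np.isfinite(surface).sum())
```

In this toy example the first component separates the two states, so the surface shows two basins along that axis. Histogram-based estimates depend on the bin count and on how well the trajectory samples each region, so both should be checked before interpreting barrier heights.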
Our principal component analysis (PCA) services reduce costs, support further experiments, and accelerate the drug design process for customers worldwide. Our personalized, comprehensive services are designed to meet the demands of your research. If you are interested in our services, please don't hesitate to contact us. We are glad to cooperate with you and witness your success!
- Jenardhanan, P.; et al. The structural analysis of MARK4 and the exploration of specific inhibitors for the MARK family: a computational approach to obstruct the role of MARK4 in prostate cancer progression. Molecular Biosystems. 2014, 10(7): 1845-1868.