Review Article
Corresponding author: Frangky Sangande (frangky.sangande@gmail.com)
Academic editor: Georgi Momekov
© 2022 Krisyanti Budipramana, Frangky Sangande.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Budipramana K, Sangande F (2022) Molecular docking-based virtual screening: Challenges in hits identification for Anti-SARS-Cov-2 activity. Pharmacia 69(4): 1047-1056. https://doi.org/10.3897/pharmacia.69.e89812
The emergence of severe acute respiratory syndrome coronavirus 2 (SARS-Cov-2) requires finding new drugs or repurposing existing drugs for clinical use. Molecular docking, a structure-based drug design method, provides a fast way to identify hit compounds with antiviral activity against SARS-Cov-2. However, the inherent weaknesses of the docking method, compounded by the limited crystallographic information and the scarcity of reference drugs owing to the novelty of this virus, present challenges in identifying anti-SARS-Cov-2 hits. In the current review, we highlight several aspects that need to be considered to obtain reliable docking results, especially those related to the target structure, docking validation, and virtual hit selection. We discuss several cases pertaining to these issues and approaches that could be used to address them.
consensus docking, hit criteria, homology, ligand efficiency, SARS-Cov-2
Drug discovery and development is commonly a time-consuming process, taking almost 20 years, and carries a very high cost. Drug discovery can be initiated by screening millions of compounds and narrowing them down to progressively smaller sets until the best compound, the hit or lead compound, is found. New methods are still being developed to shorten the time and save money (
Virtual screening comprises two approaches: structure-based drug design (SBDD) and ligand-based drug design (LBDD). LBDD is based on the principle that compounds with similar structures tend to exhibit similar biological activity. Thus, the screening will focus on finding compounds similar to the known active compounds. Meanwhile, in SBDD, it is assumed that biologically active compounds will be able to bind to target molecules (proteins, enzymes, DNA, RNA). Molecular docking is one of the structure-based virtual screening methods commonly used in drug discovery (
The emergence of novel coronavirus disease at the end of 2019 has opened a new chapter in the drug discovery field. Many studies have been carried out to find anti-SARS-Cov-2 drugs. Interestingly, the current treatment of SARS-Cov-2 is dominated by reusing existing drugs such as chloroquine/hydroxychloroquine, lopinavir/ritonavir, ribavirin, oseltamivir, remdesivir, and favipiravir (
According to our search of PubMed using “docking screening covid” as the query, there were 398 articles in 2020, increasing to 709 articles in 2021, which indicates that scientists are interested in applying molecular docking to anti-SARS-Cov-2 drug discovery. Unfortunately, several articles have suggested that caution is required when drawing conclusions from molecular docking results because the method has some limitations (
In structure-based drug design, molecular docking models the interaction between a ligand and the target molecule and then calculates the binding energy of the complex to distinguish binders from non-binders. For good results, this method requires a high-resolution 3D structure of the target generated from X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, or homology modeling (
In the case of SARS-Cov-2, crystal structures of some proteins are still scarce because of the novelty of the virus. Thus, the structures of these proteins were typically generated using homology modeling based on the SARS-Cov structure as the template (
According to our findings, 11.6% of articles used homology models in their docking studies for several targets such as TMPRSS2 (
Meanwhile, to answer the question of whether the general rule that only homology models built from templates with a sequence identity > 50% of the template are suitable for docking study,
Concerning SARS-Cov-2, we found that several studies were carried out before the target structures became available in the Protein Data Bank. As a result, homology models were used in their virtual screening process. Two studies compared their homology models of Mpro (
On the other hand, although some studies in our review used the actual target structures, this still poses challenges, especially when dealing with the apo-form (
No | Targets | PDB ID |
---|---|---|
1 | Spike | 6LZG, 6M0J, 6M17, 6VSB, 6W41, 6X6P, 7BZ5 |
2 | ACE2 | 1R4L*, 1R42, 6LZG, 6M0J, 6VSB, 2AJF, 6VW1 |
3 | TMPRSS2 | 7MEQ* |
4 | RdRp | 6M71, 7BV2*, 7BW4 |
5 | Nucleocapsid | 7M4R, 6VYO, 6ZCO |
6 | Mpro | 6LU7*, 4MDS*, 6Y2E, 6Y2F*, 6Y2G*, 6Y7M*, 6Y84, 6YB7*, 5R7Y*, 5R80*, 5R82*, 5R84*, 5RF7*, 5RFS*, 6LZE*, 6M03, 6M0K*, 6M2N*, 6W63*, 7BQY*, 7BRP*, 7JYC* |
7 | Plpro | 4OVZ*, 4OW0*, 6W9C, 6WX4*, 7JN2*, 7JRN* |
8 | Nsp3 | 6W02*, 7BF6* |
9 | Nsp9 | 6W4B |
10 | Nsp10 | 6W4H*, 6YZ1* |
11 | Nsp15 | 6VWW, 6W01* |
12 | Nsp16 | 6W4H*, 6YZ1*, 6WKQ* |
Naturally, a protein may have several binding site shapes depending on the type of ligand bound to it as a result of the induced-fit effect and side-chain flexibility. For example, the crystal structure of monoamine oxidase B (MAO-B) complexed with pioglitazone (PDB: 4A79) has a binding site volume of 97 Å³, whereas when MAO-B is complexed with 1,4-diphenyl-2-butene (PDB: 1OJ9), the binding site volume increases to 289 Å³ (
Molecular docking consists of two main stages, namely sampling binding poses of the ligand in the active site of the macromolecule and predicting the binding energy, expressed as a docking score, for each pose using scoring functions (
The inaccuracy of the scoring function in molecular docking could be caused by neglecting some parameters (e.g., solvation effects, flexibility, polarization effects) required for the binding energy calculation (
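Docking power is usually checked by redocking: the co-crystallized ligand is docked back into its binding site and the heavy-atom RMSD between the predicted and experimental poses is computed. The following is a minimal sketch of that calculation, assuming both poses list the same atoms in the same order and share one coordinate frame. The function names and coordinates are hypothetical, the 2.0 Å cutoff is a widely used convention rather than a value taken from this review, and no correction for ligand symmetry is applied.

```python
import math


def pose_rmsd(reference, predicted):
    """Heavy-atom RMSD (in Å) between two poses with identical atom ordering."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("poses must be non-empty and contain the same atoms")
    squared = sum(
        (xr - xp) ** 2 + (yr - yp) ** 2 + (zr - zp) ** 2
        for (xr, yr, zr), (xp, yp, zp) in zip(reference, predicted)
    )
    return math.sqrt(squared / len(reference))


def redocking_passes(reference, predicted, cutoff=2.0):
    """True if the redocked pose lies within the assumed RMSD cutoff (Å)."""
    return pose_rmsd(reference, predicted) <= cutoff


# Hypothetical coordinates: the redocked pose is shifted by 1 Å along z.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
redocked = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]

print(pose_rmsd(crystal, redocked))
print(redocking_passes(crystal, redocked))
```

A symmetric ligand can give a misleadingly high RMSD with this naive atom-by-atom comparison, so dedicated tools that account for equivalent atoms are preferable in practice.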
In addition to docking power, there is another validation type commonly used in molecular docking, namely screening power assessment, by conducting a retrospective study where a docking protocol is used to screen a database consisting of active and inactive (decoy) compounds (
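Screening power from such a retrospective run is often summarized by the enrichment factor (EF) and the area under the ROC curve (AUC). The sketch below illustrates both, assuming that more negative docking scores are better. The function names and the ten-compound library are hypothetical and are not data from the reviewed studies.

```python
def enrichment_factor(scored, fraction):
    """EF at a given top fraction.

    scored: list of (docking_score, is_active); more negative scores rank first.
    """
    ranked = sorted(scored, key=lambda item: item[0])
    n_total = len(ranked)
    n_actives = sum(1 for _, active in ranked if active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty library with at least one active")
    n_top = max(1, round(n_total * fraction))
    hits_top = sum(1 for _, active in ranked[:n_top] if active)
    return (hits_top / n_top) / (n_actives / n_total)


def roc_auc(scored):
    """Probability that a random active outscores a random decoy (ties count half)."""
    actives = [score for score, active in scored if active]
    decoys = [score for score, active in scored if not active]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


# Hypothetical retrospective library: 2 actives and 8 decoys.
library = [
    (-9.5, True), (-7.0, True),
    (-9.0, False), (-8.0, False), (-6.5, False), (-6.0, False),
    (-5.5, False), (-5.0, False), (-4.5, False), (-4.0, False),
]

print(enrichment_factor(library, 0.2))  # 2.5
print(roc_auc(library))                 # 0.875
```

An EF above 1 and an AUC above 0.5 indicate that the protocol ranks actives ahead of decoys better than random selection would.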

As shown in Fig.
In this review, we found that 71% of the articles performed their docking screening without validation data (Fig.
MD simulation can overcome some of the limitations of molecular docking mentioned previously. A ligand-protein complex with an incorrect binding pose obtained from docking can be identified by MD simulation, because such a complex produces an unstable MD trajectory characterized by an increasing ligand RMSD profile. Moreover, an improvement in hit enrichment in retrospective virtual screening experiments was obtained when the docking poses were rescored using MM/PB(GB)SA after MD simulation (
On the other hand, MD can also generate multiple conformations of the protein target of interest, in addition to multiple experimental structures, which can be used for ensemble docking, an approach that aims to account for binding site flexibility in molecular docking (
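One simple way to combine ensemble docking results is to keep, for each ligand, the best score obtained across all receptor conformations. The sketch below illustrates that idea only; the conformation labels, ligand names, and scores are hypothetical, and other aggregation rules (such as averaging over conformations) are equally possible.

```python
def ensemble_best_scores(scores_by_conformation):
    """Return {ligand: (best_score, conformation)} across an ensemble.

    scores_by_conformation: {conformation_id: {ligand_id: docking_score}},
    where more negative scores are assumed to be better.
    """
    best = {}
    for conformation, ligand_scores in scores_by_conformation.items():
        for ligand, score in ligand_scores.items():
            if ligand not in best or score < best[ligand][0]:
                best[ligand] = (score, conformation)
    return best


# Hypothetical scores for one crystal structure and two MD snapshots.
ensemble = {
    "crystal": {"lig_A": -7.1, "lig_B": -8.0},
    "md_frame_1": {"lig_A": -8.4, "lig_B": -7.2},
    "md_frame_2": {"lig_A": -6.9, "lig_B": -7.9},
}

for ligand, (score, conformation) in sorted(ensemble_best_scores(ensemble).items()):
    print(ligand, score, conformation)
```

In this made-up example, lig_A scores best against an MD snapshot rather than the crystal structure, which is the situation ensemble docking is meant to capture.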
After performing a molecular docking-based screening against a database of ligands, the next important step is post-processing of the docking results to select the virtual hits, which are ideally top-ranked compounds. Generally, 0.1–2.5% of the top-ranked compounds are considered for experimental confirmation (
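To give a sense of scale, 0.1–2.5% of a 10,000-compound library corresponds to between 10 and 250 compounds. The following sketch selects such a top fraction from a score table, assuming more negative scores rank first; the library and its scores are synthetic placeholders, not screening results.

```python
def select_top_fraction(scores, fraction):
    """Return the names of the top-ranked compounds for a given fraction.

    scores: {compound_name: docking_score}; more negative scores rank first.
    """
    n_keep = max(1, round(len(scores) * fraction))
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return [name for name, _ in ranked[:n_keep]]


# Synthetic library of 10,000 compounds with made-up scores.
library_scores = {f"cpd_{i:05d}": -4.0 - 0.0005 * i for i in range(10000)}

print(len(select_top_fraction(library_scores, 0.001)))  # 10
print(len(select_top_fraction(library_scores, 0.025)))  # 250
```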
In docking simulations, it is important to take into account the binding pattern of a ligand to the target protein. Several proteins have been known to have key residues on their active site that can interact with a ligand to produce or improve biological activity. For example, Met769 in the hinge region of EGFR (PDB:1M17) has been reported to be an important residue for ligand inhibitory activity, particularly via the hydrogen bond. The presence of this bond in inhibitors can increase their activity (
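A binding-pattern criterion of this kind can be applied as a simple post-docking filter that keeps only poses showing the required key interaction. The sketch below assumes the contacts of each pose have already been extracted by some interaction-analysis tool; the data structure, compound names, and contact lists are hypothetical and are not docking results.

```python
def has_key_interactions(pose_contacts, required):
    """True if a pose shows every required (residue, interaction_type) contact."""
    return required.issubset(pose_contacts)


# Required contact taken from the EGFR example in the text.
required_contacts = {("Met769", "hbond")}

# Made-up contact lists for two docked compounds.
poses = {
    "cpd_1": {("Met769", "hbond"), ("Leu694", "hydrophobic")},
    "cpd_2": {("Leu694", "hydrophobic")},
}

kept = sorted(
    name
    for name, contacts in poses.items()
    if has_key_interactions(contacts, required_contacts)
)
print(kept)  # ['cpd_1']
```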
In the case of SARS-Cov-2, several targets have no crystallographic data, especially in holo-form. To work around this issue, several articles docked existing drugs that showed clinical benefit in SARS-Cov-2 treatment to the target of interest and took their binding pattern as a reference. This strategy is valid as long as the reference drugs have been confirmed to act on the intended target. Surprisingly, we found that several studies docked hydroxychloroquine and remdesivir to Mpro and used their binding patterns for selecting hits. To the best of our knowledge, remdesivir is an RdRp inhibitor (
In molecular docking, there are three types of scoring functions (SF): force field, empirical, and knowledge-based scoring function (
In virtual screening, the consensus docking method can be applied by averaging the score of compounds in each docking tool to generate a new ranking order or by selecting the compounds that are top-scored by all docking tools used (
The top ten compounds on each docking tool, with four compounds (highlighted) found on both ranking lists.
Rank | DOCK compound | DOCK score | Vina compound | Vina score |
---|---|---|---|---|
1 | ZINC67912533 | -49.206 | **ZINC67912780** | -12.1 |
2 | **ZINC67912770** | -42.197 | ZINC67912765 | -11.9 |
3 | ZINC67912536 | -41.430 | **ZINC67912770** | -11.6 |
4 | **ZINC67912780** | -38.277 | ZINC67912773 | -11.4 |
5 | ZINC67912532 | -37.362 | ZINC49823152 | -11.2 |
6 | ZINC72320416 | -36.851 | **ZINC28882432** | -11.2 |
7 | ZINC67912525 | -36.756 | ZINC67902892 | -11.1 |
8 | **ZINC72320169** | -35.085 | ZINC08234294 | -11.0 |
9 | **ZINC28882432** | -33.335 | ZINC77269187 | -10.8 |
10 | ZINC03643476 | -33.309 | **ZINC72320169** | -10.7 |
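The intersection form of consensus docking can be reproduced directly from the two top-ten lists in the table above. The sketch below finds the compounds present in both lists and then orders them by mean rank. Ordering by mean rank rather than by averaged raw scores is an assumption of this sketch, chosen because DOCK and Vina report scores on different scales; the function names are hypothetical.

```python
# Top-ten compounds from the table, listed in rank order for each tool.
dock_ranked = [
    "ZINC67912533", "ZINC67912770", "ZINC67912536", "ZINC67912780",
    "ZINC67912532", "ZINC72320416", "ZINC67912525", "ZINC72320169",
    "ZINC28882432", "ZINC03643476",
]
vina_ranked = [
    "ZINC67912780", "ZINC67912765", "ZINC67912770", "ZINC67912773",
    "ZINC49823152", "ZINC28882432", "ZINC67902892", "ZINC08234294",
    "ZINC77269187", "ZINC72320169",
]


def consensus_by_intersection(*ranked_lists):
    """Compounds that appear in every ranking list."""
    shared = set(ranked_lists[0])
    for ranking in ranked_lists[1:]:
        shared &= set(ranking)
    return shared


def mean_rank(compound, *ranked_lists):
    """Average 1-based rank of a compound across the ranking lists."""
    return sum(r.index(compound) + 1 for r in ranked_lists) / len(ranked_lists)


shared = consensus_by_intersection(dock_ranked, vina_ranked)
consensus_order = sorted(
    shared, key=lambda c: (mean_rank(c, dock_ranked, vina_ranked), c)
)

for compound in consensus_order:
    print(compound, mean_rank(compound, dock_ranked, vina_ranked))
```

This yields the four shared compounds, with ZINC67912770 and ZINC67912780 tied at a mean rank of 2.5, followed by ZINC28882432 (7.5) and ZINC72320169 (9.0).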
In the current review, we found that 9.1% of articles used consensus docking in their studies. Interestingly, one article applied a stricter consensus method by also taking into account the binding poses of the ligands produced by three docking tools: Glide, FRED, and Vina. In the two previous studies mentioned above, it is not clear whether the two docking tools produced similar binding poses. Briefly,
Docking scores also suffer from a bias related to the molecular weight (MW) of the ligand since, in most cases, the docking score becomes more favorable as the MW increases (
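A common way to correct for this size bias is ligand efficiency (LE), generally defined as the negative binding free energy divided by the number of heavy (non-hydrogen) atoms. The sketch below uses the docking score as a stand-in for the binding free energy, which is an assumption of this example; the compound names, scores, and atom counts are hypothetical.

```python
def ligand_efficiency(docking_score, heavy_atoms):
    """LE = -score / heavy atom count (score assumed to be in kcal/mol)."""
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    return -docking_score / heavy_atoms


# Hypothetical candidates: (docking score, number of heavy atoms).
candidates = {
    "large_ligand": (-10.5, 42),
    "small_ligand": (-7.8, 20),
}

by_le = sorted(
    candidates,
    key=lambda name: ligand_efficiency(*candidates[name]),
    reverse=True,
)

for name in by_le:
    print(name, round(ligand_efficiency(*candidates[name]), 2))
```

Here the larger ligand has the better raw score, but the smaller ligand ranks first by LE (0.39 versus 0.25 per heavy atom), illustrating how the two criteria can disagree.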
In virtual screening, ligands with high LE values indicate that they use each atom efficiently to bind to the target and should be prioritized as hits. As pointed out by
Molecular docking provides an effective method for identifying hits in the early stages of drug discovery. So far, docking scores have generally been used as the criterion for selecting hits. Unfortunately, the simplifications inherent in molecular docking decrease the accuracy of the docking score, thus increasing false positives. In the current review, we have discussed several aspects that may affect docking performance as a result of the inaccuracy of the docking score.
Whenever possible, the actual experimental structure (X-ray or NMR) should be used, especially in holo form, because redocking validation and information on ligand binding are only available for holo structures. For apo structures and homology models, as well as holo structures, screening power validation can be performed if known inhibitors are available. Fortunately, several studies have reported the in vitro activity of their hits against the SARS-Cov-2 target of interest, so these data might be used to generate active and decoy sets.
Because SARS-Cov-2 is a novel virus, the information on its target proteins is limited. Relying solely on the docking score to pick hits has the potential to increase false positives, especially without validation or MD confirmation. Therefore, strict virtual hit selection should be applied. There are many strategies available for this purpose. In the current review, we suggest considering ensemble docking, consensus pose and scoring, and ligand efficiency to improve the accuracy of docking results, especially against a novel target.