Molecular docking of secondary metabolites from Indonesian marine and terrestrial organisms targeting SARS-CoV-2 ACE-2, Mpro, and PLpro receptors

With the uncontrolled spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), development and distribution of antiviral drugs and vaccines have gained tremendous importance. This study focused on two viral proteases namely main protease (Mpro) and papain-like protease (PLpro) and human angiotensin-converting enzyme (ACE-2) to identify which of these are essential for viral replication. We screened 102 secondary metabolites against SARS-CoV-2 isolated from 36 terrestrial plants and 36 marine organisms from Indonesian biodiversity. These organisms are typically presumed to have antiviral effects, and some of them have been used as an immunomodulatory activity in traditional medicine. For the molecular docking procedure to obtain Gibbs free energy value (∆G), toxicity, ADME and Lipinski, AutoDock Vina was used. In this study, five secondary metabolites, namely corilagin, dieckol, phlorofucofuroeckol A, proanthocyanidins, and isovitexin, were found to inhibit ACE-2, Mpro, and PLpro receptors in SARS-CoV-2, with a high affinity to the same sites of ptilidepsin, remdesivir, and chloroquine as the control molecules. This study was delimited to molecular docking without any validation by simulations concerned with molecular dynamics. The interactions with two viral proteases and human ACE-2 may play a key role in developing antiviral drugs for five active compounds. In future, we intend to investigate antiviral drugs and the mechanisms of action by in vitro study.


Introduction
The coronavirus disease , caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) or the novel coronavirus, is a highly transmittable and pathogenic viral infection Shereen et al. 2020) and the fifth pandemic after the 1918 flu pande-to 1.52 M by April 1, 2020 in all Indonesian provinces (Tosepu et al. 2020).
Numerous drugs and vaccines have been studied and developed further to fight SARS-CoV-2. Therefore, the secondary metabolites isolated from plants as the source of drug compounds have been examined and stated that they had potential antiviral activity to the group of coronavirus (Orhan and Deniz 2020). It is well-known that Indonesia has mega biodiversity and enormous biological resources in terms of flora, fauna, and macro-and microorganisms in the terrestrial region and ocean (Yap et al. 2019). The terrestrial and marine environments offer new abundant bioactive sources of molecules with unique structural features and various biological activities, such as antiviral, antimicrobial, antifungal, antiparasitic, antioxidant, antitubercular, anti-inflammatory, and antitumor (Terstappen and Reggiani 2001). The production of drug candidates from secondary metabolite sources to produce new bioactive compounds are often characterized by structural novelty, complexity, and diversity (Das et al. 2020;Lan et al. 2020).
It is known that SARS-CoV-2 mainly affects individuals with a weak immune system, and thus, enhancing immunity is one of the best ways to combat SARS-CoV-2. An integrated approach may not only enhance immunity, but also prevent any further complications (Dalan et al. 2020). Numerous researchers worldwide have expressed their interest in developing an effectual vaccine to combat CO-VID-19. Nigella sativa has been used for medicinal purposes for its intense immunoregulatory, anti-inflammatory, and antioxidant benefits in obstructive respiratory disorders. Research has found that Thymoquinone, Nigellidine, and α-hederin from N. sativa can be potential influencers in reinforcing the immune response modulation at the molecular docking level (Hosseini et al. 2021). In a methodical research, docking was observed in gingerol, paradol, shogaol, and zingerone, and thus, they were deduced to be effective active compounds. Two compounds namely 6-gingerol and 10-gingerol formed covalent bonds with the enzyme, indicating that they are the most potent for drug development among the active components of ginger. The chemical structure of all the active components was drawn using an in silico prediction (Ibrahim et al. 2020). Exploring new medicines by investigating a large number of medicinal plants and identifying the active components in the treatment for SARS-CoV-2 can be accomplished through the interaction of compound drugs with the proteins associated with the diseases. An analysis with computer-aided drug design (bioinformatics) can provide meaningful and pertinent information. A study successfully demonstrated Indonesian medicinal plants' use to determine the potential candidate compounds effectual as supportive SARS-CoV-2 therapy. The researchers who conducted the study used molecular docking approach to analyze the interactions between the 3CL pro protein (main protease) with 14 hit compounds from the chemical medicine database (PubChem (https://pubchem.ncbi.nlm.nih. gov/)). The same study affirmed six potential compounds as the main proteases of SARS-CoV-2 inhibitor. More spe-cifically, Hesperidin, Kaempferol-3,4'-di-O-methyl ether (Ermanin), Myricetin-3-glucoside, Peonidine 3-(4'-arabinosylglucoside), Quercetin 3-(2G-rhamnosylrutinoside), and Rhamnetin 3-mannosyl-(1-2)-alloside were demonstrated as potential candidates for antiviral for SARS-CoV-2 (Naidoo et al. 2020).
An in silico approach can reduce the extensive exploration of numerous secondary metabolites to a fewer selected potential chemical compounds and is also efficacious as the initial step of drug discovery research (Terstappen and Reggiani 2001;Yap et al. 2019). Researchers worldwide are attempting to find adequate medication to combat the COVID-19 pandemic. Thus, investigating a potential compound for fighting SARS-CoV-2 by applying in silico antiviral peptides will assuredly be effectual to simplify rigorous experimentations (Fakih et al. 2020). The antiviral approaches consist of inhibiting the synthesis of an RNA virus and replicating the virus, thereby restricting the virus from binding to human cell receptors and obviating its self-forming process. Research had shown that when SARS-CoV-2 infected humans, the receptor-binding domain of the virus's spike protein was attached to angiotensin-converting enzyme (ACE-2) as the primary cell receptor (Das et al. 2020;Lan et al. 2020). ACE-2 is in charge of the renin-angiotensin system's major role to regulate blood pressure, fluid, and electrolyte balance. A previous study involving in vivo assay concluded that respiratory tract epithelia infected by SARS-CoV-2 were pertinent to the state of cell differentiation and ACE-2 expression. The infection is more likely to develop during sufficiently differentiated ciliated cells of increasing ACE-2 expression. The same research further showed that the expression of ACE-2 membrane and plasma level was decreased and the amplification of pulmonary wound after being infected by SARS-CoV-2 (Dalan et al. 2020). SARS-CoV-2 genomes encode 16 non-structural proteins. Two of them are the main protease (M pro ) and papain-like protease (PL pro ), responsible for replicating protein's proteolytic processing, likewise eliminating viral replication (Ibrahim et al. 2020;Hosseini et al. 2021). The essential roles of ACE-2, M pro , and PL pro have rendered them as potential targets for discovering effectual medication against SARS-CoV-2 using molecular docking analysis (Naidoo et al. 2020). Numerous docking research studies have been conducted using M pro , PL pro , and ACE-2 receptors to discover a natural compound to fight against SARS-CoV-2. Flavonoids such as rutin, isorhamnetin-3-O-β-D, and calendoflaside from Calendula officinalis have exhibited a superlative binding energy to main protease of SARS-CoV-2 than the native ligand (N3) (Joshi et al. 2020). Novel lichen compounds have also been investigated and have manifested a strong affinity toward M pro and thus, can be established to be effective antiviral against SARS-CoV-2 1 (Joshi et al. 2020) 7 . Research has validated the combination of PL pro and M pro as an effective receptor in docking research to discover potential antiviral compounds from ebselene derivatives and cyanobacterial metabolites (Naidoo et al. 2020;Zmudzinski et al. 2020). Virtual screening focusing on ACE-2 as the viral entry has been rigorously studied for several compounds and gave selected compounds to confirm an in vitro assay (Benítez-Cardoza and Vique-Sánchez 2020).
Molecular docking is used as a virtual screening to determine potential inhibitor candidates at several critical receptors in controlling the replication of SARS-CoV-2. Besides, pharmacochemical analyses, such as Absorption, Distribution, Metabolism and Excretion (ADME) analysis and toxicity prediction, are also used to complement molecular docking analysis recommendations. This research uses AutoDock Vina with a Lamarckian genetic algorithm. This software can optimize the interaction determination of potential candidate compounds. The ongoing COVID pandemic has necessitated an extensive research on the selection of the candidates found in active compounds of Indonesian biodiversity, both from organisms on land and sea with various bioactivity spectrums, that can effectively inhibit SARS-CoV-2 replication. This study strategically selected 102 potentially active compounds with target tethering to the ACE-2, PM pro , and PL pro receptors. The potential was appraised in terms of the Gibbs free energy generated in the interaction, orientation, conformation, ADME analysis, and the associated toxicity analysis. This study is crucial as a recommendation for potential active compounds in identifying the effectual inhibitors of SARS-CoV-2 replication.

Software and hardware
The whole research is computational using software and hardware involved in compound preparation, molecular docking, and analysis. We used the PubChem database for the compound preparation step for ligand molecules and Protein Data Bank (PDB) database for proteins. The preparation stage also removes water molecules, adds hydrogen atoms, and determines the receptor's ligand docking location using the AutoDockTools (ADT) software-1.5.6. Tethering molecules was done using AutoDock Vina, and the results were visualized using Discovery Studio 2020 and PyMol (Trott et al. 2010).

Ligand preparation
All chemical structures of natural materials from marine and terrestrial were obtained from PubChem (PubChem (nih.gov)) based on a thorough literature review. This study used several control compounds in the form of active compounds that were reported to have been used in the treatment of SARS-CoV-2 (Table 1).
MarvinSketch 20.12 was used to optimize each ligand's 2D and 3D structures and the conversion from *mol files to *PDB and *PDBQT. PDBQT file types were generally used for molecular docking simulations. The ligands used were also analyzed for toxicity and solubility using the Lipinski Rule (Lipinski et al. 2004), toxicity analysis using ADME (Cheng et al. 2012).

Ligand-receptor docking
Preparation of tethering between ligands and receptors was done using ADT 1.5.6 and AutoDock Vina 4.2 programs. The grid box's determination is one of the parameters used as the ligand docking region at the receptor. This study used 105 ligand-binding poses to the receptor by applying the Lamarckian Genetic Algorithm (LGA) to explore the ligand conformation (Morris et al. 1998). Molecular docking can assist in virtual screening by producing binding affinity energy (∆G). The ligand with the maximum potential has a more negative value of ∆G. Discovery Studio was used to visualize the molecular docking between ligands and

Results and discussion
The active compounds of most of the Indonesian marine and terrestrial organisms used in this study have potential pharmaceutical applications. Table 1 encapsulates different classes of compounds that have been discovered, such as terpenoids, lipids, peptides, alkaloids, fatty acids, lactones, dioxins, xanthone, flavonoid, and phenols. Both terrestrial and marine organisms can produce immunomodulator and antiviral activities (i.e., against HCV, HBV, HSV, HIV, TMV, ISA, SINV, HCMC, JEV, and EV-71). The usefulness of species of Indonesian terrestrial organisms in this study was based on a thorough literature study. Most researchers believe these plants can enrich health and prevent diseases caused by the COVID-19 pandemic even though most of these plants have not been validated as effectual against COVID-19 heretofore. We use marine bacteria, fungi, microalgae, marine plants, and marine invertebrate for marine compounds.
Several studies have affirmed that terrestrial and marine organisms produce various compounds derived from secondary metabolism, requiring antiviral activity. More particularly, certain terrestrial and marine metabolites are active against some viruses. Besides, various mechanisms and different targets of action have been discovered. Mechanisms of action of possible antiviral compounds are diversified because they can block viruses at different stages of their life cycles. Typical viral life cycle stages are attachment, penetration, uncoating, replication, assembly, and release (Ibrahim et al. 2020). We aimed to procure information through molecular docking regarding plants and marine compounds that have the antiviral activity to be used against SARS-CoV-2.
As a comparative study, we also used common drugs, namely Remdesivir, Chloroquine, and Favipiravir, which have been used as common drugs for specific diseases and can be used to expedite the treatment process of SARS-CoV-2. For instance, Favipiravir is one such oral drug approved for a new and resurfacing influenza pandemic in Japan in 2014 and has exhibited potent in vitro activity against SARS-CoV-2 (Hosseini et al. 2021). We also use Plitidepsin, originally from Ascidian, approved for drug marketing and reported to have potent preclinical efficacy against SARS-CoV-2 by targeting the host protein eEF1A (Lan et al. 2020).
The analysis on the structure-activity relationship understanding is essential for further study; for example, analysis on the structure-activity relationship revealed that the hydroquinone moiety and the double bonds at carbon numbers-5 and -9 in metachromatic A are crucial for anti-HBV activity (Sherren et al. 2020). Another example MGDG could cause complete lysis of the viral envelope, which is essential for viral attachment to host cells. This phenomenon likely explains the virucidal action of MGDG ).

Receptor structure and stability
SARS-CoV-2 M pro receptor with a PDB ID of 6LU7 has a resolution of 2.16 Å with 306 amino acids. The M pro SARS- In the β-sheet structure of M pro protein, it was known that the β1-β7 position is an antiparallel β-sheet structure, while the β8-β1-13 position is a β-sheet structure with both parallel and antiparallel β-sheet structures. The remaining areas are loop or turn areas composed of amino acids connecting the alpha and beta structures.   The M pro receptor has been meticulously analyzed for its structure and stability using the Ramachandran's plot. Stable structures have a resolution of less than 2.5 Å with amino acid residues in the protein-forming region above 85% (Trott et al. 2010). The M pro SARS-CoV-2 receptor has a resolution below 2.5 Å with the amino acid residue scattered in the region that makes up the protein structure is 92.15% (282/306). The M pro receptor has 14 residues that interact with ligands, namely Thr25, Thr25, Thr26, His41, Phe140, Leu141, Asn142, Gly143, Ser144, Cys145, His163, His164, Glu166, and Gln189. The grid box dimensions (x, y, z) (Å) are (50, 50, 50), with centers (-10.850, -15.320, 68.390) (Daczkowski et al. 2017).

Solubility and Lipinski analysis on ligands
All ligands that have been optimized for the structure were then analyzed for solubility and permeability. The ligand structure's psychochemical properties can describe the solubility and permeability analysis of a drug candidate. It has been reported that there are five Lipinski's rules that can predict the solubility and permeability of a ligand, which are: 1) a ligand should not have five hydrogen bond donors, 2) a ligand should not have more than 10 hydrogen bond acceptors, 3) a ligand should not have a molecular weight more than 500 Da, 4) a ligand should not have log P value less than five, 5) a ligand should not have polar surface area (PSA) more than 140 A with ligand rotational bonds less than 10 30. This study used four parameters: molecular weight, donor H, acceptor H, and log P.
With the psychochemical analysis of the ligands used as SARS-CoV-2 inhibitor candidates, it was found that the molecular weights of the ligands analyzed ranged between 100 and 1,200 Da. The smallest molecular weight was 136.23 Da, and the largest 1526.45. Also, the log P value ranged between -16 and 18. The hydrogen bond donor value was between 0 and 29, with the hydrogen bond acceptor value between 0 and 47. Based on the ligand analysis of Lipinski's rule, it was found that 17 ligands were not included in Lipinski's rule, i.e., Methyl 3, 5-di-O-caffeoyl quinate, Proanthocyanidins, Rutin, Corilagin, cyanidin 3-O-sambubioside), delphinidin-3-O-sambubioside, ptilidepsin, remdesivir, Antimycin A, Cadiolide B, Chitosan, Dieckol, EPS (Exopolysaccharide), Mollamide F, Phlorofucofuroeckol A, Prunolide A, Thalassiolin D, and Thalassodendrone.
The ADME data and Lipinski's rule demonstrated that among the top five metabolite compounds obtained based on the Gibbs free energy (∆G), namely corilagin, dieckol, phlorofucofuroeckol A, proanthocyanidins A, isovitexin, only isovitexin meets Lipinski's rules. Lipinski's rule also does not permit the control of Remdesivir and Ptilidepsin, the two compounds that have been used in the treatment of COVID-19. This is interesting as the absence of corilagin, dieckol, phlorofucofuroeckol A, and proanthocyanidins A is related to molecular sizes larger than 500 Da. It was known that corilagin, dieckol, phlorofucofuroeckol A, proanthocyanidins, remdesivir, and ptilidepsin have molecular weights 634.45 Da, 742.55 Da, 602.46 Da, 592.22 Da, 1110.34, and 602.58 Da, respectively. Based on Lipinski's analysis, molecular weight < 500 Da is more accessible to mass produce, more stable, easy to apply as an oral drug. Based on this analysis, metabolite compounds can be applied in medicine, such as COVID-19 with further observations on drug delivery and other pharmacological conditions.

Toxicity analysis with admetSAR
Ligand toxicity analysis aims to determine the level of toxicity of a ligand using the admetSAR site. This method was used based on the activity-structure relationship used to predict the pharmacokinetic level and toxicity of a ligand. In this study, carcinogenicity (binary and trinary), hepatotoxicity, and acute oral toxicity based on class and ligand concentration were used. The carcinogenicity assessment was based on data on the Carcinogenic Potency Database (CPDB), where there are 1,547 chemical structures with tumor data on rodents that have carcinogenic potential as seen from the TD50 value. The screening method for carcinogenic compounds uses the Morgan fingerprint and the k-nearest neighbors (kNN) method. Meanwhile, the hepatotoxicity assessment was based on the DrugBank database with 3,115 toxic compounds and 593 non-toxic compounds. The ligands were prepared using Pipeline Pilot, with the inorganic compounds, large molecular compounds (> 800 Da), and inorganic salts in the mixture removed. The acute oral toxicity assessment was based on a database wherein there were 10,207 compounds with LD50 in a mouse model 24 . Carcinogenicity analysis of compounds using admet-SAR was classified into two models, binary and trinary.
The development of these models from admetSAR to predict the carcinogenicity of a compound was accomplished using five machine learning methods, namely support vector machine (SVM), kNN, random forest (RF), C4.5 decision tree (DT), and naïve Bayes (NB), combined with six types of fingerprints. However, the best binary and trinary models were constructed using the kNN and SVM algorithms with the MACCS fingerprint. The binary classification aims to distinguish chemical compounds with various structures into carcinogenic active and inactive (Lipinski et al. 2004). In contrast, the trinary classification, the three-class classification, aims to predict a chemical compound's carcinogenic potency as non-required, warning, and danger based on the median toxic dose (TD50), the dose required to cause a toxic effect in 50% of the population (Cheng et al. 2012). If a compound was categorized as non-required, it indicates that the compound is non-carcinogenic; a warning means that the compound is carcinogenic with a TD50 > 10 mg/kg body weight/ day, while danger implies a carcinogenic compound with a TD50 ≥ 10 mg/kg body weight/day. In general, the results of the carcinogenicity analysis of compounds using admetSAR with binary model revealed that the majority of compounds are non-carcinogenic, only one out of 105 compounds indicated to be carcinogenic, there is cinna-  In the carcinogenicity analysis results using the binary model, the non-carcinogenic compounds were marked with a negative sign (-) with a range of probability or accuracy values between 0.5075 and 1.
In contrast, the compounds predicted to be carcinogenic have a positive sign (+) with a probability value of 0.5714. Besides the carcinogenicity analysis, hepatotoxicity analysis is vital in drug discovery and development efforts because liver toxicity is at the top of drug reduction. The hepatotoxicity analysis was performed using admetSAR to classify the compounds into active/ inactive or positive/negative (Morris et al. 1998). The analysis revealed that 56 out of the 105 compounds were predicted to be hepatotoxic agents with probability values between 0.525 and 0.925. Meanwhile, the other 49 compounds were predicted to be non-hepatotoxic with their probability values between 0.5 and 0.9.
The acute oral toxicity analysis for classifying compounds into four categories based on the value of 50% lethal dose (LD50), generally expressed in terms of the amount of material per unit of body weight (Lu et al. 2009), a dose or concentration of a substance/compound estimated to have caused the death of half of the individuals who receive it. The categorization of toxicity refers to the United States Environmental Protection Agency (U.S. EPA) that classifies chemicals into four categories with different toxicity levels. The four categories of acute oral toxicity are category I (danger/poison), II (warning), III (caution), and IV (non-required). The compounds falling under category I are those with LD50 ≤ 50 mg/kg, under category II are compounds with LD50 > 50 mg/kg and ≤ 500 mg/kg, under category III are compounds with LD50 > 500 mg/kg and ≤ 5,000 mg/kg, and under category IV are the compound with LD50 value > 5,000 mg/kg. In other words, compounds with category I have the highest acute oral toxicity among other categories (Lipinski et al. 2004;Ho et al. 2005). The selected 105 compounds belonged to various acute oral toxicity categories: category I (five compounds), category II (nine compounds), category III (87 compounds), and category IV (four compounds).
The docking results show that there are five best compounds for potential drugs from the total 105 compounds: isovitexin, proanthocyanidins, corilagin, and dieckol phlorofucofuroeckol A. The toxicity analysis results reveal that these five compounds are non-carcinogenic, even though they can be hepatotoxic agents with a value of < 75%. These five compounds' acute oral toxicity also indicated that they were classified as category I (isovitexin) and category III (proanthocyanidins, corilagin, dieckol, phlorofucofuroeckol A compounds).

Gibbs free energy analysis
Based on the Gibbs free energy analysis, the more negative the value resulting from molecular docking simulation, the more stable the bond between ligands and receptors. Therefore, the ligands with a stable negative energy value against the three receptors ACE-2, M pro , and PL pro will be discussed. The ligands to be discussed are those with the best five Gibbs free energy values (∆G): corilagin, dieckol, phlorofucofuroeckol A, proanthocyanidins, isovitexin, with ptilidepsin and remdesivir as controls (common drugs) used as comparators (Table 3 and Suppl. material 1: Figure S1). Based on the analysis of interactions between ligands and receptors, it can be deduced that several bonds were formed between the two, such as hydrophobic bonds, van der Waals bonds, and hydrogen bonds.
The analysis of hydrogen bonds revealed that ACE-2 receptors have interactions with metabolite compounds   with several amino acid residues, such as Gln 442, Arg 518, Ser 409, Lys 441, Arg 518, and Asp 350. As for amino acid residues Leu 287, Asn 238, Thr 199, and Leu 272, Asn 238 mostly interacted with metabolite compounds on M pro receptors. Meanwhile, amino acid residues Thr 75, Leu 76, Asp 77, Gln 175, and Asn 157 are the most residues that interact with metabolite compounds (Table 4). The longest hydrogen bond was detected for Leu 287 at the M pro receptor, with a length of 3.67 Å, and the shortest for Thr 169 at the same receptor, with a length 2.70 Å. Meanwhile, the conformation, interaction, and orientation of dieckol and ptylidepsin were illustrated in Figure 4. Dieckol is in the receptor-binding site region, allowing dieckol to have the potential as an inhibitor between the receptor and SARS-CoV-2 spike protein, similar for the ptylidepsin control.

Conclusion
Through molecular docking, toxicity, ADME, and Lipinski, virtual screening has been successfully performed on 102 secondary metabolites compounds against three important receptors in SARS CoV-2; ACE-2, M pro , PL pro . The analysis obtained that corilagin, dieckol, phlorofucofuroeckol A, proanthocyanidins, and isovitexin can inhibit all three receptors. The study of hydrogen bonds revealed that ACE-2 receptors have interactions with metabolite compounds with several amino acid residues, such as Gln 442, Arg 518, Ser 409, Lys 441, Arg 518, and Asp 350. As for amino acid residues Leu 287, Asn 238, Thr 199, and Leu 272, Asn 238 mostly interacted with metabolite compounds on M pro receptors. Meanwhile, amino acid residues Thr 75, Leu 76, Asp 77, Gln 175, and Asn 157 are the most residues that interact with metabolite in PL pro . These five compounds have conformation and orientation in the binding site receptors. We recommend the five compounds be developed as drug candidates in the treatment to decrease the growth of the SARS CoV-2 virus. Molecular docking results can be used as a scientific basis in selecting drug candidates before being done in the laboratory.