Molecular docking screening, dynamics simulations, ADMET, and semi-synthesis prediction of flavones and flavonols from the COCONUT database as potent bifunctional neuraminidase inhibitors

prediction of


Introduction
Influenza is one of the most terrifying widespread diseases, causing so many fatalities and hospitalizations over the world's history.This disease is an acute respiratory infection caused by influenza viruses (Peiris et al. 2012).The virus genetic diversity due to their high mutation rates leads to difficulties in the treatment.Among them, type A influenza is the most dangerous and infectious, with about a billion patients every year, and may lead to flu pandemics (Hussain et al. 2017).Consequently, there is a critical need for anti-influenza therapy that can treat newly emerging viruses or mutating strains through novel targets and action mechanisms (Barik 2012).
Influenza A viruses are classified into different variants based on the two antigenic proteins on their surface: hemagglutinin (H) and neuraminidase (N).H protein binds to the terminal sialic acid residues of oligosaccharide receptors on epithelial cells for the penetration of the virus into the cell cytoplasm.Neuraminidase then cleaves sialic acid to release the new virions off the cell surface (Bouvier and Palese 2008).The present medicines used for treating influenza A viruses mainly focus on the membrane glycoproteins (H and N proteins) and matrix-2 protein, which are important targets for viral infections (Loregian et al. 2014).Among these targets, the N protein has become an excellent target for preventing influenza infections (Jagadesh et al. 2016;Gubareva and Mohan 2022).Several N inhibitors (oseltamivir, laninamivir, peramivir, and zanamivir) have been used for the clinical treating influenza (Shobugawa et al. 2012).Still, the mutation rates of the N protein frequently emerge, leading to the appearance of drug-resistant strains (McKimm-Breschkin 2013;Chen et al. 2018).An influenza mutant, H274Y, a tyrosine mutation near oseltamivir's hydrophobic binding site, reduced susceptibility by three orders of magnitude loss of oseltamivir efficacy (Collins et al. 2008;Dharan et al. 2009;Thorlund et al. 2011;Woods et al. 2013).Additionally, a previous report suggested the potential advantage of new bifunctional N inhibitors, which had interacted with both the sialic acid binding site and the adjacent 430-cavity position (Evteev et al. 2021).Hence, there is a significant demand for the discovery of new bifunctional N inhibitors in order to improve the efficacy anti-influenza treatment in clinics.
Flavonoid drugs are a typical class of pharmacological compounds that have significantly contributed to both fundamental and medical research (Di Carlo et al. 1999).Since their efficacy and safety, flavonoid drugs have been used as potential candidates in comparison with other class drugs (Bisol et al. 2020;Alzaabi et al. 2021).Several flavonoid drugs, such as rutin and quercetin (cardiovascular chronic treatment), hesperidin (neurodegenerative diseases), and Ginkgo biloba flavonoids (antioxidants and cardiovascular) have been approved for use in clinics, demonstrating remarkable efficacy as therapeutics across multiple disciplines and biocompatibility (Lauro et al. 2002;Hajialyani et al. 2019;Li et al. 2023).Consequently, there is a rise in interest in developing flavonoid drugs, especially for flavones and flavonols, because the C2-C3 double bond is essential for biological activities (Zhao et al. 2019).These skeletons have multiple functional moieties, such as double bonds, hydroxy groups, and aromatic rings, which make it easy to modify and design to enhance their biological application and active pharmaceutical synthesis.However, a recent study indicated that natural flavonoid glycosides are poorly absorbed when using oral drugs, and they need to be de-glycosylated (Ma et al. 2014).Therefore, investigating the role of flavone/flavonol derivatives from natural sources and their synthetic products as potential N inhibitors is necessary.
The COlleCtion of Open Natural ProdUcTs (COCO-NUT) database is a free and open server without login for usage.This online database regroups all known compounds from natural products, which helps researchers easily search or perform in silico screening and various computational manipulations and applications (Sorokina et al. 2021).Many studies have used the COCONUT database for screening bioactivities, especially using flavonoids from this database (Yang et al. 2022;Lee et al. 2023;Yuan et al. 2023).Recently, several studies have revealed the role of flavonoids as potential N inhibitors (Rakers et al. 2014;Sadati et al. 2019).However, there have been fewer reports about molecular docking screening and molecular dynamics to discover new bifunctional N inhibitors of this class structure from the COCONUT database.
The current study aims to conduct in silico screening of more than 3,000 flavones and flavonols and propose the simplest procedure for the semi-synthesis of the best candidate based on absorption, distribution, metabolism, excretion, and toxicity (ADMET) predictions.The results of our research will help scientists develop strategies to semi-synthesize from natural flavonoids and increase flavonoids' bioavailability so that we can get the full benefits of flavonoids in our treatment therapy.

Ligand preparation
More than 3,000 three-dimensional (3D) structures of flavones and flavonols were downloaded from the COCONUT database website (https://coconut.naturalproducts.net/).The 3D structures of oseltamivir and laninamivir were drawn using Chem3D version 16.0 software.After that, these ligands were minimized energy using the molecular mechanics-2 (MM2) method until the root mean square (RMS) gradient value became smaller than 0.001 kcal/mol Å (El Aissouq et al. 2020).The compounds were then converted from file *.sdf to file *.pdb using OpenBabel version 2.4.1 software.The list of 2,564 flavones and 660 flavonols was provided in the Suppl material 1.

Protein preparation
The 3D structure of the N1-H274Y-oseltamivir complex (PDB ID: 3CL0) was retrieved from the Protein Data Bank (http://www.pdb.org)(Collins et al. 2008;Vavricka et al. 2011;Ha et al. 2021).This structure was removed water molecules, undesirable chains, or residues using Discovery Studio Visualizer version 2.5 software and optimized using Chimera software with 500 steps of steepest descent and 100 steps of the conjugate gradient with the AMBER ff14SB force field parameters.The grid of the catalytic active site of the protein had the following location: center (X: -27.8, Y: -53.9, Z: 8.5) and dimension (Å) (X: 18.6, Y: 18.1, Z: 17.1).

Molecular docking study
The binding affinity predictions between the ligands (flavones, flavonols, oseltamivir, or laninamivir) and N protein 3CL0 were carried out using AutoDock Vina parameters assisted with PyRx software (Trott and Olson 2010).In order to design the first screening step, 2,564 flavones, and 660 flavonols were docked randomly into the entire protein structure.After this step, the potential candidates were selected when their binding affinity predictions were equal to or lower than the binding energy of positive control (laninamivir).Then, the molecular docking between the potential candidates (659 flavones and 112 flavonols) and 3CL0 protein at the catalytic active site only was performed.After performing two steps of molecular docking screening procedures, the candidates with higher binding affinity to the active site (with several key amino acid residues Arg118, Asp151, Arg152, Arg224, Glu276, Arg292, and Arg371) were mainly focused as specific target candidates.The results of the interactions were visualized using Discovery Studio Visualizer version 2.5 and Chimera 1.16 software.

Molecular dynamics simulations
The predicted potential compounds from the molecular docking study were then performed in the molecular dynamics simulations.Firstly, the 2D structures of these candidates were drawn using the ChemDraw version 16 software, and then the 3D structures were designed and optimized using Gaussian 06 software based on the semi-empirical MP6 method (Yilmazer and Korth 2013).After that, the structures were converted to file *.pdb format using OpenBabel version 2.4.1 software.Next, these ligand-protein complexes were evaluated in molecular dynamics using NAnoscale Molecular Dynamics (NAMD) software (Phillips et al. 2005).Using the CHARMM-GUI service (CHARMM36m force field), topologies and parameter files of ligands and proteins were parameterized.The systems were solvated with TIP3P molecules of water and neutralized with 0.15 M sodium chloride ions.After performing the energy minimization algorithm, the systems were equilibrated for 100 ps.These systems were then simulated at a constant temperature of 310 K for 500 ns using the NPT ensemble.The backbone root mean square deviation (RMSD) and the number of hydrogen bonds (HBs) of these complexes were analyzed using Visual Molecular Dynamics (VMD) version 1.93 software (Humphrey et al. 1996).The molecular docking screening and dynamics simulations were successfully carried out using the hardware as follows: operating system: Windows 10; memory: 16.0 GB RAM, processor: 13 th Gen Intel Core i5-13400F 2.50 GHz, and GPU: NVIDIA Geforce RTX 3060.

Prediction of absorption, distribution, metabolism, excretion, and toxicity
Most potential candidates from the second step were predicted the absorption, distribution, metabolism, excretion, and toxicity (ADMET) values using pkCSM online server (https://biosig.lab.uq.edu.au/pkcsm/prediction) and SwissADME web tool (http://www.swissadme.ch).The tested compounds were converted to canonical SMILES files and uploaded to the pkCSM and SwissAD-ME website.The web server tools were suggested for the predicted information, and the ADMET values were chosen from these databases.

Proposed the semi-synthesis
The most potential candidate was proposed for the semi-synthesis strategy using the natural source (propolis extract) and other common reagents such as ethyl piperazine-1-carboxylate, ClCH 2 COCl, K 2 CO 3 , K 2 CO 3 , KI, dichloromethane (DCM), and acetone that assisted by simple methods (column chromatography and reflux).The semi-synthesis strategy was drawn using the ChemDraw version 16.0 software.

Molecular docking screening
To evaluate the reliability of the molecular docking screening method using AutoDock Vina parameters assisted with PyRx software, two positive controls were first carried out for the molecular docking into both the entire structure and the catalytic active site of N-resistant oseltamivir protein (PDB ID: 3CL0).The binding energy of oseltamivir for the entire and catalytic active site exhibited a value of -6.7 kcal/mol (Suppl.material 1: table S1 and fig.S1), with several key interactions as Glu119, Ile222, Arg292, and Tyr406.In contrast, laninamivir showed robust binding energy (-8.4 kcal/mol for the entire structure and -8.1 kcal/mol for the catalytic active site) in the complex with this protein.Some key interactions between laninamivir and the catalytic active site were obtained, such as Arg152, Trp178, Glu227, Glu277, Arg292, Asn294, Tyr347, and Arg371.In comparison with previous studies, our results presented the binding interactions with several key amino acid residues similar to those of different methods for molecular docking (AMBER10 software or Discovery Studio 4.0/CDOCKER) (Meeprasert et al. 2012;Ha et al. 2020).These results confirmed that the N1-H274Y-oseltamivir protein (PDB ID: 3CL0) had low affinity to oseltamivir while strongly bound by laninamivir.From this evidence, if compounds could exhibit binding energies lower than -8.1 kcal/mol, they could become potential candidates for further studies (Fig. 1).The current research performed the molecular docking screening for 2,564 flavones and 660 flavonols from the COCONUT database.After completing the first step of molecular docking screening, only the remaining 659 flavones and 112 flavonols exhibited binding energies equal to or lower than -8.1 kcal/mol (Ki value £ 1.145 µM) (Suppl.material 1: tables S2, S3).
The second step of molecular docking screening was carried out for more than 700 candidates (659 flavones and 112 flavonols) with N1-H274Y-oseltamivir protein (PDB ID: 3CL0) at the catalytic active site only.After completing this screening step, only the remaining 145 flavones and 18 flavonols exhibited binding energies equal to or lower than -9.5 kcal/mol (Ki value £ 0.107 µM, which is ten times stronger than that of laninamivir) (Suppl.material 1: tables S4, S5) were chosen for further analysis.
Suppl.material 1: table S4 and fig.S2 illustrate the binding interactions with amino acid residues, the number and length of HBs.Several important amino acid residues (Arg118, Asp151, Arg292, and Arg371) commonly appeared when 145 flavones and 18 flavonols interacted with the catalytic active site through different types of interactions (conventional hydrogen bond, carbon-hydrogen bond, π-cation/anion, π-π stacked,…) Among 145 compounds from flavone skeletons, several potential structures, including 428, 581, 864, and 948, were selected for detailed analysis based on the binding energies, number and length of HBs, molecular weight, and hypothetical bifunctional N inhibitors.All four compounds contain a nitrogen-containing heterocyclic group attached to the flavone aglycone at the C7 position (compounds 428, 581, and 864) and C8 position (compound 948), greatly contributing to the creating of anti-N biological activity of flavones.Based on the docking results in Table 1 and Fig. 2, the attachment of the hydroxy group at the C5 position could create at least two HBs with the target protein, but compound 948 does not have to the hydroxy group, leading to the lack of strong interaction bonds between the flavone aglycone and protein.Besides nitrogen-containing heterocycles and the hydroxy group, the linker groups (such as ester, ether, urethane, or amide) also play an important role in creating a bridge between the heterocycles and the flavone aglycone.Several key interactions with  the triad of arginine residues Arg118-Arg292-Arg371 of N protein with compounds (428, 581, 864, and 948) revealed that these linker groups in bifunctional inhibitors were necessary for connecting different structural fragments to inhibit the N protein at 2 located positions [the sialic acid binding cavity (key amino acid residues: Asp151, Arg152, Trp178, Ile222, Glu227, and Glu276) and the adjacent 430-cavity (key amino acid residues: Pro326, Ile427, Thr439)] (Fig. 3) (Evteev et al. 2021).The important HBs interactions of flavone 428 with the N protein at 2 located positions were recorded as eight HBs (Val149, Asp151, Trp178, Glu227, Arg292, Tyr347, and Thr439), with the length of these bonds from 1.9 to 3.1 Å, while compound 581 showed five HBs (Arg118, Val149, Asp151, and Arg152) and the range of these HBs length 2.0-2.6 Å.Similarly, two flavones (864 and 948) also exhibited five HBs with several key amino acid residues (Arg118, Val149, Asp151, Arg292 and Trp178, Glu277, Arg292, Arg371, respectively) with the length of HBs lower than 3.0 Å.Because of strong interaction with the sialic acid binding site and the adjacent 430-cavity through interaction with a unique arginine triad of linker groups, these compounds were predicted to become suitable bifunctional N inhibitors against influenza virus strains during the occurrence of the mutations in the future.
Among 18 flavonols selected (Suppl.material 1: table S5 and fig.S3), two potential structures -162 and 218, were chosen for further analysis based on several factors (binding energies, molecular weight, number and length of HBs, and potential as bifunctional N inhibitors).Those compounds do not have the nitrogen heterocycle attachment.Compound 162 is attached to aromatic rings, while compound 218 has a carboxylic acid group and dioxane moieties linked to the B ring of the flavonol skeleton.When analyzing the interaction of these candidates with the target protein, compound 162 showed seven HBs (Arg118, Glu276, Arg292, and Arg371) with lengths ranging from 2.3 to 3.3 Å, while compound 218 exhibited only four HBs with three key amino acid residues (Tyr347, Arg371, and Ser404) with the length from 2.7 to 2.9 Å.Additionally, compound 162 showed good interactions with the unique arginine triad Arg118-Arg292-Arg371 of N protein, while compound 218 did not show binding to the arginine triad.These results demonstrated that with flavonoids containing only aglycone without the linker groups (nitrogen heterocycle or aromatic rings), they were difficult to create a bifunctional binding with the sialic acid binding cavity and the adjacent 430-cavity positions through the unique arginine triad Arg118-Arg292-Arg371.This study is the first report about the role of the linker groups (such as ester, ether, urethane, or amide) connecting nitrogen heterocycles and aromatic rings with flavones and flavonols, respectively, has potential for inhibitory sialic acid binding cavity and the adjacent 430-cavity of N protein from influenza virus strains.

Molecular dynamics simulations
Based on the above molecular docking screening results, we predicted that flavonoids (162, 428, 581, 864, and 948) have better bifunctional N inhibitory activity than flavonoid aglycone only (218).Before performing the molecular dynamics simulations, these candidates were analyzed as stable at low pH or not.Previous studies suggested that the N protein was generally stable in the wide pH range of 4.0-7.0,and this protein of pandemic viruses was especially stable at low pH (4.0-5.0)(Takahashi and Suzuki 2015;McAuley et al. 2019;Klenow et al. 2023).Five potential bifunctional N inhibitors [four flavones (428,581,864,948) and a flavonol (162)] were stable with a low pH range of 4.0-6.5 using the Avagadro software, except compound 218 was stable at pH lower than 4.These results predicted that these stable compounds at low pH could be useful for treating virus infections in a pandemic.Hence, among potent bifunctional N inhibitors, the three potential candidates (428, 581, and 864) were chosen for further molecular dynamic simulations due to the nitrogen heterocyclic substituents at position C7 on the flavone aglycone.Those three compounds exhibited RMSD values within a limit of less than 2Å, which was similar to that of the target protein only (Fig. 4).So, the preferred crystallographic binding orientation was a good solution (Ramírez and Caballero 2018).
HBs play an important role in the stabilization of N protein and ligand complexes (Williams and Ladbury 2003).In this study, the percentages of HBs occupancy during 500 ns of the molecular dynamics simulation step were also evaluated.As shown in Table 2, the highest percentage of HBs occupancy between compound 428 and N protein was obtained for Glu227 and Asp151 amino acids; meanwhile, for compounds 581 and 864, it was Asp151 amino acid.In particular, compound 864 showed the RMSD values fluctuating around 1.5Å, the number of HBs from 3-4 bonds in the range from 0 to 230 ns, and gradually decreased after the period of 230-500 ns.The percentages of HBs' occupancy appeared for the triad of arginine residues (Arg118-Arg292-Arg371).This result suggested that compound 864 could have better interactions with the sialic acid binding cavity and the adjacent 430-cavity positions than other candidates (Kenny et al. 2016).

ADMET study
After performing the molecular docking screening, dynamics simulations, and predicting the partition coefficient (LogP) values (Table 1), compound 864 was suggested as the most potential candidate (Landry and Crawford 2019).This structure then evaluated the "drug-likeness" properties using Swiss ADME and pkCSM web service.As shown in Table 3, several parameters of compound 864 followed Lipinski's rule, such as molecular weight less than 500 Da, logP value from 0 to 3, and number of HBs donors and acceptors less than 5 and 10, respectively.Besides, the topological polar surface area (TPSA), also known as a good predictor for drug permeability into the cell membranes, was less than 140 Å, suggesting that compound 864 could become the best anti-N candidate in our research (Doak et al. 2014).
ADMET predictions of chemicals have significant roles in drug design and development (Guan et al. 2019).The ADMET properties of compound 864 were then analyzed based on several indicators, including aqueous solubility (ranges from -6.5 to 0.5); human intestinal absorption; blood-brain barrier permeability (logBB > 0.3 considered to cross the blood-brain barrier); P-glycoprotein substrate; total clearance; bioavailability score (from 0.85 if PSA less than 75 Å, to 0.56 if PSA from 75 to 150 Å, to 0.11 if PSA greater than 150 Å) (Martin 2005); AMES toxicity; max tolerated dose (low dose if less than or equal to 0.477, high dose if greater than 0.477); hERG II inhibitor; oral rat acute and chronic toxicity; skin sensitization; Tetrahymena pyriformis toxicity; minnow toxicity; and Lipinski's rule violations (Table 3).These data suggested that compound 864 could be absorbed into the intestinal,    uncross the blood-brain barrier, cytochrome P450 inhibitors, moderate total clearance, and non-AMES toxicity.

Proposed the semi-synthesis
Flavones are known as an essential core structure for drug design (Boniface and Elizabeth 2019).This study is the first report of the proposed semi-synthesis of compound 864, an N inhibitor, from natural sources.A previous study suggested that the propolis 70% ethanol extract contained chrysin about 18.64 mg.g −1 (Stompor-Gorący et al. 2021).A huge amount of chrysin could be easily isolated from the propolis, so this natural source was suggested for starting materials in the design strategy for target compound 864.Based on the previous report about the synthesis scheme for other targets but having a similar design strategy (Wang et al. 2021), our study suggested a design scheme for compound 864 in Fig. 5. Chrysin was isolated from the propolis 70% ethanol extract using the simple column chromatography method (step i).A commercial reagent, ethyl piperazine-1-carboxylate, reacted with chloroacetyl chloride and potassium carbonate in the presence of DCM solvent at cold temperature and then increased to room temperature for 8-24 hours (step ii).After that, the intermediate product was reacted with chrysin in the presence of simple reagents (potassium carbonate, potassium iodide, and acetone solvent) using the reflux method for 25-30 hours (step iii).In this procedure, another derivative could be reacted when the intermediate product was attached to C5 of the flavone skeleton.However, due to the steric effects around the C5 position, the side product of the synthesis procedure could be minor.

Conclusion
This study successfully carried out an in silico docking screening between N1-H274Y-oseltamivir protein (PDB ID: 3CL0) with over 3,000 flavones and flavonols from the COCONUT database using the Autodock Vina method.During this process, several potent N inhibitors containing linker groups (e.g., ester, ether, urethane, or amide) showed bifunctional binding with the sialic acid binding cavity and the adjacent 430-cavity positions via the interactions with arginine triad Arg118-Arg292-Arg371.These linker groups were required to connect different structural fragments in order to inhibit the N protein at two distinct locations.Furthermore, results of molecular dynamics simulations, ADMET predictions, and the semi-synthesis design strategy also demonstrated the compound 864, ethyl 4-{2-[(5-hydroxy-4-oxo-2-phenyl-4H-chromen-7yl)oxy]acetyl}piperazine-1-carboxylate, could become the effective bifunctional N inhibitors for further drug design and development, in vivo testing, and clinical trials.The results of this study could contribute to developing new types of N inhibitors from natural products to enhance the efficacy of anti-influenza therapy due to the high mutation rates of N protein in the future.

Figure 1 .
Figure 1.Procedure of molecular docking screening and molecular dynamics of flavones and flavonols from COCONUT database.

Table 3 .
ADMET predicted values of compound 864 using Swiss ADME and pkCSM web service.