Research Article |
Corresponding author: Fredmoore Orosco ( orosco.fredmoore@gmail.com ) Academic editor: Emilio Mateev
© 2024 Lauren Emily Fajardo, Mark Andrian Macalalad, Nyzar Mabeth Odchimar, John Christian de Guzman, Fredmoore Orosco.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Fajardo LE, Macalalad MA, Odchimar NM, de Guzman JC, Orosco F (2024) Computational design and validation of siRNA molecules to silence oncogenic CTNNB1 mRNA as a potential therapeutic strategy against hepatitis B/C virus-associated hepatocellular carcinoma. Pharmacia 71: 1-14. https://doi.org/10.3897/pharmacia.71.e127981
|
The majority of hepatocellular carcinoma cases are caused by infection with hepatitis B (HBV) or C (HCV) viruses. CTNNB1 is the most mutated oncogene in HBV- and HCV-associated tumors. CTNNB1 mutations can lead to β-catenin accumulation, resulting in tumor progression. Small interfering RNAs (siRNAs) can be used to silence CTNNB1 mRNA. After prediction and evaluation, four siRNAs were found to have the highest silencing potential. All four siRNAs had an acceptable GC content, no palindromic sequences, no off-targets, were thermostable, and had accessible target sites. Molecular docking of the siRNAs to Argonaute 2 demonstrated favorable docking scores within the binding pocket for three siRNAs. Molecular dynamics simulations and binding energy calculations demonstrated that the siRNAs steadily remained in the binding pocket. In this study, three siRNAs were successfully designed to silence oncogenic CTNNB1 mRNA as a therapeutic strategy against hepatocellular carcinoma and warrant further in vitro and in vivo validation.
Argonaute 2 (Ago2), molecular docking, molecular dynamics, small interfering RNA (siRNA), RNA interference (RNAi)
Hepatocellular carcinoma (HCC) is the most common type of primary liver cancer and accounts for approximately 90% of all cases (
Somatic mutations in various oncogenes and tumor suppressor genes are considered drivers of HCC development. Some of these mutations may be caused by the integration of HBV DNA into the host genome (
Graphical representation of how HBV/HCV-induced CTNNB1 mutations drive HCC progression. HBV/HCV infection can introduce somatic mutations in exon 3 of CTNNB1, impairing the phosphorylation site and subsequently preventing the degradation of β-catenin, allowing aberrant activation of transcription factors, and driving HCC progression. hepatitis B virus; HCV, hepatitis C virus; TCF/LEF, T-cell factor/lymphoid enhancer factor; HCC, hepatocellular carcinoma.
RNA interference (RNAi) is a naturally occurring biological process in which short, double-stranded RNAs, such as microRNAs (miRNAs) and small interfering RNAs (siRNAs), recognize and degrade mRNA via post-transcriptional gene silencing. In the siRNA RNAi pathway, double-stranded RNA (dsRNA) is recognized by Dicer and cleaved into a siRNA duplex (
Graphical representation of the RNAi pathway for siRNAs. The siRNA guide strand is loaded onto Ago2, while the passenger strand is discarded. Ago2 associates with other proteins to form RISC and facilitates binding of the guide strand to a complementary mRNA sequence. The endonuclease activity of Ago2 cleaves the mRNA, thereby silencing its expression. Ago2, Argonaute 2.
The use of computational tools to predict and evaluate potential siRNA candidates allows for the efficient screening of hundreds to thousands of short RNA sequences at once. The selection of siRNAs with the highest silencing potential was successfully validated experimentally in previous studies. As such, siRNAs that were computationally designed against the MERS-CoV-orf1ab gene using siRNA prediction algorithms and evaluation of GC content, off-targets, target site accessibility, thermodynamic stability, and secondary structure formation were validated and were found to inhibit viral replication in Vero cells (
The regulation of β-catenin expression through siRNA-mediated silencing of CTNNB1 may serve as a potential therapeutic strategy against HCC. To rapidly screen and evaluate all possible sequences, this study aimed to design siRNAs targeting the coding sequence (CDS) of oncogenic CTNNB1 using computational methods. In particular, this study aimed to predict and evaluate potential siRNAs targeting CTNNB1 using known algorithms and model the behavior of the top siRNA candidates through molecular docking with Ago2 and molecular dynamics simulation.
Thirty-nine (39) transcript variants of the CDS of CTNNB1 were retrieved from Ensembl (
The conserved regions of the CTNNB1 CDS were used to predict potential siRNAs in siDirect 2.1 (
Algorithms used in siDirect 2.1. To be predicted as a potential siRNA, an RNA sequence must satisfy all four Ui-Tei rules, at least six Reynolds rules, or at least three Amarzguioui rules.
Ui-Tei | Reynolds | Amarzguioui |
---|---|---|
1. A/U at 5’ end of antisense strand | 1. 30–52% GC content | 1. A/U content differential between the terminal 3 nucleotides of both ends |
2. G/C at 5’ end of sense strand | 2. ≥3 A/U at positions 15–19 of sense strand | 2. No U at position 1 of sense strand |
3. ≥5 A/U residues in 5’ terminal one-third of antisense strand | 3. No internal repeats | 3. A at position 6 of sense strand |
4. No GC stretch ≥9 nt | 4. A at position 19 of sense strand | 4. Strong attachment of 5’ end of sense strand |
5. A at position 3 of sense strand | 5. Weak binding of 3’ end of sense strand | |
6. U at position 10 of sense strand | 6. No G at position 19 of sense strand | |
7. No G/C at position 19 of sense strand | 7. G/C at position 1 of sense strand | |
8. No G at position 13 of sense strand | 8. A/U at position 1 of sense strand |
The GC content of the siRNA duplexes and the presence of palindromic sequences in both the guide and passenger strands of each siRNA were determined using the OligoCalc server (
Target site accessibility within the target mRNA of each guide siRNA was predicted using the siRNA module of Sfold (
The secondary structure and free energy of folding of the guide strand of each siRNA were predicted using MaxExpect (
The crystal structure of human Ago2 was retrieved from the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) (PDB ID: 4Z4D) (
The tertiary structures of the guide siRNAs were modeled using BIOVIA Discovery Studio Visualizer v21.1.0.20298 based on the template miRNA bound to the crystal structure of Ago2. The template RNA sequence was mutated based on the siRNA sequence. The geometry of the siRNA was then optimized using the CHARMM27 force field.
The optimized siRNA was then docked to Ago2 using HDOCK (
Molecular dynamics (MD) simulations of the docked siRNA-Ago2 complexes were performed using GROMACS 2023.3 to assess their stability. The topology and parameter files were created using the CHARMM36 force field (
The siRNA molecules that exhibited stable behavior were further assessed for their binding affinities against Ago2. The GROMACS-compatible gmx_MMPBSA package (
ΔGbinding = ΔGvdW + ΔGelectrostatic + ΔGpolar + ΔGnonpolar (1)
The van der Waals and electrostatic energies were calculated using the Lennard-Jones potential and Coulomb’s law, respectively. The polar solvation energy was estimated using the Poisson-Boltzmann and generalized Born equations, while the nonpolar solvation energy was assumed to be linearly dependent on the solvent-accessible surface area (SASA). Only the last 10 ns, representing the stable phase of the trajectory, were investigated.
A total of 39 CTNNB1 CDS transcripts were retrieved from Ensembl and aligned using ClustalW in MEGA11. A complete list of the transcripts is shown in Suppl. material
From the CTNNB1 CDS consensus sequence, a total of 45 siRNAs were predicted in siDirect 2.1 using the Ui-Tei, Reynolds, and Amarzguioui algorithms. Each siRNA fulfilled the requirements of at least one of the three algorithms, with nine (9) siRNAs fulfilling the requirements of all three. Of the 45 initially predicted siRNAs, 27 were cross-validated using siExplorer, OligoWalk, and siPRED.
Of the 27 predicted siRNAs, 19 had a GC content within the range of 31.6–57.9%, and 16 siRNAs did not have palindromic sequences in both their guide and passenger strands, as predicted by OligoCalc. Ten (10) siRNAs did not have any off-targets as they had at least two (2) mismatches in positions 2–20 of both the guide and passenger strands to any non-targeted transcripts in BLAST. This was further confirmed in GESS as off-targets were not significant by Fisher, Bonferroni, Bonferroni step-down (Holm) correction, and Benjamini and Hochberg methods.
The target site accessibility of the remaining ten (10) siRNAs was predicted using Sfold. The target accessibility scores of the siRNAs ranged between 5 and 8, with 0 being the lowest possible score and 8 being the highest possible score. The antisense binding energy of the siRNAs ranged between -27 and -11.8 kcal/mol. The DSSE of the siRNAs ranged from 0.8 to 7.5 kcal/mol. Five (5) out of ten (10) siRNAs had duplex feature scores ranging from 7–9 and were used for further analysis. The other five (5) siRNAs had scores ranging from 5 to 6 and were eliminated. The AIS of the siRNAs ranged from -8.2 to -5.1 kcal/mol. The duplex thermodynamics score can be 0, 1, or 2, and simply assigns one (1) point for each threshold met for the DSSE and AIS scores. Since all five (5) siRNAs met both thresholds, it follows that they all had duplex thermodynamics scores of 2. Finally, the total score is the sum of the accessibility, duplex feature, and duplex thermodynamics scores, with a maximum possible score of 20 and scores greater than or equal to 12 being acceptable (
Overall, five (5) siRNAs passed all evaluations: s3, s4, s19, s22, and s28. However, since s3 and s4 targeted overlapping regions in the CTNNB1 CDS (positions 205–227 and 208–230, respectively) and designing two such siRNAs would be redundant, s3 was chosen over s4 as it scored better overall in all evaluations. Therefore, four (4) siRNAs, s3, s19, s22, and s28, were chosen as the top four siRNA candidates (Table
CTNNB1 CDS target position | alias | guide strand | seed-duplex Tm (guide) (°C) | seed-duplex Tm (passenger) (°C) | GC content (%) |
---|---|---|---|---|---|
205–227 | s3 | AAUAUCAGCUACUUGUUCUUG | 13.3 | 19.2 | 33 |
730–752 | s19 | UGUAAUGGCAUAAAACAACAC | 20.0 | 5.6 | 33 |
770–792 | s22 | UUUUAGCUCCUUCUUGAUGUA | 18.3 | 20.4 | 33 |
964–986 | s28 | UUUUUCGUAAGUAUAGGUCCU | 15.3 | 19.8 | 35 |
The secondary structures of the siRNA guide strands were predicted using MaxExpect (Fig.
The 3D structure of Ago2 was predicted using SWISS-MODEL and refined using GalaxyRefine for molecular docking and molecular dynamics with the siRNAs. From the five generated models listed in Table
Quality plots of the refined Ago2 model. A. Overall model quality plot plotting the z-score of the refined Ago2 model (black dot) against z-scores of PDB structures experimentally determined by X-ray crystallography (light blue) or nuclear magnetic resonance (NMR) spectroscopy (dark blue); B. Local model quality plot showing the knowledge-based energy at each position of the refined Ago2 model.
Model | GDT-HA | RMSD | MolProbity | Clash score | Poor rotamers | Rama favored |
---|---|---|---|---|---|---|
Initial | 1.0000 | 0.000 | 1.182 | 1.6 | 0.5 | 95.9 |
Model 1 | 0.9902 | 0.264 | 1.549 | 10.7 | 0.4 | 98.6 |
Model 2 | 0.9890 | 0.276 | 1.582 | 11.7 | 0.1 | 98.7 |
Model 3 | 0.9878 | 0.286 | 1.582 | 11.7 | 0.4 | 98.7 |
Model 4 | 0.9884 | 0.275 | 1.589 | 11.9 | 0.1 | 99.0 |
Model 5 | 0.9869 | 0.283 | 1.569 | 11.3 | 0.4 | 98.9 |
After modeling Ago2, molecular docking was performed with each siRNA using HDOCK (Table
Molecular docking results for the top four siRNA candidates. All siRNA candidates had highly negative docking scores and interacted with residues in both the PAZ and MID domains.
Alias | Interacting residues in PAZ and MID domains |
---|---|
s3 | PAZ: Y279, F294, Q297, Y311 MID: G524, K525, T526, Y529, K533, C546, Q548, N562, K566 |
s19 | PAZ: G264, R277, R280, Y311, F312, E333, K335, L339 MID: K550, N551, R554 |
s22 | PAZ: R277, K278, R280, F294, Y311, K335, Y338 MID: K525, Y529, K533, Q545, Q548, N551, N562, K566 |
s28 | PAZ: R277, R280, F294, P295, L296, Q297, Y311 MID: L522, G524, K525, T526, Y529, K533, C546, V547, Q548, T559, N562, K566 |
s0 | PAZ: K266, R277, Y279, R280, L296, Y311, H316, Q332, K335, H336, Y338 |
MID: G524, K525, T526, Y529, K533, Q545, C546, V547, Q548, N551, N562, L563, K566 |
The stability of the siRNA complexes was investigated through a 100-ns simulation to determine how the structure changed over time relative to its starting conformation [see Suppl. materials
RMSF plot of the siRNA molecules and the reference miRNA (s0). s19 displayed high RMSF values, particularly for the 1st and 21st nucleotides, which contributed to the relatively higher values in its RMSD trajectory. The remaining siRNA molecules presented RMSF plots similar to those of the reference miRNA.
The binding affinity of the four siRNA molecules to Ago2 was evaluated using both the MMPBSA and MMGBSA methods. In both methods, all four remaining siRNA molecules registered stronger binding energies (MMPBSA: -412.46 kcal/mol to -465.57 kcal/mol, MMGBSA: -313.0 kcal/mol to -379.57 kcal/mol) compared to the experimentally derived control miRNA (MMPBSA: -377.06 kcal/mol, MMGBSA: -293.69 kcal/mol), indicating a higher affinity for Ago2 (Fig.
Table
Ago2 residues with the strongest binding energy contribution. Residues highlighted in bold and italicized are the common interacting residues among all four siRNAs and the control miRNA (s0). Values are in kcal/mol.
s0 | s3 | s19 | s22 | s28 | |||||
---|---|---|---|---|---|---|---|---|---|
R68 | -7.081 | R710 | -6.366 | R795 | -5.737 | R710 | -7.028 | R68 | -7.501 |
R179 | -4.425 | R68 | -5.959 | R351 | -5.720 | R68 | -5.869 | R761 | -5.181 |
R792 | -4.189 | R761 | -5.332 | R68 | -5.422 | R351 | -4.988 | P67 | -5.010 |
R280 | -4.126 | P67 | -5.034 | R126 | -5.033 | R277 | -4.742 | R710 | -4.767 |
R277 | -3.792 | R792 | -4.407 | I756 | -4.751 | R795 | -4.682 | R126 | -4.189 |
Q757 | -3.515 | R795 | -4.280 | R710 | -4.217 | R179 | -4.584 | Y279 | -3.812 |
R761 | -3.420 | R126 | -4.255 | K65 | -4.169 | R792 | -4.572 | R179 | -3.663 |
K124 | -3.065 | Y279 | -4.241 | R315 | -3.648 | R761 | -3.651 | I756 | -3.616 |
R795 | -3.062 | Y529 | -4.025 | I365 | -3.426 | P67 | -3.637 | R792 | -3.593 |
Y529 | -3.014 | R280 | -3.813 | R179 | -3.358 | R280 | -3.612 | R635 | -3.520 |
G123 | -2.987 | K124 | -3.807 | K124 | -2.933 | Y529 | -3.364 | I365 | -3.348 |
G178 | -2.870 | R97 | -3.677 | R277 | -2.903 | I756 | -3.080 | R795 | -3.126 |
R351 | -2.811 | I365 | -3.621 | R792 | -2.774 | Q757 | -3.068 | Q757 | -3.074 |
R710 | -2.775 | R179 | -3.597 | G178 | -2.552 | F294 | -3.027 | Y529 | -3.036 |
The use of computational tools to efficiently predict and evaluate potential siRNAs has been previously employed in targeting various genes, whether within the host or the pathogen, for the treatment of several diseases. This study offers a novel, more stringent, and broadly targeting screening method for predicting and evaluating siRNAs that are highly likely to have high silencing efficacy in experimental setups. The consideration of several transcript variants of CTNNB1 instead of just the most common ones guarantees that the siRNAs will be able to treat a larger cohort of patients. Furthermore, extensive molecular docking and molecular dynamics analyses of the siRNA candidates with Ago2 and comparison to an experimentally derived control offer more extensive insights than prior siRNA design studies.
To facilitate knockdown of β-catenin, siRNAs were designed around the coding region of CTNNB1. The CDS of CTNNB1 was chosen as the region to design siRNAs as it is retained in the mature mRNA and is the region that is translated into β-catenin (
Different siRNA prediction servers implement different algorithms that consider various parameters that affect the siRNA efficacy. A total of 45 siRNAs were initially predicted using siDirect 2.1. siDirect 2.1 uses the Ui-Tei, Reynolds, and Amarzguioui algorithms (
The 27 predicted siRNAs underwent evaluation for GC content, presence of palindromic sequences, presence of off-targets, secondary structure formation, target site accessibility, and thermostability. The GC content and presence of palindromic sequences were checked using OligoCalc. A high GC content has been suggested to inhibit the unwinding of the siRNA duplex, preventing RISC loading (
Target site accessibility was assessed using Sfold. Sfold returns various scores that assess the functionality of the siRNA throughout the entire RNAi pathway, from loading onto RISC to binding and cleavage of the target mRNA. A siRNA that meets the thresholds for all scores is highly likely to be functional. The target accessibility score is based on the antisense (guide) siRNA binding energy, which is the weighted sum of the RNA/RNA stacking energies of the mRNA-siRNA duplex. The more negative the antisense siRNA binding energy, the higher the score. According to the target accessibility rule, the antisense siRNA binding energy must be less than or equal to -10 kcal/mol. The AIS values of the candidate siRNAs indicate that the mRNA-siRNA duplexes were predicted to have favorable stacking interactions.
To silence the intended target site, the correct siRNA strand must first be loaded onto RISC. The DSSE is the difference in stability between the 5’ end of the guide strand and the 5’ end of the passenger strand. The DSSE must be greater than 0 kcal/mol (i.e., the 5’ end of the guide strand is less stable than the 5’ end of the passenger strand), as RISC forms a complex with the strand with the less stable 5’ end (
After ensuring that the guide strand is loaded onto RISC, its ability to bind to the target mRNA so that Ago2 can be correctly positioned to perform cleavage activity is assessed. The AIS is the average of internal stability values for positions 9–14 of the guide strand and should be greater than or equal to -8.6 kcal/mol.
For the remaining four (4) siRNA candidates, the secondary structures of the siRNA guide strands, and the mRNA-siRNA duplexes were predicted. A positive free energy of folding for siRNA guide strands indicates higher binding efficiency as the molecule is less prone to secondary structure formation (
The tertiary structures of the siRNAs and Ago2 were modeled in preparation for molecular docking. The siRNAs were modeled by mutating the bases of the template miRNA so as to retain the conformation of the small RNA when it is within the binding pocket of Ago2. The tertiary structure of Ago2 was predicted and refined. GalaxyRefine returns various scores that describe the refined models with respect to the initial model. The global distance test-high accuracy (GDT-HA) and root mean square deviation (RMSD) are measures of similarity between the positions of the alpha carbons of two protein structures with the same amino acid sequence. GDT-HA calculates the proportion of alpha carbons located within a specified proximity to one another (
ProSA-web assesses the overall and local model qualities of the protein structure. The overall model quality plots the z-score of the query structure against that of the experimentally derived protein structures arranged by size. A good-quality structure is expected to have a z-score similar to that of protein structures of similar sizes. The local model quality plots the knowledge-based energy of the protein structure as a function of amino acid sequence position, where negative values indicate good quality regions (
Out of the four (4) candidate siRNAs, three (3) docked to Ago2 in the expected orientation. These three siRNAs had common interacting residues within the MID and PAZ domains. Y529 in particular is highly conserved and serves as an insertion site for the phosphorylated 5’-end of the guide strand. This interaction is critical for the binding and cleavage activity of the guide strand. Meanwhile, the non-specific anchoring of the hydroxylated 3’ overhang of the guide strand to the PAZ domain is crucial for preventing its degradation (
The docked complexes of the four siRNAs underwent 100-ns molecular dynamics simulations. The results of the MD simulations showed that s3 and s22 behaved in a manner more similar to the reference miRNA, while s19 and s28 displayed more erratic fluctuations. The relatively higher RMSD values of the s28-Ago2 complex were most likely caused by the random movements of the nondescript loops of Ago2 interacting with the mobile water molecules. Meanwhile, the high RMSD and RMSF values of the s19-Ago2 complex are likely attributed to the incorrect binding pose of s19, where the 5’ end docked against the PAZ domain and the 3’ end docked against the MID domain. The interaction between the mismatched Ago2 domains and the ends of the siRNA might have generated repulsive forces that weakened the binding affinity and stability of the Ago2-s19 complex compared to the other siRNA complexes. Furthermore, the constant movement of s19 may have hindered its ability to establish a consistent number of hydrogen bond interactions with the protein. Despite its stronger affinity compared to the reference miRNA, as shown by the binding energy calculations, the incorrect orientation of s19 against Ago2 renders it unsuitable as a viable siRNA to silence CTNNB1.
It is interesting to note that the majority of the residues with the strongest binding energy contribution are arginine, a positively charged amino acid. Arginine residues in the PAZ, MID, and PIWI domains are known to significantly interact with siRNAs (
These results show that in silico methods can be employed for the rational design of siRNAs targeting oncogenic CTNNB1. The computational design of therapeutic siRNAs has previously been employed to target wild-type or mutant human genes to combat various diseases, such as mutant tardbp against amyotrophic lateral sclerosis (
In this study, siRNAs were predicted from the CDS of CTNNB1 and evaluated for their silencing capability. The top four siRNA candidates underwent molecular docking and molecular dynamics simulation. s3, s22, and s28 demonstrated stability similar to the reference miRNA (s0), while s19 exhibited relatively weaker stability due to its incorrect binding orientation. Further analysis using MMPBSA and MMGBSA revealed that all four siRNA molecules exhibited stronger binding affinity towards the Ago2 protein compared to the miRNA, underscoring their potential for silencing CTNNB1. Although s19 showed promising binding energy scores, its mismatched binding pose indicates its unsuitability as a siRNA molecule, prompting the need for additional investigation. Additionally, we identified five arginine residues (R68, R179, R710, R792, and R795) with the highest binding energy contribution to siRNA binding. These specific residues hold potential as crucial targets for the optimization of siRNA design. Considering all evaluations, three (3) siRNAs were successfully designed to silence oncogenic CTNNB1, as demonstrated by their favorable properties and stable complex formation with Ago2. Although the results warrant further in vitro and in vivo validation, the current findings provide valuable insights and a strong foundation for further research.
The authors would like to thank the Philippine Council for Agriculture, Aquatic, and Natural Resources Research and Development (PCAARRD), which funded the project under the Virology and Vaccine Research and Development Program. The authors would also like to thank the Department of Science and Technology—S&T Fellows Program, the Department of Science and Technology—Science Education Institute (DOST-SEI) Career Incentive Program, the Department of Science and Technology—Industrial Technology Development Institute (DOST-ITDI), and the Department of Science and Technology—Advanced Science and Technology Institute (DOST-ASTI) Computing and Archiving Research Environment (COARE) for their support of this research.
Supplementary information
Data type: pdf
Explanation note: table S1. CTNNB1 CDS transcript variants retrieved from Ensembl. table S2. Detailed Sfold results of top four siRNA candidates. fig. S1. Concentration plots (top) and heat capacity plots (bottom) of mRNA-siRNA duplexes of top four siRNA candidates. (A) s3, (B) s19, (C) s22, (D) s28. fig. S2. ERRAT plot of the predicted Ago2 tertiary structure showing error value for each residue. fig. S3. Ramachandran plot of the refined Ago2 model. fig. S3. Ramachandran plot of the refined Ago2 model.
s3-Ago2 MD simulation
Data type: mov
s19-Ago2 MD simulation
Data type: mov
s22-Ago2 MD simulation
Data type: mov
s28-Ago2 MD simulation
Data type: mov