Research Article
Corresponding author: Violeta Getova-Kolarova (v.getova@pharmfac.mu-sofia.bg)
Academic editor: Guenka Petrova
© 2025 Veselina Ruseva, Stanimir Dobrev, Violeta Getova-Kolarova, Anna Peneva, Ilko Getov, Maria Dimitrova, Valentina Petkova.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Ruseva V, Dobrev S, Getova-Kolarova V, Peneva A, Getov I, Dimitrova M, Petkova V (2025) In situ development of an artificial intelligence (AI) model for early detection of adverse drug reactions (ADRs) to ensure drug safety. Pharmacia 72: 1-8. https://doi.org/10.3897/pharmacia.72.e160997
Pharmacovigilance is a vital component of public health systems, aiming to ensure the safe use of medicinal products. In this study, an artificial intelligence (AI)-based model was developed using TensorFlow to predict the likelihood of adverse drug reactions (ADRs) based on molecular structure and predefined criteria. Data from DrugBank, MedDRA, and SIDER databases were extracted, integrated, and structured in a relational model. A feedforward neural network was trained using chemical and pharmacological descriptors such as SMILES and ATC codes. The model showed consistent performance in estimating ADR risk, highlighting the potential role of AI in supporting early safety assessments. This method may enhance post-marketing surveillance through more timely and data-driven risk identification. Despite certain limitations, AI-assisted modeling represents a valuable addition to pharmacovigilance and patient safety awareness strategies.
pharmacovigilance, artificial intelligence, adverse drug reactions, machine learning, neural networks
Adverse drug reactions (ADRs) represent a significant challenge for public health systems. Meta-analyses and prospective studies have shown that a substantial proportion of hospitalized patients – between 10% and 17%, according to various estimates – experience at least one ADR, with approximately 0.3% of cases resulting in fatal outcomes (
Conventional methods for the assessment of drug safety rely primarily on clinical trials and spontaneous post-marketing reporting. However, the controlled nature of clinical trials, their limited population representativeness, and relatively short duration make them insufficient for identifying rare or delayed-onset ADRs (
Among the most promising approaches are models based on deep neural networks. These algorithms are capable of processing large volumes of heterogeneous biomedical data, including molecular structures, and can detect latent patterns that are often inaccessible through traditional statistical methods (
The study presents a predictive model using a multilayer neural network to assess the risk of ADRs based solely on the chemical structure of pharmaceuticals. This method aims to enhance traditional pharmacovigilance by enabling early safety evaluations in drug development.
In parallel with structure-based models, the integration of real-world data (RWD) – such as patient registries and electronic health records (EHRs) – has gained attention as a complementary source for ADR signal detection. RWD enhances model robustness by incorporating variables that reflect clinical practice, including comorbidities, polypharmacy interactions, and treatment duration. Emerging machine learning frameworks can process large-scale EHRs and identify patient-specific ADR patterns, which supports personalized pharmacovigilance strategies (
This study aims to develop a predictive model for forecasting ADRs, focusing solely on the chemical structures of pharmaceutical compounds and employing deep learning techniques. To achieve this, we implemented a multilayer feedforward neural network in Python using the TensorFlow library.
The dataset consisted of 482 distinct drug molecules represented by SMILES notations. These were converted into numerical format using established molecular descriptors (Table
| Name | SMILES |
|---|---|
| Epinephrine | CNC[C@H](O)C1=CC(O)=C(O)C=C1 |
| Benzydamine | CN(C)CCCOC1=NN(CC2=CC=CC=C2)C2=CC=CC=C12 |
| Amlexanox | CC(C)C1=CC2=C(OC3=NC(N)=C(C=C3C2=O)C(O)=O)C=C1 |
| Famotidine | NC(N)=NC1=NC(CSCCC(N)=NS(N)(=O)=O)=CS1 |
| Nizatidine | CNC(NCCSCC1=CSC(CN(C)C)=N1)=C[N+]([O-])=O |
| Omeprazole | COC1=CC2=C(C=C1)N=C(N2)S(=O)CC1=NC=C(C)C(OC)=C1C |
| Lansoprazole | CC1=C(OCC(F)(F)F)C=CN=C1CS(=O)C1=NC2=CC=CC=C2N1 |
| Mebeverine | CCN(CCCCOC(=O)C1=CC(OC)=C(OC)C=C1)C(C)CC1=CC=C(OC)C=C1 |
| Dicyclomine | CCN(CC)CCOC(=O)C1(CCCCC1)C1CCCCC1 |
| Propantheline | CC(C)[N+](C)(CCOC(=O)C1C2=CC=CC=C2OC2=CC=CC=C12)C(C)C |
| Mepenzolate | C[N+]1(C)CCCC(C1)OC(=O)C(O)(C1=CC=CC=C1)C1=CC=CC=C1 |
| Alosetron | CN1C2=C(C3=CC=CC=C13)C(=O)N(CC1=C(C)NC=N1)CC2 |
| Pinaverium | COC1=C(OC)C=C(C[N+]2(CCOCCC3CCC4CC3C4(C)C)CCOCC2)C(Br)=C1 |
| Hyoscyamine | CN1[C@H]2CC[C@@H]1C[C@@H](C2)OC(=O)[C@H](CO)C1=CC=CC=C1 |
| Metoclopramide | CCN(CC)CCNC(=O)C1=CC(Cl)=C(N)C=C1OC |
| Domperidone | ClC1=CC2=C(C=C1)N(C1CCN(CCCN3C(=O)NC4=CC=CC=C34)CC1)C(=O)N2 |
Visual representation of SMILES and the process of molecular deconstruction. Adapted from Wu JN, Wang T, Chen Y, Tang LJ, Wu HL, Yu RQ. t-SMILES: a fragment-based molecular representation framework for de novo ligand design. Nat Commun. 2024 Jun 11;15(1): 4993. https://doi.org/10.1038/s41467-024-49388-6.
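The text does not list which molecular descriptors were used to turn SMILES strings into numbers. As a minimal sketch of what such a conversion can look like, the pure-Python function below counts a handful of token types in a SMILES string. The function name `smiles_token_counts` and the small vocabulary are illustrative assumptions, not the descriptor set used in the study; in practice a cheminformatics toolkit would normally compute richer descriptors.

```python
import re

# Hypothetical, deliberately small vocabulary; NOT the study's descriptor set.
VOCAB = ["C", "N", "O", "S", "F", "Cl", "Br", "=", "ring_closure", "branch"]

# Tokens: bracket atoms, two-letter halogens, single letters, bonds, digits, parentheses.
TOKEN_RE = re.compile(r"\[[^\]]+\]|Cl|Br|[A-Za-z]|[=#]|\d|[()]")


def smiles_token_counts(smiles):
    """Return a fixed-length list of token counts in VOCAB order."""
    counts = dict.fromkeys(VOCAB, 0)
    for tok in TOKEN_RE.findall(smiles):
        if tok.startswith("["):
            # Reduce a bracket atom such as [C@H] or [N+] to its element symbol.
            m = re.match(r"\[(Cl|Br|[A-Z][a-z]?)", tok)
            tok = m.group(1) if m else tok
        if tok.isdigit():
            tok = "ring_closure"
        elif tok == "(":
            tok = "branch"
        if tok in counts:
            counts[tok] += 1
    return [counts[k] for k in VOCAB]


# Epinephrine, as listed in the table above.
vector = smiles_token_counts("CNC[C@H](O)C1=CC(O)=C(O)C=C1")
```

For epinephrine the first three entries of `vector` are 9, 1, and 3, matching the carbon, nitrogen, and oxygen counts of its formula C9H13NO3. Note that this representation discards chirality marks and atom connectivity, so it is far coarser than a real descriptor set.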
ADR annotations were then sourced from reliable databases such as DrugBank and SIDER. The study focused on six clinically relevant ADRs across different physiological systems, including hepatotoxicity, nephrotoxicity, and photosensitivity.
In the second phase of our study, we presented the model with molecules that had not been included in training. This approach aimed to evaluate whether assumptions could be derived regarding their safety profiles based on clinically significant ADRs and possible similarities in chemical structure with well-characterized compounds. The predictive task was framed as a multi-label binary classification problem: for a given molecule, the model assesses the likelihood of each ADR’s presence or absence. The model was trained on a molecular dataset, validated through data partitioning, and externally tested with compounds excluded from training. This external cohort included both well-known medications and novel substances, allowing for a thorough evaluation of the model’s generalization.
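In a multi-label binary setup, each ADR gets its own sigmoid output, so the predicted probabilities are independent and need not sum to one. The sketch below shows this with a tiny feedforward pass and a binary cross-entropy loss written in plain Python rather than TensorFlow. The layer sizes, the weights in `PARAMS`, and the input vector are made-up assumptions for illustration; only the three ADR names come from the text.

```python
import math

# Three of the six ADRs are named in the text; the others are not listed there.
ADR_LABELS = ["hepatotoxicity", "nephrotoxicity", "photosensitivity"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(x, weights, biases):
    # weights[j] holds the input weights of output unit j.
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def forward(x, params):
    """One hidden ReLU layer, then one independent sigmoid per ADR label."""
    hidden = [max(0.0, h) for h in dense(x, params["W1"], params["b1"])]
    logits = dense(hidden, params["W2"], params["b2"])
    return [sigmoid(z) for z in logits]


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean of the per-label binary cross-entropies."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


# Hypothetical toy parameters: 2 input features, 2 hidden units, 3 outputs.
PARAMS = {
    "W1": [[1.0, 0.0], [0.0, 1.0]], "b1": [0.0, 0.0],
    "W2": [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], "b2": [0.0, 0.0, 0.0],
}

probs = forward([2.0, 0.0], PARAMS)        # one probability per ADR label
loss = binary_cross_entropy([1, 0, 0], probs)
```

Each entry of `probs` can be read on its own as the estimated likelihood of the corresponding ADR, which is how the percentage outputs reported for individual drugs would arise.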
For compounds with robust clinical documentation (well-characterized medicines), the model produced predictions largely aligned with known data. For example, in the case of the antibiotic erythromycin, the model accurately predicted the risk of hepatotoxicity – an ADR well known to be associated with this therapy (Fig.
At the same time, the model’s estimates were not uniformly accurate: it underestimated the risk of nephrotoxicity for cisplatin relative to its actual clinical incidence, while overestimating the probability of photosensitivity, a reaction that is not typically associated with the drug. These cases illustrate the model’s occasional tendency to underpredict serious risks (such as renal toxicity) or, conversely, to flag risks that are poorly documented or clinically unlikely (e.g., photosensitivity with cisplatin). Such discrepancies are often related to limitations in the training data or to atypical structural features of a compound that may mislead the model.
For experimental drugs such as ezeprogind and enadoline, which are not well described in the literature, the model predicted mostly low probabilities across all targeted ADRs (Figs
For instance, ezeprogind, currently in early-stage clinical research, has not shown serious adverse effects in limited trials, consistent with the model’s low predicted probabilities (
A summary of model outputs for representative compounds is shown in Table
| Drug | Known ADRs | Predicted ADRs | Probability (%) | Model Confidence | Comment |
|---|---|---|---|---|---|
| Erythromycin | Hepatotoxicity | Hepatotoxicity | 94 | High – aligns with established ADR | Prediction matches documented hepatotoxic profile. |
| Cisplatin | Nephrotoxicity, Hypertension | Photosensitivity, Hypertension | 88 (Photosensitivity), 67 (Hypertension) | Moderate – correct on hypertension, missed renal toxicity | Fails to capture key nephrotoxic effect, overestimates photosensitivity. |
| Ezeprogind | None (experimental) | None | <15 | Low – conservative with insufficient data | Reflects cautious model behavior in the absence of reference data. |
| Enadoline | None (experimental) | None | <10 | Low – consistent with limited clinical knowledge | No structural risk indicators detected; outcome expected for investigational drug. |
The model presented results as percentage probabilities for the occurrence of each ADR, requiring careful interpretation. We introduced confidence thresholds to help contextualize these values:
The developed model produced mostly reliable ADR predictions and demonstrates potential as a valuable tool for early-stage drug safety evaluation. Its strengths are most apparent for drugs and reactions with ample reference data, where the predictions closely align with known safety profiles. Discrepancies (such as the underestimated nephrotoxicity risk for cisplatin or overestimation of unlikely reactions) underscore the importance of high-quality, comprehensive training data. The model was trained on a limited set of compounds, which helps explain occasional inaccuracies in more unusual cases.
Expanding the training dataset in terms of size, diversity, and data accuracy would likely improve predictive performance and reduce error rates. Despite these limitations, the current results illustrate the value of such a tool in early drug development. By offering risk estimates for ADRs based solely on chemical structure, the model supports researchers and clinicians in identifying potential safety concerns before clinical trials commence (
The increasing focus on integrating AI into drug development has drawn attention from regulatory authorities such as the U.S. FDA and the European Medicines Agency (EMA). Both agencies have begun issuing guidance and recommendations on the responsible application of AI/ML technologies, particularly in preclinical safety assessments. The FDA’s 2021 Good Machine Learning Practice (GMLP) principles and EMA’s 2023 Reflection Paper emphasize transparency, traceability, and model interpretability. These principles are highly relevant to ADR prediction tools, which must demonstrate reliability, explainability, and clinical utility in order to gain regulatory acceptance (FDA 2021; EMA 2023).
In this context, the results of the current study show that the developed model partially aligns with such regulatory expectations. Overall, it demonstrated acceptable accuracy in predicting ADRs: it identified many expected reactions while producing relatively few false positives. Sensitivity across ADR categories was adequate, particularly for well-documented and commonly occurring reactions, where the model rarely failed to detect an event. Nonetheless, performance varied, especially for rare or underrepresented ADRs, where model confidence was lower and the risk of missed predictions correspondingly higher. Despite these limitations, the neural network behaved consistently in ADR prediction, in line with findings reported in other recent studies employing similar architectures (
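Sensitivity and false positives are described here qualitatively, without figures. As a sketch of how these quantities could be computed per ADR from known and predicted binary labels, consider the helper below. The function name `per_label_metrics` and the toy label matrices are assumptions made for illustration; the numbers are not results from the study.

```python
def per_label_metrics(y_true, y_pred, labels):
    """Return {label: {"sensitivity": float or None, "false_positives": int}}.

    y_true and y_pred are lists of 0/1 rows, one row per compound and one
    column per ADR label. Sensitivity is None when a label has no positives.
    """
    out = {}
    for j, name in enumerate(labels):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        sensitivity = tp / (tp + fn) if (tp + fn) else None
        out[name] = {"sensitivity": sensitivity, "false_positives": fp}
    return out


# Made-up toy labels for four compounds; NOT results from the study.
labels = ["hepatotoxicity", "nephrotoxicity"]
known = [[1, 1], [1, 0], [0, 1], [0, 0]]
predicted = [[1, 0], [1, 0], [1, 1], [0, 0]]
metrics = per_label_metrics(known, predicted, labels)
```

Reporting the metrics per label rather than pooled makes it visible when a rare ADR has low sensitivity even though the overall figures look acceptable.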
This study demonstrates that deep neural network-based models can serve as effective tools for predicting ADRs using only the chemical structure of pharmaceutical compounds. Training the model on data sourced from established databases such as DrugBank, SIDER, and MedDRA enabled the development of a functional predictive algorithm that accurately identifies known ADRs in well-characterized drugs and provides plausible risk estimates for less-studied or novel compounds.
The results support the central hypothesis of the research that artificial intelligence methods can accelerate and enhance the detection of potential toxicities, serving as a complementary asset to conventional pharmacological approaches. The model delivers quantitative risk assessments that can guide early-stage drug development, aiding in the selection of safer molecular candidates.
Nevertheless, the model has several limitations. First and foremost, the training dataset is limited in scope, both in the number of pharmaceutical compounds and in the diversity of ADR types included. This constraint may hinder the model’s generalizability and increase the risk of missed predictions for rare reactions or novel molecules (Mohsen et al. 2021).
Secondly, the structural representation of drugs using standard vectorized descriptors, while broadly applicable, fails to capture important molecular characteristics such as stereochemistry and electron density. As a result, two agents with differing toxicity profiles might be interpreted as structurally similar by the model (
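The stereochemistry point can be made concrete with a small sketch. Below, a character-count descriptor that drops chirality marks (an assumed stand-in, not the descriptors used in the study) is applied to the epinephrine SMILES from the table and to the same string with its chirality mark inverted, which denotes the opposite enantiomer.

```python
from collections import Counter


def achiral_counts(smiles):
    """Count characters after dropping chirality marks: a stereo-blind descriptor."""
    return Counter(smiles.replace("@", ""))


# Epinephrine as listed in the table, and the same SMILES with the single
# chirality mark inverted ('@' -> '@@'), i.e. the mirror-image stereocentre.
listed = "CNC[C@H](O)C1=CC(O)=C(O)C=C1"
mirror = listed.replace("@", "@@")

same_vector = achiral_counts(listed) == achiral_counts(mirror)
different_molecule = listed != mirror
```

Both flags are true: the two strings denote different stereoisomers, yet the stereo-blind descriptor assigns them identical vectors, so any model built on it must give them identical predictions.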
Thirdly, the absence of dose-dependent context represents a critical limitation. Many clinically relevant ADRs are dose-dependent and influenced by treatment duration, drug–drug interactions, and individual patient conditions. In its current form, the model evaluates risk as a binary function of structure alone, making it incapable of integrating pharmacokinetic considerations (
Despite these shortcomings, the created model offers practical value in several important areas. Firstly, its application in early drug development stages can save substantial resources by filtering out compounds with unfavorable safety profiles even before in vitro or in vivo testing (
Looking ahead, several opportunities for model enhancement are evident. Expanding the dataset to include more compounds, rare ADRs, and diverse populations would improve representativeness and generalization (
Significant advancements could also come from the use of graph neural networks (GNNs), which operate directly on molecular graphs and show strong performance in extracting structural relationships (
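To give a sense of what operating directly on a molecular graph means, the sketch below runs one round of neighbour aggregation on ethanol (SMILES `CCO`), written by hand as a graph. It is a simplification under stated assumptions: the one-hot features are hypothetical, and the learned weight matrices and nonlinearities of a real GNN layer are omitted.

```python
# Ethanol written by hand as a graph: atoms are nodes, bonds are edges.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]

# Hypothetical one-hot node features over the elements [C, O].
features = [[1.0, 0.0] if a == "C" else [0.0, 1.0] for a in atoms]


def neighbors(n_atoms, bonds):
    """Build an undirected adjacency list from the bond list."""
    adj = [[] for _ in range(n_atoms)]
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)
    return adj


def message_pass(features, adj):
    """One round: each atom's new vector is its own plus the sum of its neighbours'."""
    updated = []
    for i, own in enumerate(features):
        agg = list(own)
        for j in adj[i]:
            agg = [a + b for a, b in zip(agg, features[j])]
        updated.append(agg)
    return updated


adj = neighbors(len(atoms), bonds)
h1 = message_pass(features, adj)

# Sum-pool the atom vectors into a single molecule-level vector.
molecule_vector = [sum(col) for col in zip(*h1)]
```

After one round, the central carbon's vector already reflects that it is bonded to both a carbon and an oxygen, which is structural context that a flat descriptor vector does not encode explicitly.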
In summary, the presented model shows applicability across various aspects of drug safety, from molecular design to regulatory monitoring. These approaches are not intended to replace traditional pharmacovigilance methods but rather to complement them through systematic and proactive risk identification. The continued development and integration of such AI models into scientific and clinical workflows represent a key step toward more precise and timely drug safety assessment (Montastruc et al. 2023).
From a practical perspective, the implementation of such AI-driven systems could significantly strengthen pharmacovigilance – both during the evaluation of new substances and in post-marketing surveillance. These models can serve as adjunct tools in safety monitoring strategies, facilitating earlier signal detection and improving agile regulatory responses.
However, it is essential to emphasize that the model’s effectiveness is directly dependent on the scale and quality of the input data. Limitations such as a narrow spectrum of drug agents and the absence of dose-dependent or patient-specific context remain key challenges. Future efforts should focus on expanding the training datasets, integrating multimodal data (including pharmacokinetic and genetic information), and improving model interpretability.
Artificial intelligence is emerging as a promising instrument in the evaluation of drug safety. While it does not replace clinical judgment, laboratory data, or expert evaluation, it can significantly enhance them by enabling systematic, large-scale, and timely ADR prediction, contributing to safer therapies, more informed decisions, and better outcomes for patients.
Conflict of interest
The authors have declared that no competing interests exist.
Ethical statements
The authors declared that no clinical trials were used in the present study.
The authors declared that no experiments on humans or human tissues were performed for the present study.
The authors declared that no informed consent was obtained from the humans, donors or donors’ representatives participating in the study.
The authors declared that no experiments on animals were performed for the present study.
The authors declared that no commercially available immortalised human and animal cell lines were used in the present study.
Use of AI
No use of AI was reported.
Funding
This study is financed by the European Union-NextGenerationEU, through the National Recovery and Resilience Plan of the Republic of Bulgaria, project BG-RRP-2.004-0004-C01 “Strategic research and innovation program for development of Medical university – Sofia”.
Author contributions
All authors participated in the conduct of the study and preparation of the current manuscript, as follows: Conceptualization: VR, IG; Methodology: VR, IG, MD, VG-K; Writing – original draft preparation: VR, MD, IG; Writing – review and editing: MD, IG, VG-K, SD, AP, VP; Visualization: VR; Supervision: IG, MD, VP; Funding acquisition: MD, VP, IG, VG-K.
Author ORCIDs
Violeta Getova-Kolarova https://orcid.org/0000-0002-7103-3892
Maria Dimitrova https://orcid.org/0000-0002-4868-7775
Valentina Petkova https://orcid.org/0000-0002-6938-1054
Data availability
All of the data that support the findings of this study are available in the main text.