Abstract
This work presents the design and validation of a semantic retrieval and natural language generation pipeline aimed at supporting molecular docking studies through compound recommendation and scientific literature contextualization. The pipeline integrates biomedical data processing, semantic indexing with FAISS, and language modeling using BioBERT and TinyLlama-1.1B-Chat to generate concise and informative responses. A curated scientific corpus was built from sources such as PubMed, and embeddings were generated to enable semantic similarity searches. The system was tested using ellagic acid as a query compound, successfully retrieving relevant literature and presenting detailed chemical information from the COCONUT database. In parallel, this research aligns with ongoing efforts from the Semill-IAS research group at Universidad del Rosario, which focuses on inclusive health technologies, particularly the use of artificial intelligence for different health contexts. The proposed solution demonstrates the potential of combining machine learning models with biomedical informatics for accessible, data-driven decision support in health and drug discovery domains.
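The retrieve-then-generate flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. A toy bag-of-words vector stands in for BioBERT embeddings, a brute-force cosine search stands in for the FAISS index, and the prompt builder only shows where a generator such as TinyLlama-1.1B-Chat would receive the retrieved context. The corpus sentences and the names `embed`, `retrieve`, and `build_prompt` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical mini-corpus; the paper's corpus was built from sources such as PubMed.
CORPUS = [
    "Ellagic acid shows antioxidant activity in vitro.",
    "Molecular docking predicts ligand binding poses against protein targets.",
    "Transformer language models generate text from a prompt.",
]


def embed(text):
    """Toy bag-of-words vector (assumption: stand-in for a BioBERT embedding)."""
    tokens = [t.strip(".,?").lower() for t in text.split()]
    return Counter(t for t in tokens if t)


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    """Brute-force top-k search (assumption: stand-in for a FAISS similarity index)."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(doc)), doc) for doc in corpus), reverse=True)
    return [doc for _, doc in scored[:k]]


def build_prompt(query, passages):
    """Assemble the context-augmented prompt that would be passed to the language model."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    question = "What is known about ellagic acid?"
    print(build_prompt(question, retrieve(question, CORPUS, k=1)))
```

In a fuller pipeline, the embedding and search functions would be replaced by a sentence encoder and a vector index, and the prompt would be sent to the generator model. The separation between retrieval and prompt assembly stays the same.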
| Original language | English (US) |
|---|---|
| Title of host publication | Artificial Intelligence and Knowledge Processing - 5th International Conference, Proceedings |
| Editors | Hemachandran Kannan, Raul Villamarin Rodriguez, Manjeet Rege, Vincenzo Piuri, Abejide Ade-Ibijola, Miguel López González de León, Imed Ben Dhaou |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 53-63 |
| Number of pages | 11 |
| ISBN (Print) | 9783032147059 |
| State | Published - 2026 |
| Event | 5th International Conference on Artificial Intelligence and Knowledge Processing, AIKP 2025 - Hyderabad, India. Duration: Oct 23 2025 → Oct 25 2025 |
Publication series
| Name | Communications in Computer and Information Science |
|---|---|
| Volume | 2804 CCIS |
| ISSN (Print) | 1865-0929 |
| ISSN (Electronic) | 1865-0937 |
Conference
| Conference | 5th International Conference on Artificial Intelligence and Knowledge Processing, AIKP 2025 |
|---|---|
| Country/Territory | India |
| City | Hyderabad |
| Period | 10/23/25 → 10/25/25 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
All Science Journal Classification (ASJC) codes
- General Computer Science
- General Mathematics
Fingerprint
Dive into the research topics of 'RAG System Application for Supporting Drug Discovery and Docking Molecular Processes'. Together they form a unique fingerprint.