Adaptive Structural Routing: Hybrid Chunking Architecture for RAG in Informatics Engineering Learning Documents
(Original title: Routing Struktural Adaptif: Arsitektur Hybrid Chunking untuk RAG pada Dokumen Pembelajaran Teknik Informatika)

Rafii Ahmad Fahreza
Muhammad Anwar

Abstract

Retrieval-Augmented Generation (RAG) has been studied widely, but research on adaptive chunking strategies for Indonesian-language Informatics learning documents remains very limited. This study designs a hybrid adaptive chunking system that routes each document section to a suitable chunking strategy based on structural signals detected during preprocessing. It follows a Design and Development Research (DDR) approach in three stages: document analysis, system architecture design, and validation by three experts in Informatics and Natural Language Processing (NLP). Data were collected through structured expert review instruments and scenario walkthrough sessions. The results indicate that rule-based structural detection, supported by a confidence-based fallback mechanism, can reliably distinguish heading, narrative, list, and code block sections. The study concludes that hybrid adaptive chunking plays an important role in maintaining the semantic coherence of learning materials in RAG systems. The findings contribute to research on adaptive information retrieval and to RAG design aligned with pedagogical needs in the Indonesian-language academic context. Practically, the study offers a reusable design framework for Indonesian-language technical documents and guidance for developers of educational AI systems.
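To make the routing idea in the abstract concrete, the following Python sketch shows one way rule-based structural detection with a confidence-based fallback could be wired together. It is an illustration only, not the authors' implementation: the regular expressions, the line-voting confidence score, the 0.7 threshold, the chunk sizes, and every function name (`detect`, `route`, `chunk_narrative`, `chunk_fixed`) are assumptions introduced here.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical structural signals; the article does not publish its actual rules.
HEADING_RE = re.compile(r"^(#{1,6}\s+\S|\d+(\.\d+)*\s+\S)")
LIST_RE = re.compile(r"^\s*([-*•]|\d+[.)])\s+")
CODE_RE = re.compile(r"^\s*(def |class |import |return\b)|[;{}]\s*$")


@dataclass
class Detection:
    label: str         # "heading", "narrative", "list", or "code"
    confidence: float  # share of lines that agree with the label


def classify_line(line: str) -> str:
    if LIST_RE.match(line):
        return "list"
    if line.startswith("    ") or CODE_RE.search(line):
        return "code"
    return "narrative"


def detect(section: str) -> Detection:
    lines = [ln for ln in section.splitlines() if ln.strip()]
    if not lines:
        return Detection("narrative", 0.0)
    only = lines[0].strip()
    # A single short line that looks like "2.1 Title" or "# Title" is a heading.
    if len(lines) == 1 and HEADING_RE.match(only) and not only.endswith("."):
        return Detection("heading", 1.0)
    votes = Counter(classify_line(ln) for ln in lines)
    label, count = votes.most_common(1)[0]
    return Detection(label, count / len(lines))


def chunk_narrative(text: str, max_chars: int = 200) -> list[str]:
    # Group whole sentences until the size budget would be exceeded.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks


def chunk_fixed(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    # Fallback: overlapping fixed-size windows, ignoring structure.
    text = text.strip()
    if not text:
        return []
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def route(section: str, threshold: float = 0.7) -> tuple[str, list[str]]:
    det = detect(section)
    if det.confidence < threshold:
        return "fallback", chunk_fixed(section)
    if det.label == "narrative":
        return "narrative", chunk_narrative(section)
    # Headings, lists, and code blocks are kept intact as a single chunk.
    return det.label, [section.strip()]


if __name__ == "__main__":
    for sample in ["2.1 Struktur Data",
                   "- stack\n- queue\n- tree",
                   "def f(x):\n    return x + 1",
                   "Langkah instalasi:\n- unduh paket\nx = 1;\nSelesai."]:
        strategy, chunks = route(sample)
        print(strategy, len(chunks))
```

In this sketch, a section whose lines disagree about their type (for example, prose mixed with a list item and a code line) falls below the threshold and is handed to the fixed-size fallback, while structurally clean sections keep their natural boundaries.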

Article Details

How to Cite
Fahreza, R. A., & Anwar, M. (2026). Routing Struktural Adaptif: Arsitektur Hybrid Chunking untuk RAG pada Dokumen Pembelajaran Teknik Informatika. MASALIQ, 6(3), 1356-1365. https://doi.org/10.58578/masaliq.v6i3.10134
