Published: 2026-04-10
ETL Pipeline with DTO Normalization for IPOS Data Integration in Spring Boot
DOI: 10.35870/ijsecs.v6i1.6850
Adhi Septian Nugroho, Yeremia Alfa Susetyo
Abstract
IPOS point-of-sale software, widely used by Indonesian micro, small, and medium retail enterprises (UMKM), exports transaction data as Excel files with no enforced schema. The exports contain format-variable, multi-row receipt blocks with heterogeneous date representations, locale-dependent numeric formats, and embedded unit strings, all of which resist conventional relational import. Transforming these unstructured exports into a relational database requires a structured architectural approach capable of handling format variability, type inconsistency, and record duplication. This study designs and implements a Spring Boot-based ETL (Extract, Transform, Load) service that applies the Data Transfer Object (DTO) pattern through ten purpose-specific DTO classes covering each pipeline phase, structured within a four-layer Model-View-Controller (MVC) architecture (Controller-Service-Repository-Entity). The Extractor employs a streaming Excel reader with dynamic column-layout detection based on header keywords, producing raw String-typed ExtractedReceipt and ExtractedItem DTOs. The Transformer applies six normalization steps via four utility classes (StringNormalizer, DateParser with seven date-format patterns, NumberParser for Indonesian and Western currency formats, and a HashSet-based duplicate detector), converting raw strings into typed ValidatedReceipt and ValidatedItem DTOs with explicit error logging. The Loader performs batch inserts of 1,000 records each, using pre-loaded duplicate sets for O(1) lookup. The pipeline operates asynchronously, returning a jobId immediately while processing continues on a background thread. Functional evaluation across ten scenarios yielded a 100% pass rate, covering valid files, invalid file types, date-format heterogeneity, embedded-unit quantity strings, Indonesian numeric formats, cross-file and intra-file duplicate detection, grand-total reconciliation tolerance, and product-variation tracking. Performance observations show that files of 200–500 receipts are processed within 5–15 seconds. These results indicate that a DTO-centric, explicitly mapped ETL pipeline over Spring Boot MVC provides a maintainable, auditable, and production-ready solution for UMKM retail data integration.
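The Transformer's normalization steps can be illustrated with a minimal Java sketch. It is not the authors' implementation: the class name NormalizationSketch, the three date patterns, the separator heuristic, and the duplicate key (receipt number plus date) are all assumptions made for illustration, since the abstract does not enumerate the seven date patterns or the parsing rules actually used.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

final class NormalizationSketch {

    // Hypothetical pattern list; the paper's seven patterns are not listed in the abstract.
    private static final List<DateTimeFormatter> DATE_FORMATS = List.of(
            DateTimeFormatter.ofPattern("dd/MM/yyyy"),
            DateTimeFormatter.ofPattern("dd-MM-yyyy"),
            DateTimeFormatter.ISO_LOCAL_DATE);

    /** Tries each pattern in order; an empty result signals a value to log as an error. */
    static Optional<LocalDate> parseDate(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String text = raw.trim();
        for (DateTimeFormatter format : DATE_FORMATS) {
            try {
                return Optional.of(LocalDate.parse(text, format));
            } catch (DateTimeParseException ignored) {
                // Fall through to the next candidate pattern.
            }
        }
        return Optional.empty();
    }

    /**
     * Parses "1.250.000,50" (Indonesian) and "1,250,000.50" (Western) alike.
     * Assumed heuristic: with both separators present, the later one is the decimal mark;
     * a lone separator that repeats or is followed by exactly three digits is a
     * thousands mark. Throws NumberFormatException when no digits remain.
     */
    static BigDecimal parseAmount(String raw) {
        String s = raw.replaceAll("[^0-9.,-]", ""); // drops "Rp", spaces, unit text
        int lastDot = s.lastIndexOf('.');
        int lastComma = s.lastIndexOf(',');
        int decimalIndex;
        if (lastDot >= 0 && lastComma >= 0) {
            decimalIndex = Math.max(lastDot, lastComma);
        } else {
            int last = Math.max(lastDot, lastComma); // -1 when there is no separator
            boolean thousands = last >= 0
                    && (s.indexOf(s.charAt(last)) != last || s.length() - last - 1 == 3);
            decimalIndex = (last < 0 || thousands) ? -1 : last;
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (i == decimalIndex) {
                out.append('.');
            } else if (Character.isDigit(c) || c == '-') {
                out.append(c);
            }
        }
        return new BigDecimal(out.toString());
    }

    /** Returns true only the first time a key is seen; HashSet.add is O(1) on average. */
    static boolean isFirstOccurrence(Set<String> seenKeys, String receiptNo, LocalDate date) {
        return seenKeys.add(receiptNo.trim().toUpperCase(Locale.ROOT) + "|" + date);
    }

    public static void main(String[] args) {
        Set<String> seen = new HashSet<>();
        LocalDate date = parseDate("05/03/2024").orElseThrow();
        System.out.println(parseAmount("Rp 1.250.000,50"));
        System.out.println(parseAmount("1,250,000.50"));
        System.out.println(isFirstOccurrence(seen, "INV-001", date));
        System.out.println(isFirstOccurrence(seen, "inv-001 ", date));
    }
}
```

In this sketch the second isFirstOccurrence call returns false, because trimming and upper-casing map both receipt numbers to the same key. A single separator followed by three digits (for example "1.250") is inherently ambiguous between the two locales; the sketch treats it as a thousands mark, and a real pipeline would need a rule grounded in the actual IPOS export format.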
Keywords
Data Transfer Object; ETL Pipeline; Spring Boot; MVC Architecture; Data Normalization; IPOS; Batch Processing; Async Processing
Peer Review Process
This article has undergone a double-blind peer review process to ensure quality and impartiality.
Article Information
This article has been peer-reviewed and published in the International Journal Software Engineering and Computer Science (IJSECS). The content is available under the terms of the Creative Commons Attribution 4.0 International License.
- Issue: Vol. 6 No. 1 (2026)
- Section: Articles
- Published: 2026-04-10
- License: CC BY 4.0
- Copyright: © 2026 Authors
- DOI: 10.35870/ijsecs.v6i1.6850
Adhi Septian Nugroho, Satya Wacana Christian University
Department of Informatics Engineering, Faculty of Information Technology, Universitas Kristen Satya Wacana, Salatiga City, Central Java Province, Indonesia