Published: 2026-04-01
AutoClusterAPI: A Lightweight Backend Framework for Automated Unsupervised Clustering Pipelines
DOI: 10.35870/ijsecs.v6i1.5997
Yoppy Yunhasnawa, Atif Windawati, Toga Aldila Cinderatama, Moch. Zawaruddin Abdullah, Elok Nur Hamdana
Abstract
This study presents AutoClusterAPI, a lightweight and extensible backend system designed to simplify and accelerate unsupervised clustering workflows. The system addresses a recurring problem in data analysis practice: many practitioners need rapid clustering capabilities but lack the programming or statistical background required to build complete pipelines from scratch. AutoClusterAPI provides an automated, endpoint-driven solution that allows users to perform every stage of clustering — from data loading and cleaning to feature preparation, algorithm execution, profiling, and visualization — through standard HTTP requests. The system is built using Python and the FastAPI web framework, supports eight clustering algorithms, and includes automated preprocessing alongside PCA-based visualization. Functional testing confirms that all endpoints behave correctly under both valid and invalid inputs, establishing the reliability of the system. A case study using a customer segmentation dataset further demonstrates its practical utility, showing that AutoClusterAPI can efficiently generate meaningful cluster structures and interpretable visual outputs. The system offers an accessible yet configurable environment for rapid clustering analysis and establishes a basis for future extensions and real-world deployment.
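The stages named in the abstract can be illustrated with a minimal sketch. It is not AutoClusterAPI's code: every function name here (`clean`, `standardize`, `kmeans`, `profile`, `run_pipeline`) is hypothetical, a plain-Python k-means stands in for the eight supported algorithms, and the HTTP layer and PCA visualization are left out so the example runs on the standard library alone.

```python
import math
import random
from statistics import mean, pstdev


def clean(rows):
    """Cleaning stage (assumed behavior): drop rows containing missing values."""
    return [r for r in rows if all(v is not None for v in r)]


def standardize(rows):
    """Feature preparation: scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant columns are left at 0
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def kmeans(points, k, iters=50, seed=0):
    """Algorithm execution: Lloyd's k-means, returning one label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties out
                centers[j] = [mean(col) for col in zip(*members)]
    return labels


def profile(rows, labels):
    """Profiling: cluster sizes and per-feature means on the unscaled values."""
    summary = {}
    for lab in sorted(set(labels)):
        members = [r for r, l in zip(rows, labels) if l == lab]
        summary[lab] = {
            "size": len(members),
            "means": [mean(col) for col in zip(*members)],
        }
    return summary


def run_pipeline(rows, k):
    """Chain the stages in the order the abstract lists them."""
    cleaned = clean(rows)
    labels = kmeans(standardize(cleaned), k)
    return {"labels": labels, "profile": profile(cleaned, labels)}


# Toy data: two features per record, one record with a missing value.
rows = [[25, 20], [27, 22], [None, 30], [26, 21], [60, 90], [62, 95], [58, 88]]
result = run_pipeline(rows, k=2)
print(result["profile"])
```

In an endpoint-driven design such as the one described, each of these functions would plausibly sit behind its own HTTP route, with the request body carrying the dataset reference and parameters such as `k`.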
Keywords
Clustering Pipeline; Backend Automation; FastAPI; Unsupervised Learning; PCA Visualization; Data Preprocessing; API-Driven Analytics
Peer Review Process
This article has undergone a double-blind peer review process to ensure quality and impartiality.
Article Information
This article has been peer-reviewed and published in the International Journal Software Engineering and Computer Science (IJSECS). The content is available under the terms of the Creative Commons Attribution 4.0 International License.
- Issue: Vol. 6 No. 1 (2026)
- Section: Articles
- License: CC BY 4.0
- Copyright: © 2026 Authors
Yoppy Yunhasnawa, State Polytechnic of Malang
Politeknik Negeri Malang, Malang City, East Java Province, Indonesia
Atif Windawati, Politeknik Negeri Semarang
Politeknik Negeri Semarang, Semarang City, Central Java Province, Indonesia
Toga Aldila Cinderatama, State Polytechnic of Malang
Politeknik Negeri Malang, Malang City, East Java Province, Indonesia
Moch. Zawaruddin Abdullah, State Polytechnic of Malang
Politeknik Negeri Malang, Malang City, East Java Province, Indonesia

This work is licensed under a Creative Commons Attribution 4.0 International License.