Published: 2026-04-30

Web Attack Detection for SQLi and XSS Using Ensemble Learning Based on Character-Level N-Gram Features

DOI: 10.35870/ijsecs.v6i1.7193

IJSECS, Volume 6, Number 1, April 2026

Abstract

SQL Injection (SQLi) and Cross-Site Scripting (XSS) remain severe threats to web application security, particularly as attackers use increasingly sophisticated obfuscation techniques to bypass conventional detection systems. This research builds a machine learning framework that combines ensemble learning (Random Forest and XGBoost) with character-level n-gram feature extraction. Data curation reduced a dataset of 156,636 raw samples to 151,783 unique entries to improve the quality of the training data. From these entries, 10,000 character-level n-gram features were extracted so that the model can capture the structural patterns of complex and obfuscated payloads. The proposed ensemble model achieved an overall accuracy of 99.67%. Stability was assessed with 5-fold cross-validation, which yielded a mean accuracy of 99.64% and a standard deviation of 0.0003. ROC AUC scores of 1.0000 for XSS and 0.9999 for SQLi indicate near-perfect discriminative capability. Together, character-level representation and ensemble learning provide a precise and resilient approach to protecting modern web environments against evolving attacks.
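
The curation and feature-extraction steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract states only that samples were reduced to unique entries and that 10,000 character-level n-gram features were used. The n-gram range (1 to 3), raw count weighting, frequency-based feature selection, the function names, and the sample payloads below are all assumptions made for illustration. The classifier stage (Random Forest and XGBoost) is left out.

```python
from collections import Counter
from typing import Iterable


def deduplicate(samples: Iterable[tuple[str, str]]) -> list[tuple[str, str]]:
    """Drop repeated (payload, label) pairs, keeping first occurrences in order."""
    return list(dict.fromkeys(samples))


def char_ngrams(text: str, n_min: int = 1, n_max: int = 3) -> list[str]:
    """Return every character n-gram of length n_min..n_max found in text."""
    return [
        text[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(text) - n + 1)
    ]


def build_vocabulary(
    payloads: Iterable[str],
    max_features: int = 10_000,  # feature count taken from the abstract
    n_min: int = 1,              # assumed n-gram range, not from the source
    n_max: int = 3,
) -> dict[str, int]:
    """Map the max_features most frequent n-grams to column indices."""
    counts: Counter[str] = Counter()
    for payload in payloads:
        counts.update(char_ngrams(payload, n_min, n_max))
    most_common = counts.most_common(max_features)
    return {gram: idx for idx, (gram, _) in enumerate(most_common)}


def vectorize(
    payload: str, vocab: dict[str, int], n_min: int = 1, n_max: int = 3
) -> list[int]:
    """Count how often each vocabulary n-gram occurs in one payload."""
    row = [0] * len(vocab)
    for gram in char_ngrams(payload, n_min, n_max):
        idx = vocab.get(gram)
        if idx is not None:  # n-grams outside the vocabulary are ignored
            row[idx] += 1
    return row


# Hypothetical samples; the real dataset is not reproduced here.
raw = [
    ("' OR 1=1 --", "sqli"),
    ("<script>alert(1)</script>", "xss"),
    ("' OR 1=1 --", "sqli"),  # exact duplicate, removed by deduplicate()
    ("hello world", "benign"),
]
unique = deduplicate(raw)
vocab = build_vocabulary(payload for payload, _ in unique)
features = [vectorize(payload, vocab) for payload, _ in unique]
```

Working at the character level means that fragments such as `<sc` or `1=1` become features in their own right, which is one plausible reason this representation copes with obfuscated payloads that defeat word-level tokenization. Each row in `features` could then be passed to any classifier.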

Keywords

Character-level N-gram; Ensemble Learning; SQL Injection; Web Security; XSS

Peer Review Process

This article has undergone a double-blind peer review process to ensure quality and impartiality.