Published: 2026-04-20
Revisiting Feature Scaling in Linear Regression: An Empirical Study on Microsoft Stock Price Prediction
DOI: 10.35870/ijsecs.v6i1.6873
Farhan Mahfudz, Khoirunnisya Khoirunnisya
Abstract
Stock price prediction is central to quantitative finance because it bears directly on risk management, portfolio construction, and investment decision-making. This study evaluated the effect of feature scaling on linear regression performance in predicting Microsoft (MSFT) stock prices. A quantitative experimental design was employed, using historical MSFT stock data from 2014 to 2024. Preprocessing involved data cleaning, outlier treatment via the Interquartile Range (IQR) method, and feature standardization through Z-score normalization. Two experimental conditions were tested: linear regression without feature scaling and linear regression with feature scaling. Model performance was assessed using Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and the coefficient of determination (R²). Both conditions produced nearly identical results: R² approached 0.99, and the error metrics differed only negligibly. The evidence suggests that feature scaling does not meaningfully alter the predictive behavior of linear regression. For simple linear models operating without regularization, scaling appears to be an unnecessary preprocessing step, a finding that warrants more deliberate evaluation of preprocessing decisions in machine learning pipelines.
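The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic rather than MSFT prices, the helper names (`iqr_clip`, `zscore`, `fit_predict`, `metrics`) are hypothetical, and clipping is only one possible reading of "outlier treatment via the IQR method". The sketch relies on a standard property of ordinary least squares: with an intercept and a full-rank design, Z-score standardization is an invertible affine map of the features, so the fitted predictions should agree up to floating-point error.

```python
import numpy as np

def iqr_clip(x, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed outlier treatment)."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return np.clip(x, q1 - k * iqr, q3 + k * iqr)

def zscore(train, test):
    """Standardize both splits with statistics from the training split only."""
    mu = train.mean(axis=0)
    sigma = train.std(axis=0)
    return (train - mu) / sigma, (test - mu) / sigma

def fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with an intercept, solved via least squares."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

def metrics(y_true, y_pred):
    """MAE, RMSE, MAPE (in percent), and R² for one set of predictions."""
    err = y_true - y_pred
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAPE": float(np.mean(np.abs(err / y_true)) * 100),
        "R2": float(1 - ss_res / ss_tot),
    }

# Synthetic stand-in data (assumption): two features on very different scales.
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.normal(100.0, 20.0, n),  # price-like feature
    rng.normal(3e7, 5e6, n),     # volume-like feature
])
X = np.apply_along_axis(iqr_clip, 0, X)  # treat each column separately
y = 50.0 + 1.2 * X[:, 0] + 2e-7 * X[:, 1] + rng.normal(0.0, 1.0, n)

# Chronological 80/20 split, as is common for time-ordered data (assumption).
split = int(0.8 * n)
X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

# Condition 1: no scaling. Condition 2: Z-score standardization.
raw = metrics(y_te, fit_predict(X_tr, y_tr, X_te))
Xs_tr, Xs_te = zscore(X_tr, X_te)
scaled = metrics(y_te, fit_predict(Xs_tr, y_tr, Xs_te))

# Largest absolute difference between the two conditions across all metrics.
max_gap = max(abs(raw[name] - scaled[name]) for name in raw)
print("unscaled:", raw)
print("scaled:  ", scaled)
print("largest metric difference:", max_gap)
```

The equivalence need not hold once a penalty term is added, as in ridge or LASSO regression, because the penalty depends on coefficient magnitudes and therefore on feature scale. This matches the abstract's restriction of the finding to models without regularization.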
Keywords
Linear Regression; Feature Scaling; Stock Price Prediction
Peer Review Process
This article has undergone a double-blind peer review process to ensure quality and impartiality.
Article Information
This article has been peer-reviewed and published in the International Journal Software Engineering and Computer Science (IJSECS). The content is available under the terms of the Creative Commons Attribution 4.0 International License.
- Issue: Vol. 6 No. 1 (2026)
- Section: Articles
- License: CC BY 4.0
- Copyright: © 2026 Authors
Farhan Mahfudz, Universitas Pamulang
Department of Informatics Engineering, Universitas Pamulang, South Tangerang City, Banten Province, Indonesia
This work is licensed under a Creative Commons Attribution 4.0 International License.