A Case Study of the Influence of Multifarious Factors on Traffic Flow Forecasting

Authors

DOI:

https://doi.org/10.65455/sbd3t486

Keywords:

Traffic Flow Forecasting, Multi-Source Data Fusion, Principal Component Analysis, Random Forest, SHapley Additive exPlanations

Abstract

Accurate traffic flow forecasting underpins efficient traffic management and resource allocation in intelligent transportation systems and is important for alleviating congestion in modern cities. Because urban traffic flow is shaped by the interplay of heterogeneous factors such as meteorological conditions, temporal cycles, and unforeseen events, the data exhibit pronounced nonlinearity and random fluctuations, which makes high-precision forecasting challenging. Mainstream deep learning models have improved prediction accuracy, but they struggle to quantify the contribution of individual factors and often overlook multicollinearity among multi-source data. To address these challenges, this paper makes three contributions. First, it constructs a fused dataset that incorporates multidimensional external factors such as meteorological conditions and events. Second, it proposes an explainable prediction framework in which Principal Component Analysis (PCA) is applied to the raw features before Random Forest regression. Third, it introduces SHAP (SHapley Additive exPlanations), a game-theoretic method, to attribute prediction results transparently. PCA is first used to extract principal components from the high-dimensional multi-source factors; because the components are mutually uncorrelated, this removes multicollinearity from the model inputs. A random forest regression model is then trained on the components. The framework is validated on four independent annual datasets from the Beijing TaxiBJ collection. The model's coefficient of determination (R²) stays above 0.85 on all four annual datasets, indicating accurate predictions that remain robust across temporal cycles.
SHAP analysis further reveals a stable decision mechanism in which the first principal component drives the periodic pattern and the subsequent components provide dynamic fine-tuning, so the framework combines high prediction accuracy with strong interpretability.
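
The game-theoretic attribution that SHAP builds on can be illustrated with a minimal sketch. The code below computes exact Shapley values for a single prediction by enumerating every feature coalition. It is not the authors' implementation: `toy_model`, its coefficients, the three inputs standing in for principal components, and the all-zero baseline are hypothetical, and practical SHAP tooling for tree ensembles relies on faster algorithms than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        # Evaluate the model with only the coalition's features "switched on".
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when it joins coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical stand-in for a trained regressor on three components:
    # a dominant first component plus small corrections from the others.
    pc1, pc2, pc3 = z
    return 100.0 + 40.0 * pc1 + 5.0 * pc2 + 2.0 * pc2 * pc3


x = [1.5, -0.5, 2.0]          # hypothetical component scores for one time slot
baseline = [0.0, 0.0, 0.0]    # hypothetical reference input
phi = shapley_values(toy_model, x, baseline)

for name, contribution in zip(["PC1", "PC2", "PC3"], phi):
    print(f"{name}: {contribution:+.2f}")
print("sum of contributions:", sum(phi))
print("prediction minus baseline:", toy_model(x) - toy_model(baseline))
```

The contributions sum to the difference between the prediction and the baseline prediction, which is the additivity property that makes this kind of attribution readable. In this toy setup the first input receives by far the largest share, loosely mirroring the "first component drives, later components fine-tune" pattern described above.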

Published

2026-06-10

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.