Forecasting Tobacco Sales Trends Based on Combined XGBoost-LightGBM Modeling
DOI:
https://doi.org/10.54097/wzjqpb47
Keywords:
Tobacco Sales, XGBoost-LightGBM, Combined Prediction, Coefficient of Determination.
Abstract
In today's market, consumers pay increasing attention to the safety and pricing transparency of cigarette products. This study develops multiple time-series forecasting models to predict the sales volume and revenue of five cigarette brands. By combining data preprocessing with feature engineering, the proposed framework addresses the challenges posed by extreme sales values and macroeconomic fluctuations. Because sales in some months are extremely large or extremely small, the paper applies the Savitzky-Golay filtering algorithm to smooth the data. A major innovation of the paper is that, in addition to product sales and average prices, it considers other factors such as the rate of price change and the household consumption index. To analyze how strongly each factor affects sales, the paper uses the SHAP algorithm to rank feature importance and selects the top three factors as the indicators for sales forecasting. Two efficient gradient-boosting decision tree algorithms, XGBoost and LightGBM, are used for combined prediction, with the average of their two predictions taken as the final result. The final MSE values of the two cigarette brand predictions are 0.0042 and 0.0036, and the goodness-of-fit values (coefficient of determination) are 0.9056 and 0.9028, respectively, indicating that the combined model has high accuracy.
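The smoothing and combination steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the filter window or polynomial order, so a 5-point quadratic Savitzky-Golay filter (with its standard convolution coefficients) is assumed; the sales series is invented; and the two prediction lists are hypothetical stand-ins for XGBoost and LightGBM outputs. Model training and SHAP ranking are omitted, and all function names are illustrative.

```python
# Illustrative sketch only: the data and both "model" outputs below are made up.

# Standard Savitzky-Golay convolution coefficients for a 5-point window
# and a quadratic (order 2) polynomial fit. Window and order are assumptions.
SG_COEFFS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol_smooth(values):
    """Smooth a series with a 5-point quadratic Savitzky-Golay filter.

    Each interior point is replaced by the value of the least-squares
    quadratic fitted to its 5-point window, evaluated at the window centre.
    The two points at each end are left unchanged for simplicity.
    """
    smoothed = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2 : i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG_COEFFS, window))
    return smoothed


def average_predictions(pred_a, pred_b):
    """Combine two models' predictions by taking their element-wise mean."""
    return [(a + b) / 2 for a, b in zip(pred_a, pred_b)]


def mse(actual, predicted):
    """Mean squared error between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical normalised monthly sales with one spike and one dip.
monthly_sales = [0.42, 0.45, 0.95, 0.47, 0.50, 0.52, 0.10, 0.55, 0.58, 0.60]
smoothed_sales = savgol_smooth(monthly_sales)

# Hypothetical stand-ins for the XGBoost and LightGBM predictions.
xgb_pred = [s + 0.02 for s in smoothed_sales]
lgbm_pred = [s - 0.01 for s in smoothed_sales]

combined = average_predictions(xgb_pred, lgbm_pred)
print("MSE:", mse(smoothed_sales, combined))
print("R^2:", r_squared(smoothed_sales, combined))
```

An equal-weight average is the simplest way to combine two models, and it is what the abstract describes; a weighted combination would need a held-out set to tune the weights.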
License
Copyright (c) 2025 Journal of Education, Humanities and Social Sciences

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.