The Usage of Lasso-Regression for Optimal Selection of Exogenous Variables for SARIMAX Models
Keywords: Time series analysis, Time series forecast, SARIMAX model, Lasso-regression, Exogenous variables selection
Abstract. Time series forecasting and data gap filling are important tasks in applied science. Many methods are available for filling gaps in data. However, including a large number of features can cause the model to overfit and reduce forecast quality. This paper examines the problem of minimizing forecast error in time series with exogenous variables, where the gaps are filled using the SARIMAX model. An analysis of Tver region data showed that selecting significant exogenous variables with Lasso-regression allows the forecast error to be minimized and prevents model overfitting. The results confirm that a correct choice of exogenous variables significantly improves forecast quality.
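The selection step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (not the Tver region data), the variable names `x0` to `x3`, the penalty `alpha=0.2`, and the helper functions are all hypothetical, and Lasso is solved here by plain cyclic coordinate descent on the objective (1/2n)·||y − Xb||² + alpha·||b||₁. Candidate variables whose coefficient is shrunk exactly to zero are dropped; the rest are kept as exogenous regressors.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def standardize(column):
    """Return the column rescaled to zero mean and unit variance."""
    n = len(column)
    mean = sum(column) / n
    std = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
    return [(v - mean) / std for v in column]


def lasso_coordinate_descent(columns, target, alpha, n_iter=200):
    """Minimise (1/2n)*||y - Xb||^2 + alpha*||b||_1 by cyclic coordinate descent.

    Assumes every column is standardized and the target is centred.
    """
    n = len(target)
    coefs = [0.0] * len(columns)
    residual = list(target)  # equals y - X @ coefs while all coefs are zero
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Covariance of feature j with the residual, with feature j's
            # own current contribution added back in.
            rho = sum(x * (r + x * coefs[j]) for x, r in zip(col, residual)) / n
            new_coef = soft_threshold(rho, alpha)
            delta = new_coef - coefs[j]
            if delta != 0.0:
                residual = [r - x * delta for x, r in zip(col, residual)]
                coefs[j] = new_coef
    return coefs


# Synthetic example (assumption): four candidate exogenous variables,
# of which only x0 and x2 actually drive the target series.
rng = random.Random(0)
n_obs = 300
names = ["x0", "x1", "x2", "x3"]
raw = [[rng.gauss(0.0, 1.0) for _ in range(n_obs)] for _ in names]
y = [2.0 * raw[0][t] - 1.5 * raw[2][t] + rng.gauss(0.0, 0.3) for t in range(n_obs)]

X = [standardize(col) for col in raw]
y_mean = sum(y) / n_obs
y_centred = [v - y_mean for v in y]

coefs = lasso_coordinate_descent(X, y_centred, alpha=0.2)
# Keep only the variables whose coefficient was not shrunk to zero.
selected = [name for name, c in zip(names, coefs) if c != 0.0]
print(selected)
```

In a full pipeline, the retained columns would then be supplied to the SARIMAX model as its exogenous regressors, for example through the `exog` argument of the `SARIMAX` class in statsmodels, assuming that library is used. The penalty strength would normally be tuned, for instance by cross-validation, rather than fixed by hand as it is here.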