Statistical forecasting in complex contexts: sales under promotions in large-scale retail trade

Using ANN, RBF, SVM

**The Project**

This project is concerned with forecasting the sales of a given commodity in a large-scale retail store. In the past, store managers relied on their experience to predict daily sales and to decide resupply quantities. More recently, with the development of computer-aided decision making, the use of mathematical methods has become more widespread. In the 1970s and 1980s the principal methods were statistical ones based on autoregressive time series models, such as the ARIMA method, the Box-Jenkins method and Winters' exponential smoothing method. These methods produce the forecast by processing past samples of the same time series that one wants to forecast. If we regard the forecast as the output of the process, we can say that for these methods the input and the output belong to the same time series.
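To make the "input and output belong to the same series" idea concrete, here is a minimal sketch of simple exponential smoothing, the most basic member of the exponential smoothing family (Winters' method additionally models trend and seasonality, which this sketch omits). The function name, the smoothing weight `alpha` and the sales figures are illustrative assumptions, not values from the project.

```python
def exponential_smoothing_forecast(series, alpha=0.3):
    """One-step-ahead forecast by simple exponential smoothing.

    The only input is the past of the series being forecast.
    `alpha` (0 < alpha <= 1) is an illustrative smoothing weight.
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    level = series[0]  # initialise the level with the first observation
    for observed in series[1:]:
        # New level: weighted average of the latest observation and the old level.
        level = alpha * observed + (1 - alpha) * level
    return level


# Hypothetical daily sales of one item (units sold), for illustration only.
daily_sales = [20, 22, 19, 25, 24]
forecast = exponential_smoothing_forecast(daily_sales, alpha=0.5)
```

Note that nothing other than past sales enters the computation: information such as price or promotions cannot influence this kind of forecast.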

Since the 1990s, Artificial Neural Networks (ANN) have also been employed for forecasting applications. A neural network bases its prediction not only on samples of the time series to be forecast, but also on samples of other input time series on which the output may depend. These input series are called attributes of the output. The basic structure of an ANN is a multi-layer network of neurons, each characterized by an activation function that depends on some parameters. Neurons are connected by weighted arcs. By properly tuning the parameters and weights, the ANN can be made to produce a forecast. An alternative characterization of the artificial neuron is obtained in terms of radial basis functions (RBF) rather than activation functions. By the end of the 1990s a different mathematical model, the Support Vector Machine (SVM), had also been developed for classification and forecasting. The analytical roots of the SVM lie in Statistical Learning Theory; the algorithmic roots of its training lie in the duality theory of Mathematical Programming. Like the ANN, the SVM produces its forecast from samples of the time series to be forecast together with samples of other input attributes, and since its introduction it has been considered a valid competitor of the ANN in the same fields of application. Multi-layer ANN, RBF ANN and SVM are machine learning tools: a training process uses given sets of input and output data so that outputs can then be forecast for input data not used in training. In all three cases the training process is performed by solving mathematical optimization problems. In this way the machine learning method provides a surrogate model of a complex, unknown phenomenon.
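The two neuron characterizations mentioned above can be contrasted with a small sketch. It assumes a logistic (sigmoid) activation for the multi-layer neuron and a Gaussian radial basis function for the RBF neuron; both are common choices, but the text does not state which functions the project actually used, and all names and parameters here are illustrative.

```python
import math


def sigmoid_neuron(inputs, weights, bias):
    """Multi-layer-style neuron: activation applied to a weighted sum of inputs."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_sum))


def rbf_neuron(inputs, center, width):
    """RBF-style neuron: response depends on the distance of the input from a center."""
    squared_distance = sum((x - c) ** 2 for x, c in zip(inputs, center))
    return math.exp(-squared_distance / (2.0 * width ** 2))


# Hypothetical two-attribute input, for illustration only.
x = [0.5, 1.0]
activation = sigmoid_neuron(x, weights=[0.8, -0.4], bias=0.1)
response = rbf_neuron(x, center=[0.0, 1.0], width=1.0)
```

In the first case the tunable quantities are the weights and the bias; in the second they are the center and the width. Training adjusts these quantities by solving an optimization problem over the training data.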

In this project the complex phenomenon of interest is how the sales of a given commodity depend on suitable input attributes, and in particular on an irregular one: the occurrence of sales promotions. While there is a wide literature on sales forecasting, the effect of promotions seems to have been neglected, except for some aspects of consumer behavior. Yet it is of primary importance in marketing practice.

The project aims to assess the relative effectiveness of the three kinds of learning machines described above, using real sales data for a consumer good sold in two retail stores. The data cover a period of five years: four years of data were used for training and validation, and the fifth year for forecasting. The input attributes include calendar data (month, day of the month, day of the week), since sales are expected to be greater at the weekend and at the end of the month, when salaries are paid. The other input attributes are, for each day: the price of the good, the number of customers entering the store, the number of opening hours, and whether or not a promotion is in effect.
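The attribute set and the chronological split described above could be organised as in the following sketch. The numeric encoding, the record layout, the function names and all example values (including the years) are assumptions made for illustration; the text does not specify how the project encoded its inputs.

```python
from datetime import date


def build_attributes(day, price, customers, opening_hours, promotion):
    """Assemble one day's input vector from calendar and store data."""
    return [
        day.month,               # month, 1..12
        day.day,                 # day of the month, 1..31
        day.isoweekday(),        # day of the week, Monday=1 .. Sunday=7
        price,                   # price of the good on that day
        customers,               # number of customers entering the store
        opening_hours,           # number of opening hours
        1 if promotion else 0,   # promotion flag
    ]


def split_by_year(records, test_year):
    """Earlier years go to training/validation, `test_year` to forecasting."""
    train = [r for r in records if r["day"].year < test_year]
    test = [r for r in records if r["day"].year == test_year]
    return train, test


# Hypothetical records, for illustration only.
records = [
    {"day": date(2023, 12, 30), "sales": 41},
    {"day": date(2023, 12, 31), "sales": 37},
    {"day": date(2024, 1, 1), "sales": 12},
]
train, test = split_by_year(records, test_year=2024)
example = build_attributes(date(2024, 1, 1), 1.99, 850, 12, True)
```

Splitting by year rather than at random keeps the test data strictly later than the training data, which matches how a forecast would be used in practice.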

All three kinds of machine learning procedure were tested and compared, both with one another and with the traditional statistical methods. The SVM method turned out to perform best.

The figures show the forecasts for the periods January-April and September-December, with the promotion time slots highlighted.

## References

An application of Machine Learning to sales forecasting under promotions

An application of support vector machines to sales forecasting under promotions