Using Facebook’s Prophet library to forecast the FTSE 100 index.
By Dan Lantos
This article (part of a short series) aims to introduce the Prophet library, discuss it at a high level and run through a basic example of forecasting the FTSE 100 index. Future articles will discuss exactly how Prophet achieves its results, how to interpret the output and how to improve the model.
Please see this article (by my talented colleague Gavita) for an introduction to time-series forecasting algorithms.
Prophet is an open-source time-series forecasting library developed by Facebook’s Core Data Science team.
The standard (and simplest) implementation uses a univariate model, where only one variable, time, is used to forecast results.
The forecast is built as the sum of several components:
y(t) = g(t) + s(t) + h(t) + ε
where:
y(t) is the target variable, the value that is being predicted.
g(t) is the trend term, which uses one of two models: “nonlinear, saturating growth” or “linear trend with changepoints”.
s(t) is the seasonality term, which varies with the periodicity of the data (daily, weekly and yearly seasonalities).
h(t) is the holidays term. Prophet allows for custom holidays (and ranges either side) that may impact the model.
ε is our error term, which is assumed to be normally distributed.
Don’t worry too much about these terms for now. They will be covered in more depth in future articles, but a high-level understanding is always helpful.
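To make the additive structure concrete, here is a toy sketch that builds a synthetic daily series from a trend, a seasonality, a holiday effect and Gaussian noise. It is not how Prophet fits anything: every function name and number below is invented for illustration, and the single sine waves are a simplification of Prophet's seasonality handling.

```python
import math
import random


def trend(t):
    """g(t): a plain linear trend (hypothetical level and slope)."""
    return 7000.0 + 0.5 * t


def seasonality(t):
    """s(t): one yearly and one weekly sine wave (hypothetical amplitudes)."""
    yearly = 100.0 * math.sin(2 * math.pi * t / 365.25)
    weekly = 20.0 * math.sin(2 * math.pi * t / 7)
    return yearly + weekly


def holidays(t, holiday_days=frozenset({358, 359})):
    """h(t): a fixed shift on the listed days of the year, zero elsewhere."""
    return -50.0 if t % 365 in holiday_days else 0.0


def simulate(n_days, noise_sd=10.0, seed=0):
    """y(t) = g(t) + s(t) + h(t) + noise, for t = 0 .. n_days - 1."""
    rng = random.Random(seed)
    return [
        trend(t) + seasonality(t) + holidays(t) + rng.gauss(0.0, noise_sd)
        for t in range(n_days)
    ]


series = simulate(730)  # two years of synthetic daily values
```

Forecasting is the reverse problem: given only the observed series, estimate each component and project them forward.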
One of the big benefits of Prophet is the minimal setup. All we require to use Prophet is a pandas dataframe with two columns: “ds”, our datestamp, and “y”, our target variable.
Below is a code block sorting out the initial config, fetching the FTSE 100 ticker information from yfinance and plotting the dataset.
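A minimal sketch of this step might look like the following. It assumes the Yahoo Finance symbol `^FTSE` for the index; the helper names are hypothetical, and the date range is left to the caller because the article does not state one.

```python
TICKER = "^FTSE"  # Yahoo Finance symbol for the FTSE 100 index


def fetch_close_prices(start, end, ticker=TICKER):
    """Download daily prices and keep only the closing level."""
    # Imported here so the third-party dependency is only needed when fetching.
    import yfinance as yf

    history = yf.Ticker(ticker).history(start=start, end=end)
    return history[["Close"]]


def plot_close_prices(prices, title="FTSE 100 daily close"):
    """Line plot of the closing level over time."""
    import matplotlib.pyplot as plt

    ax = prices["Close"].plot(figsize=(12, 5), title=title)
    ax.set_xlabel("Date")
    ax.set_ylabel("Index level")
    plt.show()


# Usage (needs network access; the dates are placeholders):
# prices = fetch_close_prices("2015-01-01", "2021-01-01")
# plot_close_prices(prices)
```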
Next we perform a tiny bit of data wrangling to squish the dataframe into the shape Prophet is expecting.
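As a sketch of that wrangling, assuming the raw prices arrive as a dataframe indexed by date with a `Close` column (the helper name `to_prophet_frame` is hypothetical):

```python
import pandas as pd


def to_prophet_frame(prices, value_col="Close"):
    """Reshape a date-indexed price dataframe into Prophet's ds/y layout."""
    frame = prices[[value_col]].reset_index()
    frame.columns = ["ds", "y"]
    # Prophet expects timezone-naive datestamps, so drop any timezone info.
    frame["ds"] = pd.to_datetime(frame["ds"]).dt.tz_localize(None)
    # Discard days with no closing value.
    return frame.dropna(subset=["y"]).reset_index(drop=True)
```

Stripping the timezone matters because price feeds often return timezone-aware timestamps, which Prophet rejects.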
We now have the dataframe Prophet requires. Super simple to set up!
Next we will define our model, and fit it to our dataset. For the purpose of this article, we will leave EVERY hyperparameter (the parameters used in the training of the model) as default, to showcase the “out-of-the-box” solution.
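A minimal sketch of fitting a default model and forecasting over a held-out period could look like this. The helper names and the idea of splitting at a cutoff date are assumptions made for illustration; `Prophet()`, `fit`, `predict` and `plot` are the library's standard calls.

```python
import pandas as pd


def split_at(df, cutoff):
    """Split a ds/y dataframe into history (<= cutoff) and holdout (> cutoff)."""
    cutoff = pd.Timestamp(cutoff)
    history = df[df["ds"] <= cutoff].reset_index(drop=True)
    holdout = df[df["ds"] > cutoff].reset_index(drop=True)
    return history, holdout


def fit_and_forecast(history, holdout):
    """Fit a default Prophet model on history and predict every known date."""
    # Imported here so the dependency is only needed when a model is fitted.
    from prophet import Prophet

    model = Prophet()  # every hyperparameter left at its default
    model.fit(history)
    dates = pd.concat([history[["ds"]], holdout[["ds"]]], ignore_index=True)
    forecast = model.predict(dates)
    return model, forecast


# Usage, assuming `df` is a ds/y dataframe and a placeholder cutoff date:
# history, holdout = split_at(df, "2020-06-30")
# model, forecast = fit_and_forecast(history, holdout)
# fig = model.plot(forecast)                          # history and forecast
# fig.gca().plot(holdout["ds"], holdout["y"], "r.")   # held-out actuals in red
```

When there are no held-out actuals to compare against, `model.make_future_dataframe(periods=...)` builds the future dates instead.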
The above plot shows the actual (historical) data points in black, the actual (“future”) data points in red, the Prophet forecast as the blue line, and the lower-upper banding around the forecast in light blue.
As you can observe, the model does a generally good job of fitting the historical data, bar a few outliers, and has incredibly good predictive accuracy when plotted against the “future” actuals, at least by eye.
But how successful was the model? We need to define a metric to assess its performance. MAPE (Mean Absolute Percentage Error) is used here because it provides a user-friendly error metric, expressed in percentage terms.
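MAPE is the average of |actual − forecast| / |actual| over all points, multiplied by 100. A small sketch in plain Python (the function name is hypothetical):

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, returned as a percentage."""
    pairs = list(zip(actual, predicted))
    if not pairs:
        raise ValueError("need at least one observation")
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if any(a == 0 for a, _ in pairs):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


# Two forecasts, each 10% away from the actual value
error = mape([100.0, 200.0], [110.0, 180.0])
```

Because each error is divided by the actual value, MAPE breaks down when actuals are at or near zero. That is not a concern for an index trading in the thousands.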
The results show a MAPE of 1.06%, an amazingly low figure for such an unrefined model!
Unreasonably high accuracy is normally a cause for concern, so Prophet’s cross validation tools are used here to investigate further.
These functions allow the creation of “simulated historical forecasts” where we validate our results on subsets of the training data.
This is achieved by truncating the training dataset at each forecast point, training the model, and predicting over a horizon, before validating the results against the actuals, and repeating over consecutive intervals.
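To illustrate how the forecast points are laid out, here is a simplified sketch that works backwards from the end of the data: the last cutoff sits one horizon before the final observation, and earlier cutoffs are spaced one period apart for as long as enough initial training data remains. The function name is hypothetical, and Prophet's own logic also handles gaps in the data.

```python
from datetime import date, timedelta


def simulated_cutoffs(first_day, last_day, initial, period, horizon):
    """Return the cutoff dates, oldest first, for simulated historical forecasts."""
    cutoffs = []
    cutoff = last_day - horizon  # the latest cutoff still leaves a full horizon of actuals
    while cutoff >= first_day + initial:  # keep at least `initial` of training data
        cutoffs.append(cutoff)
        cutoff -= period
    return sorted(cutoffs)


# Placeholder date range; each cutoff trains on data up to that date
# and is validated on the horizon that follows it.
cutoffs = simulated_cutoffs(
    first_day=date(2015, 1, 1),
    last_day=date(2020, 12, 31),
    initial=timedelta(days=1095),
    period=timedelta(days=90),
    horizon=timedelta(days=180),
)
```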
The code below uses this cross validation approach, starting with 3 years of initial training data (initial = ‘1095 days’), forecasting 180 days in advance (horizon = ‘180 days’) and repeating this process every 90 days (period = ‘90 days’).
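A sketch of that call, using Prophet's diagnostics module with the three settings above. The wrapper function and constant names are hypothetical; `model` is assumed to be an already fitted Prophet model.

```python
INITIAL = "1095 days"  # training data available before the first cutoff
PERIOD = "90 days"     # spacing between consecutive cutoffs
HORIZON = "180 days"   # how far ahead each simulated forecast runs


def run_cross_validation(model):
    """Run simulated historical forecasts and summarise the error by horizon."""
    # Imported here so the dependency is only needed when this is called.
    from prophet.diagnostics import cross_validation, performance_metrics
    from prophet.plot import plot_cross_validation_metric

    df_cv = cross_validation(model, initial=INITIAL, period=PERIOD, horizon=HORIZON)
    metrics = performance_metrics(df_cv)  # includes a "mape" column per horizon
    fig = plot_cross_validation_metric(df_cv, metric="mape")
    return df_cv, metrics, fig


# Usage, assuming `model` was fitted on the full ds/y dataframe:
# df_cv, metrics, fig = run_cross_validation(model)
```

Note that `performance_metrics` reports MAPE as a fraction, so 0.1166 corresponds to 11.66%.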
The plot above shows a clear trend of increasing MAPE as the horizon length increases, an insight we could intuit.
The cross validation MAPE came out at 11.66% across this ensemble of forecasts. In other words, the 1.06% MAPE of our FTSE 100 forecast was outrageously low compared with a “typical” forecast over that horizon. So it’s a good job we didn’t get too excited about the performance of the model!
In this article, we looked at Prophet from a very high level and both implemented and evaluated a simplistic model.
The next article in this series will take a deeper look at hyperparameter tuning and get “under the hood” of the model to see how these forecasts are created.