5.1 A tidy forecasting workflow

The process of producing forecasts for time series data can be broken down into a few steps.

To illustrate the process, we will fit linear trend models to national GDP data stored in global_economy.

Data preparation (tidy)

The first step in forecasting is to prepare data in the correct format. This process may involve loading in data, identifying missing values, filtering the time series, and other pre-processing tasks. The functionality provided by tsibble and other packages in the tidyverse substantially simplifies this step.

Models differ in their data requirements: some require the series to be in time order, while others require that there are no missing values. Checking your data is essential to understanding its features and should always be done before models are estimated.
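Checks of this kind can be sketched with a few tsibble and dplyr functions. The code below is a minimal illustration rather than a complete audit. It assumes the fpp3 package is installed and attached, so that global_economy and the tsibble helpers has_gaps() and is_ordered() are available. The object names gaps and missing_gdp are our own, introduced only for this example.

```r
# Assumption: fpp3 is installed; it attaches tsibble, tsibbledata and dplyr.
library(fpp3)

# Implicit gaps: years absent from the index, checked for each country.
# Returns one row per key with a logical column named .gaps.
gaps <- global_economy %>%
  has_gaps()

# Explicit missing values: count NA entries of GDP for each country,
# keeping only the countries that have at least one.
missing_gdp <- global_economy %>%
  as_tibble() %>%
  group_by(Country) %>%
  summarise(n_missing = sum(is.na(GDP))) %>%
  filter(n_missing > 0)

# Time order: TRUE if the rows are stored in time order within each key.
ordered <- is_ordered(global_economy)
```

Printing gaps, missing_gdp and ordered shows which series need attention before modelling. How to handle any problems found (dropping a series, filling gaps with fill_gaps(), or choosing a model that tolerates missing values) depends on the model being used.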

We will model GDP per capita over time, so we must first compute the relevant variable.

gdppc <- global_economy %>%
  mutate(GDP_per_capita = GDP / Population)

Plot the data (visualise)

As we have seen in Chapter 2, visualisation is an essential step in understanding the data. Looking at your data allows you to identify common patterns, and subsequently specify an appropriate model.

The data for one country in our example are plotted in Figure 5.1.

gdppc %>%
  filter(Country == "Sweden") %>%
  autoplot(GDP_per_capita) +