10.5 Dynamic harmonic regression
When there are long seasonal periods, a dynamic regression with Fourier terms is often better than other models we have considered in this book.
For example, daily data can have annual seasonality of length 365, weekly data has seasonal period of approximately 52, while half-hourly data can have several seasonal periods, the shortest of which is the daily pattern of period 48.
Seasonal versions of ARIMA and ETS models are designed for shorter periods such as 12 for monthly data or 4 for quarterly data. The
ETS() model restricts seasonality to be a maximum period of 24 to allow hourly data but not data with a larger seasonal frequency. The problem is that there are \(m-1\) parameters to be estimated for the initial seasonal states where \(m\) is the seasonal period. So for large \(m\), the estimation becomes almost impossible.
ARIMA() function will allow a seasonal period up to \(m=350\), but in practice will usually run out of memory whenever the seasonal period is more than about 200. In any case, seasonal differencing of high order does not make a lot of sense — for daily data it involves comparing what happened today with what happened exactly a year ago and there is no constraint that the seasonal pattern is smooth.
So for such time series, we prefer a harmonic regression approach where the seasonal pattern is modelled using Fourier terms with short-term time series dynamics handled by an ARMA error.
The advantages of this approach are:
- it allows any length seasonality;
- for data with more than one seasonal period, Fourier terms of different frequencies can be included;
- the smoothness of the seasonal pattern can be controlled by \(K\), the number of Fourier sin and cos pairs – the seasonal pattern is smoother for smaller values of \(K\);
- the short-term dynamics are easily handled with a simple ARMA error.
The only real disadvantage (compared to a seasonal ARIMA model) is that the seasonality is assumed to be fixed — the seasonal pattern is not allowed to change over time. But in practice, seasonality is usually remarkably constant so this is not a big disadvantage except for long time series.
Example: Australian eating out expenditure
In this example we demonstrate combining Fourier terms for capturing seasonality with ARIMA errors capturing other dynamics in the data. For simplicity, we will use an example with monthly data. The same modelling approach using weekly data is discussed in Section ??.
We use the total monthly expenditure on cafes, restaurants and takeaway food services in Australia ($billion) from 2004 up to 2018 and forecast 24 months ahead. We vary \(K\), the number of Fourier sin and cos pairs, from \(K=1\) to \(K=6\) (which is equivalent to including seasonal dummies). Figure 10.11 shows the seasonal pattern projected forward as \(K\) increases. Notice that as \(K\) increases the Fourier terms capture and project a more “wiggly” seasonal pattern and simpler ARIMA models are required to capture other dynamics. The AICc value is minimised for \(K=5\), with a significant jump going from \(K=4\) to \(K=5\), hence the forecasts generated from this model would be the ones used.
aus_cafe <- aus_retail %>% filter( Industry == "Cafes, restaurants and takeaway food services", year(Month) %in% 2004:2018 ) %>% summarise(Turnover = sum(Turnover)) fit <- aus_cafe %>% model( `K = 1` = ARIMA(log(Turnover) ~ fourier(K = 1) + PDQ(0,0,0)), `K = 2` = ARIMA(log(Turnover) ~ fourier(K = 2) + PDQ(0,0,0)), `K = 3` = ARIMA(log(Turnover) ~ fourier(K = 3) + PDQ(0,0,0)), `K = 4` = ARIMA(log(Turnover) ~ fourier(K = 4) + PDQ(0,0,0)), `K = 5` = ARIMA(log(Turnover) ~ fourier(K = 5) + PDQ(0,0,0)), `K = 6` = ARIMA(log(Turnover) ~ fourier(K = 6) + PDQ(0,0,0)) ) fit %>% forecast(h = "2 years") %>% autoplot(aus_cafe) + facet_wrap(vars(.model), ncol = 2) + guides(colour = FALSE) + geom_label( aes(x = yearmonth("2007 Jan"), y = 4250, label = paste0("AICc = ", format(AICc))), data = glance(fit) )