## 7.7 Measuring strength of trend and seasonality

A time series decomposition can be used to measure the strength of trend and seasonality in a time series (Wang, Smith, & Hyndman, 2006). Recall that the decomposition is written as

$$y_t = T_t + S_t + R_t,$$

where $T_t$ is the smoothed trend component, $S_t$ is the seasonal component and $R_t$ is a remainder component. For strongly trended data, the seasonally adjusted data ($T_t + R_t$) should have much more variation than the remainder component. Therefore $\text{Var}(R_t)/\text{Var}(T_t+R_t)$ should be relatively small. But for data with little or no trend, the two variances should be approximately the same. So we define the strength of trend as:

$$F_T = \max\left(0, 1 - \frac{\text{Var}(R_t)}{\text{Var}(T_t+R_t)}\right).$$

This gives a measure of the strength of the trend between 0 and 1. Because the variance of the remainder might occasionally be even larger than the variance of the seasonally adjusted data, we set the minimum possible value of $F_T$ to zero.

The strength of seasonality is defined similarly, but with respect to the detrended data ($S_t + R_t$) rather than the seasonally adjusted data:

$$F_S = \max\left(0, 1 - \frac{\text{Var}(R_t)}{\text{Var}(S_t+R_t)}\right).$$

A series with seasonal strength $F_S$ close to 0 exhibits almost no seasonality, while a series with strong seasonality will have $F_S$ close to 1 because $\text{Var}(R_t)$ will be much smaller than $\text{Var}(S_t+R_t)$.
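To make the two definitions concrete, the following is a minimal sketch that computes $F_T$ and $F_S$ directly from the components of an STL decomposition. It uses only base R and the built-in `AirPassengers` series, which is an assumption made for illustration and not a dataset used in this section; the helper name `strength()` is likewise hypothetical. Because the STL settings here (log transform, periodic seasonal window) are arbitrary choices, the values need not match those reported by other tools.

```r
# Decompose a built-in monthly series with STL from the stats package.
# The log transform and s.window = "periodic" are illustrative choices.
fit <- stl(log(AirPassengers), s.window = "periodic")

# Extract the three components as plain numeric vectors.
components <- fit$time.series
trend      <- as.numeric(components[, "trend"])
seasonal   <- as.numeric(components[, "seasonal"])
remainder  <- as.numeric(components[, "remainder"])

# Hypothetical helper: max(0, 1 - Var(R) / Var(component + R)).
strength <- function(component, remainder) {
  max(0, 1 - var(remainder) / var(component + remainder))
}

F_T <- strength(trend, remainder)     # compares R with seasonally adjusted data T + R
F_S <- strength(seasonal, remainder)  # compares R with detrended data S + R

print(c(trend_strength = F_T, seasonal_strength = F_S))
```

Both values are guaranteed to lie between 0 and 1: the variance ratio is never negative, and `max(0, ...)` handles the case where the remainder varies more than the combined series.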

These measures can be useful, for example, when you have a large collection of time series and need to find the series with the most trend or the most seasonality. They can be computed using the `features()` function:

```r
us_retail_employment %>%
  features(Employed, feature_set(tags = "stl"))
#> # A tibble: 1 x 10
#>   Series_ID trend_strength seasonal_streng… seasonal_peak_y… seasonal_trough…
#>   <chr>              <dbl>            <dbl>            <dbl>            <dbl>
#> 1 CEU42000…          0.999            0.983                0                3
#> # … with 5 more variables: spikiness <dbl>, linearity <dbl>, curvature <dbl>,
#> #   stl_e_acf1 <dbl>, stl_e_acf10 <dbl>
```
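For the large-collection use case, one possible approach is to compute the features for every series at once and then sort by the measure of interest. The sketch below assumes the fpp3 package is installed and uses the `tourism` tsibble (a collection of quarterly series that ships with tsibble) purely as an illustrative dataset; it is not part of this section. It also assumes that `feat_stl` from the feasts package returns a `trend_strength` column like the one shown above.

```r
library(fpp3)  # assumed installed; attaches tsibble, feasts, dplyr and related packages

# Compute STL-based features for every series in the collection.
# features() returns one row per series, identified by the key columns.
tourism_features <- tourism %>%
  features(Trips, feat_stl)

# Sort from strongest to weakest trend and keep the top five series.
tourism_features %>%
  arrange(desc(trend_strength)) %>%
  head(5)
```

Sorting on the seasonal strength column instead would rank the series by seasonality in the same way.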

### Bibliography

Wang, X., Smith, K. A., & Hyndman, R. J. (2006). Characteristic-based clustering for time series data. Data Mining and Knowledge Discovery, 13(3), 335–364. https://robjhyndman.com/publications/ts-clustering/