4.4 Other features

Many more features are possible, and the feasts package computes only a few dozen features that have proven useful in time series analysis. It is also easy to add your own features by writing an R function which takes a univariate time series input and returns a numerical vector containing the feature values.
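As a minimal sketch of what such a user-written feature might look like, the function below takes a numeric series and returns a named numeric vector. The function name `cv_features` and the two quantities it computes are hypothetical choices made for illustration only; they are not part of the feasts package.

```r
# Hypothetical custom feature function (illustrative, not part of feasts).
# Input:  a univariate series as a numeric vector.
# Output: a named numeric vector of feature values.
cv_features <- function(x) {
  x <- as.numeric(x)
  x <- x[!is.na(x)]  # drop missing values before computing summaries
  c(
    cv = sd(x) / mean(x),                # coefficient of variation
    iqr_ratio = IQR(x) / diff(range(x))  # share of the full range covered by the IQR
  )
}

# Small made-up series, used only to show the shape of the result
y <- c(10, 12, 9, 14, 11, 13, 10, 15)
cv_features(y)
```

In a feasts workflow, a function of this shape could be supplied to `features()`, for example `features(Trips, cv_features)`, where the column name `Trips` is only illustrative.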

The remaining features in the feasts package, not previously discussed, are listed here for reference. The details of some of them are discussed later in the book.

  • coef_hurst will calculate the Hurst coefficient of a time series, which is a measure of “long memory”. A series with long memory will have significant autocorrelations for many lags.
  • spectral_entropy will compute the (Shannon) spectral entropy of a time series, which is a measure of how easy the series is to forecast. A series which has strong trend and seasonality (and so is easy to forecast) will have entropy close to 0. A series that is very noisy (and so is difficult to forecast) will have entropy close to 1.
  • bp_stat gives the Box-Pierce statistic for testing if a time series is white noise, while bp_pvalue gives the p-value from that test. This test is discussed in Section 5.4.
  • lb_stat gives the Ljung-Box statistic for testing if a time series is white noise, while lb_pvalue gives the p-value from that test. This test is discussed in Section 5.4.
  • The \(k\)th partial autocorrelation measures the relationship between observations \(k\) periods apart after removing the effects of the observations between them. So the first partial autocorrelation (\(k=1\)) is identical to the first autocorrelation, because there is nothing between them to remove. Partial autocorrelations are discussed in Section 9.5. The pacf5 feature contains the sum of squares of the first five partial autocorrelations.
  • diff1_pacf5 contains the sum of squares of the first five partial autocorrelations from the differenced data.
  • diff2_pacf5 contains the sum of squares of the first five partial autocorrelations from the twice-differenced data.
  • season_pacf contains the partial autocorrelation at the first seasonal lag.
  • kpss_stat gives the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) statistic for testing if a series is stationary, while kpss_pvalue gives the p-value from that test. This test is discussed in Section 9.1.
  • pp_stat gives the Phillips-Perron statistic for testing if a series is non-stationary, while pp_pvalue gives the p-value from that test.
  • ndiffs gives the number of differences required to lead to a stationary series based on the KPSS test. This is discussed in Section 9.1.
  • nsdiffs gives the number of seasonal differences required to make a series stationary. This is discussed in Section 9.1.
  • var_tiled_mean gives the variance of the “tiled means” (i.e., the means of consecutive non-overlapping blocks of observations). The default tile length is either 10 (for non-seasonal data) or the length of the seasonal period. This is sometimes called the “stability” feature.
  • var_tiled_var gives the variance of the “tiled variances” (i.e., the variances of consecutive non-overlapping blocks of observations). This is sometimes called the “lumpiness” feature.
  • shift_level_max finds the largest mean shift between two consecutive sliding windows of the time series. This is useful for finding sudden jumps or drops in a time series.
  • shift_level_index gives the index at which the largest mean shift occurs.
  • shift_var_max finds the largest variance shift between two consecutive sliding windows of the time series. This is useful for finding sudden changes in the volatility of a time series.
  • shift_var_index gives the index at which the largest variance shift occurs.
  • shift_kl_max finds the largest distributional shift (based on the Kullback-Leibler divergence) between two consecutive sliding windows of the time series. This is useful for finding sudden changes in the distribution of a time series.
  • shift_kl_index gives the index at which the largest KL shift occurs.
  • n_crossing_points computes the number of times a time series crosses the median.
  • n_flat_spots computes the number of sections of the data where the series is relatively unchanging.
  • stat_arch_lm returns the statistic based on the Lagrange Multiplier (LM) test of Engle (1982) for autoregressive conditional heteroscedasticity (ARCH).
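
To make a few of the simpler definitions in this list concrete, the following base R sketch computes rough analogues of the partial autocorrelation, tiled and crossing-point features. All function names here are hypothetical, and the code is not the feasts implementation: the package may standardize the series, treat incomplete tiles or ties with the median differently, and so return different values.

```r
# Illustrative base R analogues of a few features (not the feasts code).

# Sum of squares of the first `lags` partial autocorrelations (cf. pacf5)
pacf_sum_squares <- function(x, lags = 5) {
  sum(pacf(x, lag.max = lags, plot = FALSE)$acf^2)
}

# Apply f to consecutive, non-overlapping blocks ("tiles") of `size`
# observations; an incomplete final block is dropped (an assumption).
tile_apply <- function(x, size, f) {
  n_tiles <- length(x) %/% size
  vapply(
    seq_len(n_tiles),
    function(i) f(x[((i - 1) * size + 1):(i * size)]),
    numeric(1)
  )
}

# Variance of the tiled means ("stability") and of the tiled variances
# ("lumpiness"), using a tile length of 10 as for non-seasonal data
tiled_mean_variance <- function(x, size = 10) var(tile_apply(x, size, mean))
tiled_var_variance <- function(x, size = 10) var(tile_apply(x, size, var))

# Number of times consecutive observations fall on opposite sides of the median
crossing_points <- function(x) {
  above <- x > median(x)
  sum(above[-1] != above[-length(x)])
}

# Simulated data, for illustration only
set.seed(1)
noise <- rnorm(200)                          # white noise
walk <- cumsum(noise)                        # random walk
stepped <- noise + rep(c(0, 5), each = 100)  # white noise with a level shift

pacf_sum_squares(walk)                         # analogue of pacf5
pacf_sum_squares(diff(walk))                   # analogue of diff1_pacf5
pacf_sum_squares(diff(walk, differences = 2))  # analogue of diff2_pacf5 (twice-differenced)

tiled_mean_variance(noise)
tiled_mean_variance(stepped)  # the level shift inflates the variance of the tiled means
tiled_var_variance(noise)

crossing_points(noise)
crossing_points(stepped)
```

Writing the features as small functions of a numeric vector keeps each definition easy to inspect, and matches the shape required for user-written features.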