4.4 Other features

Many more features are possible, and the feasts package computes only a few dozen features that have proven useful in time series analysis. It is also easy to add your own features by writing an R function that takes a univariate time series input and returns a numerical vector containing the feature values.
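As a minimal sketch of such a user-written feature function, the example below returns two values for a numeric series. The name `my_features` and the two quantities it computes (coefficient of variation and spread) are illustrative assumptions, not part of feasts. The call to `features()` only runs if the feasts and tsibble packages are installed, and it assumes the `tourism` data set supplied with tsibble.

```r
# Hypothetical custom feature function (not part of feasts).
# It takes a numeric vector and returns a named numeric vector;
# the names are used as column names in the output of features().
my_features <- function(x) {
  c(
    cv = sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE),  # coefficient of variation
    spread = diff(range(x, na.rm = TRUE))              # maximum minus minimum
  )
}

# Direct use on a plain numeric vector
my_features(c(1, 2, 3))

# Use with features(), assuming feasts and tsibble are available
if (requireNamespace("feasts", quietly = TRUE) &&
    requireNamespace("tsibble", quietly = TRUE)) {
  library(feasts)
  print(tsibble::tourism |> features(Trips, my_features))
}
```

Returning a named vector keeps each feature in its own column, with one row per series.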

The remaining features in the feasts package, not previously discussed, are listed here for reference. The details of some of them are discussed later in the book.

  • coef_hurst calculates the Hurst coefficient of a time series, which is a measure of “long memory”. A series with long memory will have significant autocorrelations for many lags.
  • feat_spectral will compute the (Shannon) spectral entropy of a time series, which is a measure of how easy the series is to forecast. A series which has strong trend and seasonality (and so is easy to forecast) will have entropy close to 0. A series that is very noisy (and so is difficult to forecast) will have entropy close to 1.
  • box_pierce gives the Box-Pierce statistic for testing if a time series is white noise, and the corresponding p-value. This test is discussed in Section 5.4.
  • ljung_box gives the Ljung-Box statistic for testing if a time series is white noise, and the corresponding p-value. This test is discussed in Section 5.4.
  • The \(k\)th partial autocorrelation measures the relationship between observations \(k\) periods apart after removing the effects of observations between them. So the first partial autocorrelation (\(k=1\)) is identical to the first autocorrelation, because there is nothing between consecutive observations to remove. Partial autocorrelations are discussed in Section 9.5. The feat_pacf function contains several features involving partial autocorrelations including the sum of squares of the first five partial autocorrelations for the original series, the first-differenced series and the second-differenced series. For seasonal data, it also includes the partial autocorrelation at the first seasonal lag.
  • unitroot_kpss gives the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) statistic for testing if a series is stationary, and the corresponding p-value. This test is discussed in Section 9.1.
  • unitroot_pp gives the Phillips-Perron statistic for testing if a series is non-stationary, and the corresponding p-value.
  • unitroot_ndiffs gives the number of differences required to make a series stationary, based on the KPSS test. This is discussed in Section 9.1.
  • unitroot_nsdiffs gives the number of seasonal differences required to make a series stationary. This is discussed in Section 9.1.
  • var_tiled_mean gives the variance of the “tiled means” (i.e., the means of consecutive non-overlapping blocks of observations). The default tile length is either 10 (for non-seasonal data) or the length of the seasonal period. This is sometimes called the “stability” feature.
  • var_tiled_var gives the variance of the “tiled variances” (i.e., the variances of consecutive non-overlapping blocks of observations). This is sometimes called the “lumpiness” feature.
  • shift_level_max finds the largest mean shift between two consecutive sliding windows of the time series. This is useful for finding sudden jumps or drops in a time series.
  • shift_level_index gives the index at which the largest mean shift occurs.
  • shift_var_max finds the largest variance shift between two consecutive sliding windows of the time series. This is useful for finding sudden changes in the volatility of a time series.
  • shift_var_index gives the index at which the largest variance shift occurs.
  • shift_kl_max finds the largest distributional shift (based on the Kullback-Leibler divergence) between two consecutive sliding windows of the time series. This is useful for finding sudden changes in the distribution of a time series.
  • shift_kl_index gives the index at which the largest KL shift occurs.
  • n_crossing_points computes the number of times a time series crosses the median.
  • longest_flat_spot computes the length of the longest section of the data where the series is relatively unchanging.
  • stat_arch_lm returns the statistic based on the Lagrange Multiplier (LM) test of Engle (1982) for autoregressive conditional heteroscedasticity (ARCH).
  • guerrero computes the optimal \(\lambda\) value for a Box-Cox transformation using the Guerrero method (discussed in Section 3.1).
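
To make a few of these definitions concrete, the following base R sketch computes simplified versions of the “stability”, “lumpiness” and crossing-points features. The function names (`tile_blocks`, `tiled_mean_var`, `tiled_var_var`, `median_crossings`) are hypothetical and are not the feasts implementations, which may differ in details such as scaling the series or handling incomplete blocks.

```r
# Illustrative helpers with hypothetical names (not part of feasts).

# Split x into consecutive non-overlapping blocks of length `size`,
# dropping any incomplete block at the end.
tile_blocks <- function(x, size = 10) {
  n_blocks <- length(x) %/% size
  x <- x[seq_len(n_blocks * size)]
  split(x, rep(seq_len(n_blocks), each = size))
}

# "Stability": variance of the block means.
tiled_mean_var <- function(x, size = 10) {
  var(vapply(tile_blocks(x, size), mean, numeric(1)))
}

# "Lumpiness": variance of the block variances.
tiled_var_var <- function(x, size = 10) {
  var(vapply(tile_blocks(x, size), var, numeric(1)))
}

# Number of times the series moves from one side of its median to the other.
median_crossings <- function(x) {
  above <- x > median(x)
  sum(above[-1] != above[-length(above)])
}

# A toy series with one level shift up and one back down
x <- c(rep(0, 10), rep(4, 10), rep(0, 10))
tiled_mean_var(x)    # block means are 0, 4, 0
tiled_var_var(x)     # every block is constant, so this is 0
median_crossings(x)  # 2: once going up, once coming back down
```

The toy series is constant within each block, so all of its variation shows up in the tiled means and none in the tiled variances.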