2.10 Exercises

  1. Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec.

    • Use ? (or help()) to find out about the data in each series.
    • What is the time interval of each series?
    • Use autoplot() to produce a time plot of each series.
    • For the last plot, modify the axis labels and title.
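As a starting point for the last series, the steps might be sketched as follows. This assumes the fpp3 package is installed; the title, subtitle and axis label text are only suggestions, not wording from the book.

```r
library(fpp3)  # attaches tsibble, tsibbledata, feasts, ggplot2, dplyr, ...

# Time interval of the series: interval() reports the spacing of the index
print(interval(vic_elec))

# Time plot of Demand with a custom title and axis labels
# (the label text below is illustrative only)
p <- vic_elec |>
  autoplot(Demand) +
  labs(
    title = "Half-hourly electricity demand",
    subtitle = "Victoria, Australia",
    x = "Time",
    y = "Demand (MWh)"
  )
p
```

The same pattern applies to the other three series by swapping the data set and the variable passed to autoplot().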
  2. Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.
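One possible approach, sketched under the assumption that fpp3 is loaded, is to group by stock before filtering so that max() is evaluated within each stock. The object name peak_days is hypothetical.

```r
library(fpp3)

peak_days <- gafa_stock |>
  group_by(Symbol) |>             # one group per stock
  filter(Close == max(Close)) |>  # keep the row(s) at each stock's highest close
  ungroup() |>
  select(Symbol, Date, Close)
peak_days
```

Without group_by(), max(Close) would be taken over all four stocks together and only the single highest-priced stock would be returned.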

  3. Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.

    1. You can read the data into R with the following script:

      tute1 <- readr::read_csv("tute1.csv")
      View(tute1)
    2. Convert the data to a time series:

      mytimeseries <- tute1 |>
        mutate(Quarter = yearquarter(Quarter)) |>
        as_tsibble(index = Quarter)
    3. Construct time series plots of each of the three series:

      mytimeseries |>
        pivot_longer(-Quarter) |>
        ggplot(aes(x = Quarter, y = value, colour = name)) +
        geom_line() +
        facet_grid(name ~ ., scales = "free_y")

      Check what happens when you don’t include facet_grid().

  4. The USgas package contains data on the demand for natural gas in the US.

    1. Install the USgas package.
    2. Create a tsibble from us_total with year as the index and state as the key.
    3. Plot the annual natural gas consumption by state for the New England area (comprising the states of Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island).
  5. Work through the following steps using the file tourism.xlsx.

    1. Download tourism.xlsx from the book website and read it into R using readxl::read_excel().
    2. Create a tsibble which is identical to the tourism tsibble from the tsibble package.
    3. Find what combination of Region and Purpose had the maximum number of overnight trips on average.
    4. Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.
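The last two steps might be sketched as below. To keep the example runnable without the Excel file, it uses the built-in tourism tsibble as a stand-in for the one rebuilt from tourism.xlsx; the names top_pair, mean_trips and state_trips are hypothetical.

```r
library(fpp3)

# Average trips for every Region/Purpose pair, then keep the largest.
# as_tibble() drops the time index so that summarise() averages over time.
top_pair <- tourism |>
  as_tibble() |>
  group_by(Region, Purpose) |>
  summarise(mean_trips = mean(Trips), .groups = "drop") |>
  slice_max(mean_trips, n = 1)
top_pair

# Total trips by State. On a tsibble, summarise() keeps the index,
# so Region and Purpose are summed out within each quarter.
state_trips <- tourism |>
  group_by(State) |>
  summarise(Trips = sum(Trips))
state_trips
```

The two halves treat time differently on purpose: the first collapses over time to get one average per pair, while the second keeps one observation per State per quarter.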
  6. The aus_arrivals data set comprises quarterly international arrivals to Australia from Japan, New Zealand, UK and the US.

    • Use autoplot(), gg_season() and gg_subseries() to compare the differences between the arrivals from these four countries.
    • Can you identify any unusual observations?
  7. Monthly Australian retail data is provided in aus_retail. Select one of the time series as follows (but choose your own seed value):

    set.seed(12345678)
    myseries <- aus_retail |>
      filter(`Series ID` == sample(aus_retail$`Series ID`,1))

    Explore your chosen retail time series using the following functions:

    autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() |> autoplot()

    Can you spot any seasonality, cyclicity and trend? What do you learn about the series?

  8. Use the graphics functions autoplot(), gg_season(), gg_subseries(), gg_lag() and ACF() to explore features of the following time series: “Total Private” Employed from us_employment, Bricks from aus_production, Hare from pelt, “H02” Cost from PBS, and Barrels from us_gasoline.

    • Can you spot any seasonality, cyclicity and trend?
    • What do you learn about the series?
    • What can you say about the seasonal patterns?
    • Can you identify any unusual years?
  9. The following time plots and ACF plots correspond to four different time series. Your task is to match each time plot in the first row with one of the ACF plots in the second row.

  10. The aus_livestock data contains the monthly total number of pigs slaughtered in Victoria, Australia, from Jul 1972 to Dec 2018. Use filter() to extract pig slaughters in Victoria between 1990 and 1995. Use autoplot() and ACF() for this data. How do they differ from white noise? If a longer period of data is used, what difference does it make to the ACF?
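A minimal sketch of the extraction and plots follows, assuming "between 1990 and 1995" includes both end years. The names pigs, pigs_acf and bound are hypothetical, and the final lines use the usual approximate 95% white-noise limits of ±1.96/√T as a rough yardstick.

```r
library(fpp3)

# Pig slaughters in Victoria, Jan 1990 to Dec 1995 (both end years assumed included)
pigs <- aus_livestock |>
  filter(Animal == "Pigs", State == "Victoria",
         year(Month) >= 1990, year(Month) <= 1995)

pigs |> autoplot(Count)

pigs_acf <- pigs |> ACF(Count)
pigs_acf |> autoplot()

# Approximate 95% white-noise bounds for the autocorrelations: +/- 1.96 / sqrt(T)
bound <- 1.96 / sqrt(nrow(pigs))
# Proportion of autocorrelations falling outside those bounds
mean(abs(pigs_acf$acf) > bound)
```

Dropping the two year() conditions gives the full series, which can be passed through the same ACF() call to compare.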

  11. Consider the daily changes in Google closing stock prices.

    1. Use the following code to compute them.

      dgoog <- gafa_stock |>
        filter(Symbol == "GOOG", year(Date) >= 2018) |>
        mutate(trading_day = row_number()) |>
        update_tsibble(index = trading_day, regular = TRUE) |>
        mutate(diff = difference(Close))
    2. Why was it necessary to re-index the tsibble?

    3. Plot these differences and their ACF.

    4. Do the changes in the stock prices look like white noise?
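The plotting step might be sketched as follows. The block repeats the code from the exercise so that it runs on its own; it assumes fpp3 is loaded and does not attempt to answer the interpretive questions.

```r
library(fpp3)

# Daily changes in the GOOG closing price, indexed by trading day
# (same code as given in the exercise)
dgoog <- gafa_stock |>
  filter(Symbol == "GOOG", year(Date) >= 2018) |>
  mutate(trading_day = row_number()) |>
  update_tsibble(index = trading_day, regular = TRUE) |>
  mutate(diff = difference(Close))

# Time plot of the differences; the first value is NA, so ggplot2
# drops one row with a warning
dgoog |> autoplot(diff)

# ACF of the differences
dgoog |> ACF(diff) |> autoplot()
```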