STA 364 Spring 2025

Take-Home Midterm Exam

Due: Friday, January 24th, at 9 am.

Rules

Write up your solutions in a Quarto (.qmd) file and render it to an HTML file called exam-01_LAST_NAME.html. The file must include your code, output, and write-up for each question. When a table has more than 20 rows, use head() to show only the first rows.
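As a minimal sketch of the 20-row rule, the snippet below prints a long table through head() and a short one in full. The data frame `results` and its columns are made up for illustration and are not part of the exam data.

```r
# Hypothetical results table with 50 rows (made-up data for illustration)
results <- data.frame(id = 1:50, value = (1:50)^2)

# Show the full table only when it has 20 rows or fewer
if (nrow(results) > 20) {
  shown <- head(results)   # head() returns the first 6 rows by default
} else {
  shown <- results
}

shown
```

head() also takes a second argument, for example head(results, 10), if six rows are too few to make your point.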

Note

You may add code chunks to the template provided, but give each new chunk its own unique label.

You may use the course book, your notes, and online resources for this exam. You may not use AI tools, and you may not consult anyone other than the Professor about the exam. This includes classmates and online forums, even for hypothetical questions. Anything you use that is not in the book or your notes must be cited.
You must upload the rendered HTML file. Technical difficulties are not an excuse for late work, so do not wait until the last minute. Before uploading to Moodle, verify that your HTML file includes all graphs and tables. Use the embedded resources option (embed-resources: true) in the YAML provided at the top of your template document.
Answer the questions with your analysis, output, and narrative. Code alone is not an answer.

Submission

When you are finished with your exam, be sure to Render the final document. Once it is rendered, download and submit your file as follows:

Find the .html file in your Files pane (on the bottom right of the screen).
Check the box next to the file.
Click the blue gear above the file list, then click “Export” to download the file.
Submit your final HTML document to the exam spot on Moodle.

Exam

Setup

library(fpp3)

Data

The data is in the fpp3 package and is called aus_vehicle_sales. Use ?aus_vehicle_sales to learn more about it. For this exam, you will explore SUV sales. The code below creates the full SUV series (aus_suv) and a training set (suv_train) that keeps observations through 2016.

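# Full SUV series: keep only SUV rows, then drop the now-constant Type column
aus_suv <- aus_vehicle_sales |> 
  filter(Type == "SUV") |>
  select(-Type)

# Training set: observations through the end of 2016
suv_train <- 
  aus_suv |> 
  filter(year(Month) <= 2016)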
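Before starting the questions, it can help to confirm what the split produced. The sketch below repeats the setup code so that it runs on its own, then summarises the date range and row count of each set. The helper name `describe_span` is hypothetical, and the sketch assumes the Month index supports min() and max().

```r
library(fpp3)

aus_suv <- aus_vehicle_sales |>
  filter(Type == "SUV") |>
  select(-Type)

suv_train <- aus_suv |>
  filter(year(Month) <= 2016)

# Hypothetical helper: first month, last month, and number of rows in a series
describe_span <- function(data) {
  data |>
    as_tibble() |>
    summarise(first = min(Month), last = max(Month), n_months = n())
}

describe_span(aus_suv)    # full series
describe_span(suv_train)  # training set, should end no later than December 2016
```

The months that appear in aus_suv but not in suv_train are the ones available for checking forecasts made from the training set.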

Questions

1. Plot and discuss the time series. Use autoplot(), gg_subseries(), and gg_season().
2. Investigate whether there is significant autocorrelation. Use gg_lag() and ACF() |> autoplot() to make the graphs. Does this information agree with your discussion in (1)?
3. Using the graphs you have already created, is the relationship between the seasonal pattern and the trend additive or multiplicative? Explain. If it is multiplicative, explain what that implies for modeling and forecasting the series.
4. Regardless of your previous answer, apply a log transformation to Count and then fit an STL decomposition to the series. Discuss whether the decomposition components agree with your answers in (1) and (2). You will need model(STL(log(Count))) |> components() |> autoplot().
5. The code below can be modified to fit all benchmark models and a decomposition model to forecast SUV sales in Australia for the next year. Discuss which methods are best and why.

# Template: replace DATA, VARIABLE, and "CHANGE" before running
train_models <- DATA |> 
  model(
    naive = NAIVE(log(VARIABLE)),
    snaive = SNAIVE(log(VARIABLE)),
    mean_mod = MEAN(log(VARIABLE)),
    drift_mod = RW(log(VARIABLE) ~ drift()),
    dcmp_model =  decomposition_model(STL(log(VARIABLE)),
                                      NAIVE(season_adjust)))

suv_forecast <- train_models |> forecast(h="CHANGE")

suv_forecast |> autoplot(DATA)

suv_forecast |> autoplot(DATA)+
  labs(title = "Benchmark Forecasts for SUV Sales in Australia")

suv_forecast |> autoplot(filter(DATA,year(Month)>2015))+
  labs(title = "Benchmark Forecasts for SUV Sales in Australia",
       subtitle = "2016 to 2019",
       caption = "Earlier years left out for visualization purposes")

6. Now compare the models using accuracy metrics. Use the provided code to get the metrics, then choose the best model.

train_models |> accuracy() |> select(COMMA SEPARATED LIST OF METRICS)

suv_forecast |> accuracy(DATA) |> select(COMMA SEPARATED LIST OF METRICS)
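To show the shape of a filled-in metrics call without touching the exam data, here is a sketch on a different series, quarterly beer production from aus_production (loaded with fpp3). The object names `beer`, `beer_train`, `beer_fit`, and `beer_fc`, the two models, and the choice of RMSE and MAE are illustrative assumptions, not a recommendation for your answer.

```r
library(fpp3)

# A different series, used only to illustrate the accuracy() workflow
beer <- aus_production |> select(Beer)
beer_train <- beer |> filter(year(Quarter) <= 2007)

beer_fit <- beer_train |>
  model(
    snaive   = SNAIVE(Beer),
    mean_mod = MEAN(Beer)
  )

beer_fc <- beer_fit |> forecast(h = "2 years")

# Training (in-sample) accuracy: computed from the fitted models
train_acc <- beer_fit |>
  accuracy() |>
  select(.model, .type, RMSE, MAE)

# Test (out-of-sample) accuracy: forecasts compared against the full data
test_acc <- beer_fc |>
  accuracy(beer) |>
  select(.model, .type, RMSE, MAE)

train_acc
test_acc
```

The first table measures how well each model fits the training data, while the second measures how well its forecasts match observations the model did not see, so the two can rank models differently.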

7. Using the model form chosen in (6), refit that type of model on the complete data and provide forecasts (with a plot) for 2018. You should reuse parts of the code provided above.
8. Assess the final chosen model. The functions gg_tsresiduals(), report(), and augment() may help you get information about the model and relevant plots, but you can also use others.