library(fpp3)
STA 364 Spring 2025
Take-Home Midterm Exam
Due: Friday, January 24th, at 9 am.
Rules
- Your solutions must be written up in a Quarto (qmd) file and then rendered into an HTML file called exam-01_LAST_NAME.html. This file must include your code, output, and write-up for each question. When a table of results has more than 20 rows, please use the head() function to show only the first rows.
- You may add additional code chunks to the template provided, but don’t forget to change the chunk labels.
- You may use the course book, your notes, and online resources for this exam. Anything you use that is not included in the book or notes must be cited. You may not use AI tools, and you may not consult with anyone other than the Professor about this exam. You cannot post questions online or consult with each other, not even for hypothetical questions.
- You will be required to upload the HTML file from your output. Technical difficulties are not an excuse for late work, so do not wait until the last minute. Before uploading to Moodle, verify that your HTML file includes all graphs and tables. Use the embedded resources option in the YAML provided at the top of your template document.
- Your analysis, outputs, and narratives should answer the questions, not your code.
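As an illustration of the table rule above, here is a minimal sketch of trimming a long table with head(). The built-in mtcars data set (32 rows) and the names long_table and shown are stand-ins chosen for this example; they are not part of the exam data.

```r
# Hypothetical stand-in for a long results table: mtcars ships with base R
# and has 32 rows, which is over the 20-row limit.
long_table <- mtcars

# head() returns the first n rows (the default is n = 6); here we keep 10.
shown <- head(long_table, n = 10)

print(shown)
```

The same call works on tibbles and tsibbles, so it can be added to the end of a pipeline, for example with |> head(10).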
Submission
When you are finished with your exam, be sure to Render the final document. Once rendered, you can download your file by:
- Find the .html file in your Files pane (on the bottom right of the screen)
- Click the check box next to the file.
- Click the blue gear above and then click “Export” to download
- Submit your final HTML document to the exam spot on Moodle
Exam
Setup
Data
The data is found in the fpp3 package and is called aus_vehicle_sales. Use ?aus_vehicle_sales to learn more about the data. For this exam, you will explore the SUVs. The code below creates the full SUV series (aus_suv) and a training set (suv_train) that contains the observations through 2016.
aus_suv <- aus_vehicle_sales |>
  filter(Type == "SUV") |>
  select(-Type)

suv_train <- aus_suv |>
  filter(year(Month) <= 2016)
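Before starting the questions, it may help to confirm that the split behaves as expected. The sketch below assumes the fpp3 package is installed and that aus_vehicle_sales has the Type, Month, and Count columns used in the exam code; n_full, n_train, and n_holdout are names introduced here for illustration only.

```r
library(fpp3)

# Recreate the two objects from the exam code so this sketch runs on its own.
aus_suv <- aus_vehicle_sales |>
  filter(Type == "SUV") |>
  select(-Type)

suv_train <- aus_suv |>
  filter(year(Month) <= 2016)

# Count the months in each set; the difference is the hold-out period.
n_full    <- nrow(aus_suv)
n_train   <- nrow(suv_train)
n_holdout <- n_full - n_train

# Look at the last few training rows to confirm the set ends in 2016.
tail(suv_train, 3)
```

The hold-out count is the number of months available for checking forecasts made from the training set.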
Questions
1. Plot and discuss the time series. Use autoplot(), gg_subseries(), and gg_season() graphs.
2. Investigate whether there is significant autocorrelation. Use gg_lag() and ACF() |> autoplot() to make the graphs. Does this information agree with your previous discussion in (1)?
3. Using your already created graphs, is the relationship between season and trend additive or multiplicative? Explain. If multiplicative, explain what that implies about modeling and forecasting with your series.
4. Regardless of your previous answer, apply a log transformation to Count and then do an STL decomposition on your series. Discuss whether the decomposition components agree with your answers in (1) and (2). You will need model(STL(log(Count))) |> components() |> autoplot().
5. The code below can be modified to fit all benchmark models and a decomposition model to forecast the number of SUV sales in Australia over the next year. Discuss which methods are best and why.
# Replace the placeholders DATA, VARIABLE, and "CHANGE" before running.
train_models <- DATA |>
  model(
    naive = NAIVE(log(VARIABLE)),
    snaive = SNAIVE(log(VARIABLE)),
    mean_mod = MEAN(log(VARIABLE)),
    drift_mod = RW(log(VARIABLE) ~ drift()),
    dcmp_model = decomposition_model(STL(log(VARIABLE)),
                                     NAIVE(season_adjust)))

# Forecast from every fitted model over the chosen horizon.
suv_forecast <- train_models |> forecast(h = "CHANGE")

# Plot the forecasts against the data.
suv_forecast |> autoplot(DATA)

# Same plot with a title.
suv_forecast |> autoplot(DATA) +
  labs(title = "Benchmark Forecasts for SUV Sales in Australia")

# Same plot, showing only recent years so the forecasts are easier to see.
suv_forecast |> autoplot(filter(DATA, year(Month) > 2015)) +
  labs(title = "Benchmark Forecasts for SUV Sales in Australia",
       subtitle = "2016 to 2019",
       caption = "Earlier years left out for visualization purposes")
6. Now we want to compare models using metrics. Use the provided code to get model metrics, then choose the best model.
train_models |> accuracy() |> select(COMMA SEPARATED LIST OF METRICS)
suv_forecast |> accuracy(DATA) |> select(COMMA SEPARATED LIST OF METRICS)
7. Using the chosen model form in (6), refit that type of model on the complete data and provide forecasts (plot) for 2018. You should use parts of the code provided above.
8. Assess the final chosen model. You may want the functions gg_tsresiduals(), report(), and augment() to get information about the model and relevant plots, but you can also use others.