Mini Project 3

Multilevel Models

Evening Before (Thursday 9/12)

  • Pick a partner.

  • Find an appropriate data set.

    • Multilevel data may be harder to find.
    • The data you choose should have a few predictors that may be useful when modeling (ideally at least 1 quantitative and 1 categorical).
    • For this project, you may use the data from the book’s Chapter 9 Guided exercise 2 or from any of Chapter 9’s Open Ended exercises. You will not answer the book’s questions; instead, use the data from those exercises to answer your own questions.
    • I do ask that different groups choose different datasets, so choose a few options in case your first choice is taken.

Data Options: Check out Data Links on the Useful Links part of the course website. TidyTuesday has quickly accessible data.

  • Write out the anticipated cleaning and/or feature engineering steps you will need to take. Some examples:

    • Creating simplified categorical variables or transforming a continuous variable into categorical.
    • Aggregating data
    • Converting date columns
    • etc.
  • Set up your R Project(s) on the server.

  • Read your data into R. This will likely require you to download the data to your computer and then upload it to the server.
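
The cleaning steps listed above can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame `raw`, its columns (`school`, `test_date`, `hours`, `score`), the cut points, and the commented-out file path are all hypothetical stand-ins for whatever data you actually choose.

```r
# Hypothetical raw data standing in for an uploaded file.
# In practice you would read your own file instead, for example:
# raw <- read.csv("data/my_data.csv")   # hypothetical path
raw <- data.frame(
  school    = c("A", "A", "A", "B", "B", "B"),
  test_date = c("2024-01-15", "2024-02-15", "2024-03-15",
                "2024-01-15", "2024-02-15", "2024-03-15"),
  hours     = c(2, 5, 9, 1, 6, 12),
  score     = c(61, 70, 88, 55, 74, 93),
  stringsAsFactors = FALSE
)

clean <- raw

# Converting date columns: character strings to Date objects
clean$test_date <- as.Date(clean$test_date, format = "%Y-%m-%d")

# Transforming a continuous variable into a categorical one
# (the break points 4 and 8 are arbitrary, for illustration only)
clean$hours_cat <- cut(clean$hours,
                       breaks = c(0, 4, 8, Inf),
                       labels = c("low", "medium", "high"))

# Aggregating data: one mean score per school (a level-two summary)
school_means <- aggregate(score ~ school, data = clean, FUN = mean)
names(school_means)[2] <- "school_mean_score"

# Attach the group-level summary back to the observation-level rows
clean <- merge(clean, school_means, by = "school")

str(clean)
```

Writing the steps as short, commented chunks like this makes it easy to paste them into your report later and explain each one.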

Project Day

Timing

This project will run in a workshop style. The intention is for you to start and finish within class time. We will follow this timeline:

Task | Timing
Clean Data | 9:00 am - 9:30 am
Perform EDA | 9:30 am - 10:30 am
Fit, Assess, and Compare Regression Models | 10:30 am - 11:00 am, 1:00 pm - 2:00 pm
Prepare presentation | 2:00 pm - 2:20 pm
Present your findings | 2:20 pm - 2:40 pm
Submit your Final Report | Submit HTML Sunday 9/8 at 11:59 pm

Grading

Each Mini Project is worth 50 points (Labs are 10 points each).

Category | Points
R project was created the day before, data was read into the environment, and cleaning steps were written. | 2
The data chosen is appropriate, and the cleaning steps are correct and explained. | 3
EDA is thorough. All graphs and tables included are paired with a discussion. EDA supports the choice of modeling technique. | 15
The model fitting process has a logical flow. Multiple models are considered and compared using statistical tests and various metrics. Any interpreted model has been assessed using residual plots and appropriate statistical tests. | 15
The code follows a sensible order and is appropriately commented. | 5
The presentation is concise, describes the data, highlights key parts of the EDA, briefly describes the final model, interprets at least 1 coefficient in context, and discusses limitations and potential future work. | 5
The report is well written, with correct spelling and grammar. The code used is included either inline or in an appendix at the end. | 5
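
As a rough illustration of the model fitting criterion above (several models compared with a statistical test and a metric, then assessed with residual plots), here is a minimal sketch. It assumes the lme4 package is installed and uses lme4's bundled `sleepstudy` example data, not a project dataset; the model names `m_int` and `m_slope` are made up for this example, and your own project may call for different models and checks.

```r
# Assumption: the lme4 package is installed on the server.
library(lme4)

# sleepstudy: reaction times (level one) measured repeatedly on subjects (level two)
data("sleepstudy", package = "lme4")

# Model 1: random intercept for each subject
m_int <- lmer(Reaction ~ Days + (1 | Subject),
              data = sleepstudy, REML = FALSE)

# Model 2: random intercept and random slope for Days
m_slope <- lmer(Reaction ~ Days + (Days | Subject),
                data = sleepstudy, REML = FALSE)

# Compare the nested models with a likelihood ratio test
cmp <- anova(m_int, m_slope)
print(cmp)

# Compare with an information criterion (lower AIC is better)
ic <- AIC(m_int, m_slope)
print(ic)

# Residual diagnostics for the model you plan to interpret
res <- resid(m_slope)
plot(fitted(m_slope), res,
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)

qqnorm(res)
qqline(res)
```

Both models are fit with maximum likelihood (`REML = FALSE`) so that their likelihoods are directly comparable. Because a variance of zero sits on the boundary of the parameter space, the p-value from a likelihood ratio test of a random effect is only approximate, which is worth mentioning in your discussion.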

Submission

Add the format section shown below to the YAML header at the top of your final report document, and then re-render:

---
title: "Document title"
author: "my name"
format:
  html:
    embed-resources: true
---

When you are finished with your project, be sure to Render the final document. Once it has rendered, download and submit your file as follows:

  • Find the .html file in your File pane (on the bottom right of the screen)
  • Click the check box next to the file
  • Click the blue gear above and then click “Export” to download
  • Submit your final HTML document to the respective assignment on Moodle