library(tidyverse)
library(tidymodels)
library(openintro)
Lab 11 - Grading the professor, Pt. 2
Modelling with multiple predictors
In this lab we revisit the professor evaluations data we modeled in the previous lab. In the last lab we modeled evaluation scores using a single predictor at a time. This time we will use multiple predictors to model evaluation scores.
For context, review the previous lab’s introduction before continuing on to the exercises.
Learning goals
- Fitting a linear regression with multiple predictors
- Interpreting regression output in context of the data
- Comparing models
Getting started
Go to the course GitHub organization and locate your lab repo, clone it in RStudio and open the R Markdown document. Knit the document to make sure it compiles without errors.
Warm up
Let’s warm up with some simple exercises. Update the YAML of your R Markdown file with your information, knit, commit, and push your changes. Make sure to commit with a meaningful commit message. Then, go to your repo on GitHub and confirm that your changes are visible in your Rmd and md files. If anything is missing, commit and push again.
Packages
We’ll use the tidyverse package for much of the data wrangling and visualization, the tidymodels package for modeling and inference, and the data lives in the dsbox package. These packages are already installed for you. You can load them by running the following in your Console:
Data
The data can be found in the openintro package, and it’s called evals
. Since the dataset is distributed with the package, we don’t need to load it separately; it becomes available to us when we load the package. You can find out more about the dataset by inspecting its documentation, which you can access by running ?evals
in the Console or using the Help menu in RStudio to search for evals
. You can also find this information here.
Exercises
- Load the data by including the appropriate code in your R Markdown file.
Simple linear regression
- Fit a linear model (one you have fit before):
score_bty_fit
, predicting average professor evaluationscore
based on average beauty rating (bty_avg
) only. Write the linear model, and note the \(R^2\) and the adjusted \(R^2\).
🧶 ✅ ⬆️ If you haven’t done so recently, knit, commit, and push your changes to GitHub with an appropriate commit message. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.
Multiple linear regression
Fit a linear model (one you have fit before):
score_bty_gen_fit
, predicting average professor evaluationscore
based on average beauty rating (bty_avg
) andgender
. Write the linear model, and note the \(R^2\) and the adjusted \(R^2\).Interpret the slopes and intercept of
score_bty_gen_fit
in context of the data.What percent of the variability in
score
is explained by the modelscore_bty_gen_fit
.What is the equation of the line corresponding to just male professors?
For two professors who received the same beauty rating, which gender tends to have the higher course evaluation score?
How does the relationship between beauty and evaluation score vary between male and female professors?
How do the adjusted \(R^2\) values of
score_bty_gen_fit
andscore_bty_fit
compare? What does this tell us about how usefulgender
is in explaining the variability in evaluation scores when we already have information on the beauty score of the professor.Compare the slopes of
bty_avg
under the two models (score_bty_fit
andscore_bty_gen_fit
). Has the addition ofgender
to the model changed the parameter estimate (slope) forbty_avg
?Create a new model called
score_bty_rank_fit
withgender
removed andrank
added in. Write the equation of the linear model and interpret the slopes and intercept in context of the data.
🧶 ✅ ⬆️ Knit, commit, and push your changes to GitHub with an appropriate commit message. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards and review the md document on GitHub to make sure you’re happy with the final state of your work.