Exam guides

Note

This study guide is meant to help you: a student who knows everything below well should do well on the exam. It is not a contract, so there is no room to negotiate over whether exam questions are perfectly represented in this list.

Things you should know for the midterm…

  • The structure of data (what are rows, columns in a dataset)
  • The grammar of graphics (variables, aesthetics, geometries)
  • How to interpret patterns in data visualizations (e.g., general trends, outliers)
  • The different types of variables that exist (e.g., categorical, dummies)
  • The five graphs, what they tell us, and when each is appropriate
  • How to make these graphs in R
  • Use of pipes to connect functions
  • Why we would want to subset data and how to do it
  • Use of logical operators
  • Use of objects
  • Why we would want to create new variables and how to do it
  • How to make new categorical variables out of existing data
  • The summary statistics and what they tell us
  • How to calculate summary statistics, in general, and for subgroups in the data
  • How to summarize categorical variables
  • How we can “break down” broad patterns in data to test theories or concerns about the relationships we’re observing
  • What correlations are, how we measure them, and what they tell us
  • The basics of modeling: terminology and broad goals
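
As a quick self-check on the coding items above, here is a minimal sketch of several of them in one place: pipes, subsetting with logical operators, objects, new variables, summary statistics by subgroup, and correlation. It is an illustration rather than the course's own code. It assumes R 4.1 or later (for the native `|>` pipe), uses only base R and the built-in `mtcars` data, and every object name in it (`cars`, `efficient`, `heavy`, `manual`, and so on) is made up for this example. The packages and functions used in class may differ.

```r
# Built-in example data: each row is one car model, each column is one variable.
cars <- mtcars

# Subset with logical operators, then create new variables, chained with the pipe.
efficient <- cars |>
  subset(mpg > 20 & cyl == 4) |>
  transform(
    heavy  = ifelse(wt > 2.5, "heavy", "light"),  # new categorical variable
    manual = as.integer(am == 1)                  # dummy (0/1) variable
  )

# Summary statistics: overall, and for subgroups in the data.
mean_mpg   <- mean(cars$mpg)
mpg_by_cyl <- aggregate(mpg ~ cyl, data = cars, FUN = mean)

# Summarize a categorical variable with counts.
cyl_counts <- table(cars$cyl)

# Correlation between weight and fuel efficiency.
wt_mpg_cor <- cor(cars$wt, cars$mpg)
```

Printing `mpg_by_cyl` next to `mean_mpg` is one way to practice "breaking down" a broad pattern by subgroup.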

Things you should know for the final…

  • The intuition of how model parameters are estimated
  • How to fit models with different kinds of variables and interpret the output
  • How to interpret model output in multiple regression
  • Why we want to make predictions, and the mechanics of making them
  • How to assess whether our predictions are good or bad
  • The difference between prediction and causal inference
  • What “causes” means in social science terms
  • Why causal inference is difficult, including the fundamental problem of causality
  • How to use simulation to create the varying causal patterns in data that we explored
  • Why experiments work and how they differ from observational inferences
  • Using, reading, and making DAGs
  • The different confounding scenarios we discussed
  • Use of controls, in theory and practice
  • The motivation for the causal revolution, and natural experiments: why they work, how we know when we have one, and their limitations
  • Why we are uncertain, how to simulate uncertainty, how the law of large numbers saves us, and when it can’t
  • How to quantify uncertainty and its use in hypothesis testing
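
Several of the items above can be practiced together in a short simulation: creating a confounded pattern, fitting models with and without a control, making a prediction, checking its error, and seeing the law of large numbers at work. The sketch below is one possible version and is not taken from the course materials. The variable names, sample sizes, and effect sizes are arbitrary assumptions, and it uses only base R.

```r
set.seed(1)
n <- 5000

# Simulated confounding: z causes both x and y; x has no effect on y.
z <- rnorm(n)
x <- z + rnorm(n)
y <- 2 * z + rnorm(n)

# Fit a model that omits the confounder and one that controls for it.
naive      <- lm(y ~ x)
controlled <- lm(y ~ x + z)

naive_slope      <- coef(naive)["x"]       # biased away from zero
controlled_slope <- coef(controlled)["x"]  # close to the true effect of zero

# Make a prediction for a new observation and assess in-sample error (RMSE).
pred <- predict(controlled, newdata = data.frame(x = 1, z = 0))
rmse <- sqrt(mean(residuals(controlled)^2))

# Quantify uncertainty: 95% confidence interval for the coefficient on x.
ci_x <- confint(controlled)["x", ]

# Law of large numbers: a larger sample gives a mean closer to the true value of 5.
small_mean <- mean(rnorm(10, mean = 5))
large_mean <- mean(rnorm(100000, mean = 5))
```

Comparing `naive_slope` with `controlled_slope` shows how omitting a confounder can produce an association where there is no causal effect.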