Uncertainty II

POL51

Juan Tellez

UC Davis

November 27, 2024

Plan for today

Quantifying uncertainty

The confidence interval

Are we sure it’s not zero?

Where are we at?

The problem

We know our analysis is based on samples, and different samples give different answers:

Average number of kids in samples of size 10
Sample  Avg. num. of kids in sample
1       1.4
2       2.4
3       1.5
4       2.1
5       1.5
6       2.0
7       1.2
8       2.0
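
To see this variation for yourself, here is a minimal sketch in base R. The population is made up (a hypothetical `population` vector of simulated kid counts, not real survey data), so only the pattern matters, not the specific numbers.

```r
set.seed(51)  # any seed works; fixed only so results are repeatable

# Hypothetical population: 100,000 made-up "number of kids" values
population <- rpois(100000, lambda = 1.9)

# Draw 8 random samples of size 10 and record each sample's average
sample_avgs <- replicate(8, mean(sample(population, size = 10)))

round(sample_avgs, 1)  # eight different answers to the same question
mean(population)       # the "truth" the samples are trying to estimate
```

Each run of `sample()` picks a different set of 10 people, which is why the averages bounce around the population value.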

The way out

  • Turns out that if our samples are representative of the population, then estimates from large samples will tend to be pretty damn close to the true population value

  • So if sample is good ✅ and “big” ✅ then most of the time we’ll be OK ✅

But this is weird

We’ve shown that if we take many (large) random samples, most of the averages of those samples will be close to the population parameter

But in real life we only ever have one sample (e.g., one poll)

How do we get a sense for uncertainty from our one sample?

Two approaches

Statistical theory

  • Make assumptions about distribution of samples from population
  • Design test based on those assumptions (t-test, z-test, etc.)

✅ Simulation

  • Simulate different samples that look like ours
  • Use distribution of simulated samples to quantify uncertainty

In some cases both get you to the same answer; in others, only one works
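
As a rough illustration of a case where both approaches agree, the sketch below compares the textbook formula for the uncertainty of an average with a simulation-based answer. The data are made up (a hypothetical vector `x`), and the simulation step uses resampling with replacement, which these slides introduce further down as bootstrapping.

```r
set.seed(51)

# Hypothetical sample of 500 made-up observations (stand-in for real data)
x <- rnorm(500, mean = 10, sd = 2)

# Approach 1, statistical theory: spread of a sample average = sd / sqrt(n)
se_theory <- sd(x) / sqrt(length(x))

# Approach 2, simulation: resample the data many times and
# measure how much the average moves around
sim_means <- replicate(2000, mean(sample(x, size = length(x), replace = TRUE)))
se_simulation <- sd(sim_means)

c(theory = se_theory, simulation = se_simulation)  # the two should be close
```

For a simple average the formula is easy to apply. The simulation route matters more for estimates where no tidy formula is available.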

Simulation

We want a sense for how uncertain we should feel about estimates drawn from our sample

Our sample is gss_sm, and we have 2,867 observations

gss_sm
year  id    ballot  age  childs  sibs  degree       race
2016  2830  2       62   2       2     Bachelor     White
2016  848   3       28   1       3     High School  Black
2016  2186  2       32   0       1     High School  White
2016  198   1       49   0       4     Bachelor     Other
2016  985   1       50   1       1     High School  White

How confident are we in the estimate we get from this sample, given its size?

How to simulate

If we take lots of samples of size 20 \(\rightarrow\) uncertainty in a sample of 20

gss_sm %>%
  rep_sample_n(size = 20, reps = 1000)

How to simulate

If we take lots of samples of size 100 \(\rightarrow\) uncertainty in a sample of 100

gss_sm %>%
  rep_sample_n(size = 100, reps = 1000)

How to simulate

So to see how uncertain we should feel about gss_sm, we should take many samples that are the same size as gss_sm

gss_sm %>% 
  rep_sample_n(size = 2867, reps = 1000)

OR

gss_sm %>% 
  rep_sample_n(size = nrow(gss_sm), reps = 1000)

Note

nrow(DATA) tells you how many observations (rows) are in a data object

Problem

If we have a dataset of 2,867 observations and ask R to randomly pick 2,867 observations, we’ll just get a bunch of reshuffled copies of the original dataset, and every copy gives the exact same estimate

gss_sm %>% 
  rep_sample_n(size = nrow(gss_sm), reps = 1000)

Solution: sample with replacement \(\rightarrow\) once we draw an observation it goes back into the dataset, and can be sampled again

gss_sm %>% 
  rep_sample_n(size = nrow(gss_sm), reps = 1000, replace = TRUE)

Note

replace = TRUE allows an observation to be re-sampled
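
The difference is easy to check with base R's `sample()`. This is a minimal sketch on made-up data (the vector `x` is hypothetical, not gss_sm), comparing averages from full-size samples drawn without and with replacement.

```r
set.seed(51)

# Hypothetical stand-in for one column of a dataset: 200 made-up counts
x <- rpois(200, lambda = 2)

# Without replacement: every "sample" is the full dataset reshuffled,
# so every average equals the original average
no_replace <- replicate(1000, mean(sample(x, size = length(x), replace = FALSE)))

# With replacement: some values repeat and others drop out,
# so the averages vary from sample to sample
with_replace <- replicate(1000, mean(sample(x, size = length(x), replace = TRUE)))

length(unique(no_replace))    # 1: all 1,000 averages are identical
length(unique(with_replace))  # many distinct averages
```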

Sampling with replacement

Sampling with and without replacement

If we were sampling 4 of these delicious fruits:

Dataset on fruits
fruits
Mango
Pineapple
Banana
Blackberry

Sampling with and without replacement

Without replacement:

rep_sample_n(food, size = 4, reps = 1)
# A tibble: 4 × 2
# Groups:   replicate [1]
  replicate fruits    
      <int> <chr>     
1         1 Mango     
2         1 Pineapple 
3         1 Blackberry
4         1 Banana    

With replacement:

rep_sample_n(food, size = 4, reps = 1, replace = TRUE)
# A tibble: 4 × 2
# Groups:   replicate [1]
  replicate fruits    
      <int> <chr>     
1         1 Blackberry
2         1 Banana    
3         1 Banana    
4         1 Banana    

Bootstrapping

  • Generating many, same-sized samples with replacement is called bootstrapping
  • Replacement lets us generate samples that randomly differ from ours
  • Use the distribution of bootstrapped samples to quantify uncertainty

Only one sample? Pull yourself up by your bootstraps!
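
The three bullets above can be sketched as a small base R function. `bootstrap_stat()` is a hypothetical helper written for illustration (it is not part of any package used in these slides), and the data are made up.

```r
# Hypothetical helper: bootstrap any statistic of a numeric vector
bootstrap_stat <- function(x, stat = mean, reps = 1000) {
  replicate(reps, {
    # same size as the original data, drawn with replacement
    resample <- sample(x, size = length(x), replace = TRUE)
    stat(resample)
  })
}

set.seed(51)
x <- rpois(300, lambda = 2)      # made-up data, not the GSS
boot_means <- bootstrap_stat(x)  # 1,000 bootstrapped averages
summary(boot_means)              # how much the estimate varies
```

Swapping `stat = mean` for another function (for example `median`) bootstraps a different estimate with the same recipe.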

Back to the kids

How uncertain should we be of our estimate of the avg. number of kids in the US, given that it’s based on our one sample, gss_sm? We can bootstrap:

boot_kids = gss_sm %>% 
  rep_sample_n(size = nrow(gss_sm), reps = 1000, replace = TRUE) %>% 
  summarise(avg_kids = mean(childs, na.rm = TRUE))
boot_kids
replicate  avg_kids
1          1.89
2          1.84
3          1.82
4          1.79
5          1.84
6          1.9
7          1.9
8          1.83
9          1.83
10         1.88
… (1,000 rows in total, one per bootstrap sample; first 10 shown)

The distribution of bootstrapped estimates

Our estimate and how much simulated estimates might vary across bootstrapped samples that look like ours

The red histogram is the distribution of bootstrapped sample estimates \(\rightarrow\) our stand-in for the sampling distribution
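
A plot along these lines can be drawn with base R's `hist()`. This is only a sketch on made-up data (the vector `x` is hypothetical), not the code behind the slide's figure.

```r
set.seed(51)
x <- rpois(500, lambda = 2)  # made-up sample
boot_means <- replicate(1000, mean(sample(x, size = length(x), replace = TRUE)))

# Histogram of the 1,000 bootstrapped averages
h <- hist(boot_means, breaks = 30, col = "red",
          main = "Distribution of bootstrapped estimates",
          xlab = "Average in bootstrap sample")
abline(v = mean(x), lwd = 2)  # the estimate from the original sample
```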

Your turn: Income and household assets

The columns of wealth are dummy variables that tell you whether a household in Honduras 🇭🇳 has a particular asset or not:

Sample from wealth
r1  r3  r4  r4a  r5  r6  r7  r8  r12  r14  r15  r16  r18  ur     ed  q10new_18  q14  fs2  fs8  wf1
1   1   0   1    0   0   0   1   1    1    0    0    0    Urban  7   2          0    0    0    0
1   1   0   1    0   0   1   0   1    1    0    0    0    Rural  3   NA         0    0    0    0
1   1   0   1    0   0   0   0   1    0    1    0    1    Rural  6   1          0    0    0    0
1   1   0   1    0   0   0   0   0    0    0    0    0    Rural  2   1          1    0    0    0
0   0   0   1    0   0   0   0   1    1    0    NA   0    Urban  2   6          0    1    1    0

Your turn: Income and household assets

Using wealth from juanr, pick an asset from the codebook:

  1. What percent of households own that asset?

  2. How uncertain should you be of your estimate? Generate 1,000 bootstraps and plot the distribution.


Quantifying uncertainty

How to quantify uncertainty?

The red histogram is nice, but how can we communicate uncertainty in our estimates in a pithy, more comparable way?

Three approaches:

  • The standard error
  • The confidence interval
  • Statistical significance

The standard error

One way to quantify uncertainty would be to measure how “wide” the distribution of bootstrapped sample estimates is

As we learned so long ago, one way to measure the “spread” of a distribution (i.e., how much a variable varies), is with the standard deviation

The standard deviation of the sampling distribution is called the standard error. The margin of error reported with polls is built from it (typically about 2 standard errors)
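
In symbols (notation introduced here for illustration, not taken from the slides): if \(\hat{\theta}_1, \dots, \hat{\theta}_B\) are the estimates from \(B\) bootstrap samples, the bootstrap standard error is just their standard deviation, which is what `sd()` computes in R:

\[
\widehat{SE} = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\left(\hat{\theta}_b - \bar{\theta}\right)^2},
\qquad
\bar{\theta} = \frac{1}{B}\sum_{b=1}^{B}\hat{\theta}_b
\]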

The standard error

Generate bootstraps:

boot_kids = gss_sm %>% 
  rep_sample_n(size = nrow(gss_sm), reps = 1000, replace = TRUE) %>% 
  summarise(avg_kids = mean(childs, na.rm = TRUE))

Calculate the mean and standard error of the bootstraps:

boot_kids %>% 
  summarise(mean = mean(avg_kids), standard_error = sd(avg_kids))
mean  standard_error
1.85  0.0303

Best guess? About 1.85 kids, +/- 2 standard errors (1.85 - 2 * .03 = 1.79, 1.85 + 2 * .03 = 1.91)

Standard error

This is what’s behind the +/- margin of error you see in news coverage of polls (usually about 2 standard errors)

Varying uncertainty

Standard errors get smaller as sample sizes get larger:

Sample size  Average (truth = 10)  Standard error
10           10.28                 0.46
64           10.23                 0.25
119          9.82                  0.20
173          9.94                  0.15
228          9.97                  0.12
282          10.15                 0.12
337          10.11                 0.12
391          10.16                 0.10
446          10.06                 0.10
500          9.93                  0.09
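
The same pattern shows up in a quick simulation. This is a sketch under assumptions, not the code that produced the table: `boot_se()` is a hypothetical helper, and the data are made-up draws from a normal distribution with a true average of 10.

```r
set.seed(51)

# Bootstrap standard error of the average for one made-up sample of size n
boot_se <- function(n, reps = 1000) {
  x <- rnorm(n, mean = 10, sd = 2)  # hypothetical data; truth = 10
  sd(replicate(reps, mean(sample(x, size = n, replace = TRUE))))
}

sizes <- c(10, 100, 1000)
ses <- sapply(sizes, boot_se)
data.frame(sample_size = sizes, standard_error = round(ses, 3))
```

The standard error shrinks roughly in proportion to 1 / sqrt(n), so 100 times more data buys only about 10 times less uncertainty.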

The confidence interval

The confidence interval

Another way to quantify uncertainty is to look at where most estimates fall

This is the confidence interval: a range around our “best guess” of what we’re trying to estimate

How big to make the interval?

You could report (for example) where the middle 50% of bootstraps fall, or (for example) where the middle 95% of bootstraps fall, but there are tradeoffs!

The tradeoff

  • You are 50% “confident” that avg. number of kids could vary between 1.83 and 1.87. Narrower range! But low confidence!

  • You are 95% “confident” that avg. number of kids could vary between 1.79 and 1.91. Wider range! But higher confidence!
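
The tradeoff can be seen directly by computing both intervals from the same bootstraps. This is a minimal base R sketch on made-up data (the vector `x` is hypothetical), so the endpoints will not match the numbers above.

```r
set.seed(51)
x <- rpois(500, lambda = 2)  # made-up sample
boot_means <- replicate(1000, mean(sample(x, size = length(x), replace = TRUE)))

ci_50 <- quantile(boot_means, c(.25, .75))    # middle 50% of bootstraps
ci_95 <- quantile(boot_means, c(.025, .975))  # middle 95% of bootstraps

ci_50
ci_95
unname(diff(ci_95) / diff(ci_50))  # how many times wider the 95% interval is
```

The 95% interval always contains the 50% interval: covering more of the bootstraps means giving up precision.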

How big to make the interval?

Convention is to look at the middle 95% of the distribution

We can use the quantile() function to find the upper and lower bound of the middle 95%:

boot_kids %>% 
  summarise(low = quantile(avg_kids, .025), # middle 95% means lower bound is .025
            mean = mean(avg_kids), 
            high = quantile(avg_kids, .975)) # middle 95% means upper bound is .975
low   mean  high
1.79  1.85  1.91

The 95% confidence interval for the average number of kids in the US is: (1.79, 1.91)

Mirrors of one another

The standard error and confidence interval are actually telling you the same thing

A 95% confidence interval is roughly equal to the estimate +/- 1.96 \(\times\) standard error

boot_kids %>% 
  summarise(low = quantile(avg_kids, .025),
            mean = mean(avg_kids), 
            high = quantile(avg_kids, .975))
low   mean  high
1.79  1.85  1.91
boot_kids %>% 
  summarise(mean = mean(avg_kids), standard_error = sd(avg_kids)) %>% 
  mutate(low = mean - 1.96 * standard_error, 
         high = mean + 1.96 * standard_error) %>% 
  select(low, mean, high)
low   mean  high
1.79  1.85  1.91
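
As a worked check using only numbers already shown in these slides (mean 1.85, standard error 0.0303), the formula gives the same bounds as the quantiles. The 1.96 comes from the normal distribution, where the middle 95% of values lies within about 1.96 standard deviations of the mean, so the match is close only when the bootstrap distribution is roughly bell-shaped:

\[
1.85 \pm 1.96 \times 0.0303 = 1.85 \pm 0.059 \;\Rightarrow\; (1.79,\ 1.91)
\]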

🚨 Your turn: Wealth data 🚨

Look back at the wealth data

  1. Grab your bootstrapped samples for the asset of your choosing.

  2. Calculate the standard error and the 95% confidence interval of your best guess. Convince yourself the two can be made equivalent.
