On December 19, 2014, the front page of Spanish national newspaper El País read “Catalan public opinion swings toward ‘no’ for independence, says survey”.1
The probability that the tiny difference between ‘No’ and ‘Yes’ is due just to random chance is very high.1
Characterizing Uncertainty
We know from the previous section that even unbiased procedures do not get the “right” answer every time
We also know that our estimates might vary from sample to sample due to random chance
Therefore we want to report our estimate along with our level of uncertainty
Characterizing Uncertainty
With M&Ms, we knew the population parameter
In real life, we do not!
We want to generate an estimate and characterize our uncertainty with a range of possible estimates
Solution: Create a Confidence Interval
A plausible range of values for the population parameter is a confidence interval.
A 95 percent confidence interval is the standard choice
We are 95% confident that the parameter value falls within the range given by the confidence interval
Ways to Estimate
Take advantage of the Central Limit Theorem to estimate using math
Use simulation, bootstrapping
With Math…
\[CI = \bar{x} \pm Z \left( \frac{\sigma}{\sqrt{n}} \right)\]
\(\bar{x}\) is the sample mean,
\(Z\) is the Z-score corresponding to the desired level of confidence
\(\sigma\) is the population standard deviation, and
\(n\) is the sample size
This part here represents the standard error:
\[\left( \frac{\sigma}{\sqrt{n}} \right)\]
Standard deviation of the sampling distribution
Characterizes the spread of the sampling distribution
The bigger this is the bigger the CIs are going to be
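The formula above can be sketched directly in R. The values of the mean, standard deviation, and sample size below are made-up illustration numbers, not from any real data:

```r
# Parametric CI sketch: CI = x_bar +/- Z * (sigma / sqrt(n))
# x_bar, sigma, and n are made-up values for illustration
x_bar <- 0.5            # sample mean
sigma <- 0.2            # population standard deviation (rarely known in practice)
n     <- 100            # sample size
z     <- qnorm(0.975)   # Z-score for a 95% CI, about 1.96
se    <- sigma / sqrt(n)            # standard error
ci    <- c(x_bar - z * se, x_bar + z * se)
ci  # roughly (0.461, 0.539)
```

Note how the standard error term shrinks as n grows, which is what narrows the interval.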
Central Limit Theorem
\[CI = \bar{x} \pm Z \left( \frac{\sigma}{\sqrt{n}} \right)\]
This way of doing things depends on the Central Limit Theorem
As sample size gets bigger, the spread of the sampling distribution gets narrower
The shape of the sampling distribution becomes closer to normal
\[CI = \bar{x} \pm Z \left( \frac{\sigma}{\sqrt{n}} \right)\]
This is therefore a parametric method of calculating the CI. It depends on assumptions about the normality of the distribution.
Bootstrapping
Pulling oneself up by one’s bootstraps …
Use the data we have to estimate the sampling distribution
We call this the bootstrap distribution
This is a nonparametric method
It does not depend on assumptions about normality
Bootstrap Process
Take a bootstrap sample - a random sample taken with replacement from the original sample, of the same size as the original sample
Calculate the bootstrap statistic - a statistic such as mean, median, proportion, slope, etc. computed on the bootstrap samples
Repeat steps (1) and (2) many times to create a bootstrap distribution - a distribution of bootstrap statistics
Calculate the bounds of the XX% confidence interval as the middle XX% of the bootstrap distribution (usually 95 percent confidence interval)
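The four steps above can be sketched in base R. The data vector x here is simulated (a 0/1 variable with an assumed true proportion of 0.15), purely for illustration:

```r
set.seed(1)
x <- rbinom(506, 1, 0.15)   # simulated stand-in for an original sample of 0/1 responses
reps <- 5000

# Steps 1-3: resample with replacement, same size as the original,
# compute the statistic each time, and collect the bootstrap distribution
boot_stats <- replicate(reps, {
  boot_sample <- sample(x, size = length(x), replace = TRUE)  # step 1
  mean(boot_sample)                                           # step 2
})                                                            # step 3

# Step 4: the middle 95% of the bootstrap distribution
ci <- quantile(boot_stats, c(0.025, 0.975))
ci
```

The interval brackets the sample mean of x; the tidymodels code later in these slides does the same thing with a pipeline.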
Russia
What Proportion of Russians believe their country interfered in the 2016 presidential elections in the US?
Pew Research survey
506 subjects
Data available in the openintro package
For this example, we will use data from the openintro package. Install that package before running this code chunk.
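One way russiaData and its 0/1 try_influence variable might be constructed is sketched below. The raw responses here are hypothetical stand-ins, and the actual openintro dataset and column names may differ, so inspect the package’s data before adapting this:

```r
library(dplyr)

# hypothetical raw survey responses standing in for the Pew variable;
# check the actual openintro dataset for the real column name
raw <- tibble::tibble(
  response = c("Yes", "No", "No", "Don't know", "Yes")
)

# recode to 1 = 'Yes', 0 = anything else
russiaDataSketch <- raw |>
  mutate(try_influence = if_else(response == "Yes", 1L, 0L))
```

With this coding, the mean of try_influence is the proportion answering ‘Yes’.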
Now let’s calculate the mean and standard deviation of the try_influence variable…
russiaData |>
  summarize(
    mean = mean(try_influence),
    sd = sd(try_influence)
  )
# A tibble: 1 × 2
mean sd
<dbl> <dbl>
1 0.150 0.358
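For comparison, a CLT-based interval for this proportion can be computed directly from the summary above (p-hat = 0.150, n = 506). This normal-approximation sketch is not part of the bootstrap analysis that follows:

```r
# Normal-approximation 95% CI for a proportion: p_hat +/- Z * sqrt(p_hat(1 - p_hat)/n)
p_hat <- 0.150
n     <- 506
se    <- sqrt(p_hat * (1 - p_hat) / n)   # standard error of a proportion
z     <- qnorm(0.975)
ci_normal <- round(c(p_hat - z * se, p_hat + z * se), 3)
ci_normal  # roughly (0.119, 0.181)
```

The bootstrap interval computed later should land close to this range.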
And finally let’s draw a bar plot…
ggplot(russiaData, aes(x = try_influence)) +
  geom_bar(fill = "steelblue", width = .75) +
  labs(
    title = "Did Russia try to influence the U.S. election?",
    x = "0 = 'No', 1 = 'Yes'",
    y = "Frequency"
  ) +
  theme_minimal()
Bootstrap with tidymodels
Install tidymodels before running this code chunk…
# install.packages("tidymodels")
library(tidymodels)

set.seed(66)

boot_df <- russiaData |>
  # specify the variable of interest
  specify(response = try_influence) |>
  # generate 15000 bootstrap samples
  generate(reps = 15000, type = "bootstrap") |>
  # calculate the mean of each bootstrap sample
  calculate(stat = "mean")

glimpse(boot_df)
# for using these values later
lower_bound <- boot_df |>
  summarize(lower_bound = quantile(stat, 0.025)) |>
  pull()

upper_bound <- boot_df |>
  summarize(upper_bound = quantile(stat, 0.975)) |>
  pull()
Visualize with a histogram
ggplot(data = boot_df, mapping = aes(x = stat)) +
  geom_histogram(binwidth = .01, fill = "steelblue4") +
  geom_vline(
    xintercept = c(lower_bound, upper_bound),
    color = "darkgrey", size = 1, linetype = "dashed"
  ) +
  labs(
    title = "Bootstrap distribution of means",
    subtitle = "and 95% confidence interval",
    x = "Estimate",
    y = "Frequency"
  ) +
  theme_bw()
Interpret the confidence interval
The 95% confidence interval was calculated as (lower_bound, upper_bound). Which of the following is the correct interpretation of this interval?
(a) 95% of the time the percentage of Russians who believe that Russia interfered in the 2016 US elections is between lower_bound and upper_bound.
(b) 95% of all Russians believe that the chance Russia interfered in the 2016 US elections is between lower_bound and upper_bound.
(c) We are 95% confident that the proportion of Russians who believe that Russia interfered in the 2016 US election is between lower_bound and upper_bound.
(d) We are 95% confident that the proportion of Russians who supported interfering in the 2016 US elections is between lower_bound and upper_bound.
Your Turn!
Change the reps argument in the generate() function to 1000. What happens to the width of the confidence interval?
Change the reps argument in the generate() function to 5000. What happens to the width of the confidence interval?
Change the reps argument in the generate() function to 10000. What happens to the width of the confidence interval?
How does the width of the confidence interval change as the number of bootstrap samples increases?
How would you interpret this finding?
Bias vs Precision
A procedure is unbiased if it generates the “right” answer, on average
Precision refers to variability: procedures with less sampling variability will be more precise
all else equal, a greater sample size will increase precision
When we increase the size of the original sample, we increase precision, and the confidence interval narrows
Increasing the number of bootstrap reps does not narrow the interval; it only makes the estimated bounds more stable
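A small simulation illustrates the point. The data are simulated 0/1 responses with an assumed true proportion of 0.15; bootstrap CIs from a larger original sample come out narrower, while adding reps alone does not shrink them:

```r
set.seed(42)

# width of a 95% bootstrap CI for the mean of x
ci_width <- function(x, reps = 2000) {
  stats <- replicate(reps, mean(sample(x, length(x), replace = TRUE)))
  unname(diff(quantile(stats, c(0.025, 0.975))))
}

small_sample <- rbinom(100,  1, 0.15)   # n = 100
large_sample <- rbinom(1000, 1, 0.15)   # n = 1000

ci_width(small_sample)  # wider
ci_width(large_sample)  # narrower: precision comes from n, not from reps
```

Standard-error logic predicts the larger sample’s interval should be roughly 1/sqrt(10) as wide.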
Why did we do these simulations?
They provide a foundation for statistical inference and for characterizing uncertainty in our estimates
The best research designs will try to minimize bias and maximize precision, or strike a good balance between the two