library(vdemdata)
library(tidyverse)
run <- isTRUE(params$completed)
Lab 2
Wrangling and Exploring Democracy Data
Open the Lab 2 project in the course space on Posit Cloud to get started.
Overview
In this lab, you will practice the main skills from Modules 2.1 and 2.2. You will:
- Load and wrangle V-Dem data
- Summarize democracy indicators by region
- Create a column chart from summarized data
- Explore categorical regime data with geom_bar()
- Compare regime proportions across regions
- Write brief interpretations of your results
Fill in each ??? with the correct code. You are encouraged to have the Week 2 lecture materials and the V-Dem codebook open while completing the lab.
Getting Started
Load the required packages.
Installing vdemdata locally
If you are working on your own computer and do not have vdemdata installed, install pak and then install vdemdata from GitHub:
install.packages("pak")
pak::pkg_install("vdeminstitute/vdemdata")
Part 1: Wrangle Democracy Data (30 points)
In this part, you will build a clean dataset from V-Dem using filter(), select(), and mutate().
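As a warm-up, here is a minimal sketch of how the three verbs chain together. It runs on a small made-up data frame, not on V-Dem: the object names (toy, toy_clean), the column names, and the region labels are all hypothetical and chosen only for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for a larger country-year dataset
toy <- data.frame(
  country_name = c("A", "B", "C", "D"),
  year         = c(2000, 2001, 2001, 2001),
  score        = c(0.2, 0.8, 0.5, 0.9),
  region_code  = c(1, 2, 1, 2)
)

toy_clean <- toy |>
  filter(year == 2001) |>             # keep only the rows for one year
  select(
    country = country_name,           # select() can rename as it selects
    score,
    region = region_code
  ) |>
  mutate(
    region = case_match(region,       # recode numeric codes into labels
      1 ~ "North",
      2 ~ "South")
  )

toy_clean
```

The same pattern applies in Step 1: filter() narrows the rows, select() picks and renames the columns, and mutate() overwrites the numeric region codes with readable labels.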
Step 1: Create a Wrangled Dataset (20 pts)
Filter the data to one year, select the variables you need, and recode the region variable into readable labels.
democracy <- vdem |>
  ???(year == ???) |>
  ???(
    country = ???,
    year,
    polyarchy = ???,
    libdem = ???,
    gdp_pc = ???,
    region = ???
  ) |>
  ???(
    region = case_match(region,
      1 ~ "Eastern Europe",
      2 ~ "Latin America",
      3 ~ "Middle East",
      4 ~ "Africa",
      5 ~ "The West",
      6 ~ "Asia")
  )
Step 2: Examine the Data (5 pts)
Run glimpse() on your new data frame.
glimpse(???)
Step 3: Brief Response (5 pts)
Write 2-3 sentences answering these questions:
- What year did you choose?
- How many rows and columns does your wrangled dataset have?
- Why is it useful to recode region before plotting?
YOUR RESPONSE HERE
Part 2: Summarize and Visualize by Region (35 points)
In this part, you will practice group_by(), summarize(), arrange(), and geom_col().
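The sketch below shows the general shape of a grouped summary followed by a column chart. It uses a made-up data frame; the names toy, toy_summary, group, and value are hypothetical and do not correspond to V-Dem variables.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data with one missing value
toy <- data.frame(
  group = c("X", "X", "Y", "Y", "Z"),
  value = c(0.2, 0.4, 0.9, NA, 0.5)
)

toy_summary <- toy |>
  group_by(group) |>                               # one group per category
  summarize(value = mean(value, na.rm = TRUE)) |>  # na.rm = TRUE skips the NA
  arrange(desc(value))                             # highest mean first

toy_summary

# geom_col() plots the summarized values as given;
# reorder(group, -value) sorts the bars from highest to lowest
p <- ggplot(toy_summary, aes(x = reorder(group, -value), y = value)) +
  geom_col() +
  labs(x = "Group", y = "Mean value")

p
```

Without na.rm = TRUE, the mean for any group containing a missing value would itself be NA, which is why the lab template includes that argument.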
Step 1: Summarize the Data (15 pts)
Create a regional summary dataset. Use mean() for polyarchy and gdp_pc, and sort from highest to lowest polyarchy.
dem_summary <- democracy |>
  ???(region) |>
  ???(
    polyarchy = ???(polyarchy, na.rm = TRUE),
    gdp_pc = ???(gdp_pc, na.rm = TRUE)
  ) |>
  ???(desc(polyarchy))
Step 2: Print the Summary (5 pts)
Print your summarized data frame.
???
Step 3: Create a Column Chart (10 pts)
Create a column chart of average polyarchy by region. Reorder the bars from highest to lowest.
# Write your code here
Step 4: Brief Interpretation (5 pts)
Write 2-3 sentences describing what you see.
- Which region has the highest average polyarchy score?
- Which region has the lowest?
- Does the ranking make sense to you?
YOUR RESPONSE HERE
Part 3: Explore Categorical Regime Data (35 points)
In this part, you will work with V-Dem’s v2x_regime variable and use count() and geom_bar().
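To illustrate how count() and geom_bar() relate, here is a minimal sketch on a made-up categorical variable. The names toy, type, and type_counts are hypothetical and are not part of the V-Dem data.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: one row per observation
toy <- data.frame(
  type = c("a", "b", "a", "c", "a", "b")
)

# count() returns one row per category, with the frequency in column n
type_counts <- toy |>
  count(type)

type_counts

# geom_bar() does the same counting internally, so it takes the raw rows
# and needs only an x aesthetic
p <- ggplot(toy, aes(x = type)) +
  geom_bar()

p
```

This is the key difference from geom_col(), which expects data that has already been summarized and therefore needs both x and y.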
Step 1: Build a Regime Dataset (15 pts)
Filter the data to 2022, select country, regime, and region, and recode both region and regime.
vdem2022_regime <- vdem |>
  ???(year == ???) |>
  ???(
    country = ???,
    regime = ???,
    region = ???
  ) |>
  ???(
    region = case_match(region,
      1 ~ "Eastern Europe",
      2 ~ "Latin America",
      3 ~ "Middle East",
      4 ~ "Africa",
      5 ~ "The West",
      6 ~ "Asia"),
    regime = case_match(regime,
      0 ~ "Closed Autocracy",
      1 ~ "Electoral Autocracy",
      2 ~ "Electoral Democracy",
      3 ~ "Liberal Democracy")
  )
Step 2: Count Regime Types (5 pts)
Create a frequency table of regime types.
vdem2022_regime |>
  ???(regime)
Step 3: Create a Bar Chart of Regime Types (5 pts)
Use geom_bar() to plot the distribution of regime types.
# Write your code here
Step 4: Compare Regions with Proportions (5 pts)
Now create a second bar chart comparing regime types across regions using position = "fill".
# Write your code here
Step 5: Brief Interpretation (5 pts)
Write 2-3 sentences describing what you see.
- Which regime type is most common overall?
- Which region appears to have the largest share of liberal democracies?
- Why is position = "fill" useful here?
YOUR RESPONSE HERE
- Replace “YOUR NAME HERE” at the top with your actual name
- Make sure all code chunks run without errors
- Change completed: false to completed: true
- Render to HTML to check your work
- Change format: html to format: pdf
- Submit the PDF to Blackboard
Hints
Only look at these if you are stuck.
Hint 1 - Wrangling structure
new_data <- vdem |>
  filter(year == 2015) |>
  select(
    country = country_name,
    year,
    polyarchy = v2x_polyarchy
  ) |>
  mutate(...)
Hint 2 - Summarizing structure
summary_data <- democracy |>
  group_by(region) |>
  summarize(
    polyarchy = mean(polyarchy, na.rm = TRUE)
  ) |>
  arrange(desc(polyarchy))
Hint 3 - geom_col()
ggplot(dem_summary, aes(x = reorder(region, -polyarchy), y = polyarchy)) +
  geom_col()
Hint 4 - geom_bar()
ggplot(vdem2022_regime, aes(x = regime)) +
  geom_bar()
Hint 5 - Proportional bar chart
ggplot(vdem2022_regime, aes(x = region, fill = regime)) +
  geom_bar(position = "fill")
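If it helps to see the numbers behind a proportional bar chart, the sketch below computes within-group shares by hand on a made-up data frame. The object toy_props and the values in toy are hypothetical; this is an illustration of the idea, not a required step in the lab.

```r
library(dplyr)

# Hypothetical toy data: regions with different numbers of countries
toy <- data.frame(
  region = c("North", "North", "North", "North", "South", "South"),
  regime = c("Democracy", "Democracy", "Democracy", "Autocracy",
             "Democracy", "Autocracy")
)

toy_props <- toy |>
  count(region, regime) |>       # raw counts per region-regime pair
  group_by(region) |>
  mutate(prop = n / sum(n)) |>   # share of each regime within its region
  ungroup()

toy_props
```

Each region's shares sum to 1, which is what stretching every bar to the same height with position = "fill" represents. That makes regions with different numbers of countries directly comparable.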