Lab 1
Your First Data Visualizations
Open the Lab 1 project in the course space on Posit Cloud to get started.
Overview
In this lab, you will practice adapting code from the lectures to work with a new dataset. You will:
- Load and explore new datasets
- Create a bar chart by adapting code from Lesson 1.2
- Create a line chart
- Write brief interpretations of your visualizations
- Render your document to PDF and submit
Fill in each ??? with the correct code. You are encouraged to have the lecture materials open while completing this lab.
Getting Started
Load the tidyverse package (make sure that you have it installed).
library(tidyverse)
The Data
Today we’ll work with data from the Gapminder project, which tracks development indicators across countries. We have two files:
- gapminder_summary.csv - Average values by continent
- gapminder_full.csv - Data for 142 countries from 1952 to 2007
Part 1: Bar Chart (50 pts)
Step 1: Load and Explore the Data (10 pts)
First, load the summary dataset and use glimpse() to see what variables are available.
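As a rough illustration of what glimpse() reports, here is a sketch on a small made-up tibble. The name pets and its columns are hypothetical and are not part of the lab data.

```r
library(tidyverse)

# Hypothetical toy data for illustration only (not the lab dataset)
pets <- tibble(
  name      = c("Rex", "Milo", "Luna"),
  species   = c("dog", "cat", "cat"),
  weight_kg = c(30.5, 4.2, 3.8)
)

# glimpse() prints the number of rows and columns, followed by one line
# per variable showing its name, its type, and its first few values
glimpse(pets)
```

The lab dataset should produce the same kind of summary, just with its own variables and row count.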
# Load the data (this is done for you)
gapminder_summary <- read_csv("gapminder_summary.csv")
Rows: 5 Columns: 5
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): continent
dbl (4): avg_life_exp, avg_gdp_cap, total_pop, n_countries
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Use glimpse() to explore the data
Question: What variables are in this dataset? How many columns are there? How many rows?
YOUR ANSWER HERE
Step 2: Create a Bar Chart (20 pts)
Using the bar chart code from Lesson 1.2 as a template, create a bar chart using one of the variables in the dataframe. Your options are avg_life_exp, avg_gdp_cap, total_pop, or n_countries.
Think about what needs to change:
- What is the name of your data frame?
- What variable should go on the x-axis?
- What variable should go on the y-axis?
- What should your title and axis labels say?
# Write your bar chart code here
Step 3: Interpret Your Chart (10 pts)
Write 2-3 sentences describing what you see. Which continent has the highest value of the variable you chose to visualize? Which has the lowest?
YOUR INTERPRETATION HERE
Part 2: Line Chart (50 pts)
Step 1: Load and Explore the Data (10 pts)
Load the full Gapminder dataset and use glimpse() to explore it.
gapminder <- read_csv("gapminder_full.csv")
Rows: 1704 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): country, continent
dbl (4): year, population, life_exp, gdp_cap
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Use glimpse() to explore the data
Question: What variables are in this dataset? What years are covered? What does each row represent?
YOUR ANSWER HERE
Step 2: Filter the Data (10 pts)
Choose three countries you’re interested in and filter the data to include only those countries. Replace the ___ placeholders with real country names. Spelling and capitalization must exactly match the values in the country column (glimpse() shows only the first few of them).
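To show how filtering with %in% behaves, here is a minimal sketch on a made-up tibble. The names cities, city, temp_c, and cities_subset are hypothetical and are not part of the lab data.

```r
library(tidyverse)

# Hypothetical toy data for illustration only (not the lab dataset)
cities <- tibble(
  city   = c("Lima", "Oslo", "Cairo", "Hanoi"),
  temp_c = c(19, 6, 22, 24)
)

# unique() lists every distinct value, which helps when checking spelling
unique(cities$city)

# Keep only the rows whose city is one of the listed values.
# Matching is exact: "oslo" (lowercase) would not match "Oslo".
cities_subset <- cities |>
  filter(city %in% c("Lima", "Oslo"))

cities_subset
```

The same pattern should carry over to the lab data once the placeholders are replaced with country names that appear in the country column.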
gapminder_ctrs <- gapminder |>
  filter(country %in% c("___", "___", "___"))
Step 3: Create a Line Chart (20 pts)
Using the line chart code from Lesson 1.2 as a template, create a line chart showing how life expectancy (life_exp) has changed over time for your three countries. Use year on the x-axis and life_exp on the y-axis.
# Write your line chart code here
Step 4: Interpret Your Chart (10 pts)
Write 2-3 sentences describing what you see. How has life expectancy changed over time in your chosen countries? Are there differences across them?
YOUR INTERPRETATION HERE
Submission
- Replace “YOUR NAME HERE” at the top with your actual name
- Make sure all your code chunks run without errors
- Change format: html to format: pdf in the YAML header
- Click “Render” to create your PDF
- Submit the PDF to Blackboard
Hints
Only look at these if you’re stuck!
Hint 1 - Bar chart structure:
ggplot(data_name, aes(x = ___, y = ___)) +
  geom_col() +
  labs(title = "___", x = "___", y = "___")
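For reference, the template above filled in for a small made-up data frame might look like the sketch below. The names fruit_sales, fruit, units_sold, and fruit_plot are hypothetical, and this is not the lab dataset or the lab answer.

```r
library(tidyverse)

# Hypothetical toy data for illustration only (not the lab dataset)
fruit_sales <- tibble(
  fruit      = c("Apple", "Banana", "Cherry"),
  units_sold = c(120, 95, 60)
)

# One bar per fruit; bar height comes from the units_sold column
fruit_plot <- ggplot(fruit_sales, aes(x = fruit, y = units_sold)) +
  geom_col() +
  labs(title = "Units Sold by Fruit", x = "Fruit", y = "Units sold")

fruit_plot
```

The categorical variable goes on the x-axis and the numeric variable on the y-axis, which is the same arrangement the lab's bar chart calls for.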
Hint 2 - Line chart structure:
ggplot(data_name, aes(x = year, y = life_exp, color = country)) +
  geom_line(linewidth = 1)
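The same structure on a small made-up data frame might look like the sketch below. The names plant_growth, week, plant, height_cm, and growth_plot are hypothetical, the labs() call is an optional addition, and this is not the lab dataset.

```r
library(tidyverse)

# Hypothetical toy data for illustration only (not the lab dataset):
# two plants measured once a week for four weeks
plant_growth <- tibble(
  week      = rep(1:4, times = 2),
  plant     = rep(c("Fern", "Ivy"), each = 4),
  height_cm = c(5, 7, 10, 12, 4, 8, 13, 19)
)

# Mapping color to plant draws one line per plant
growth_plot <- ggplot(plant_growth, aes(x = week, y = height_cm, color = plant)) +
  geom_line(linewidth = 1) +
  labs(title = "Plant Height Over Time", x = "Week", y = "Height (cm)")

growth_plot
```

Without the color mapping, ggplot2 would connect all the points as a single line, so the grouping variable is what separates the series.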
Hint 3 - Variable names matter: Make sure the variable names in your code match exactly what you saw in glimpse(). R is case-sensitive!
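A quick way to check exact variable names is names(). The sketch below uses a made-up tibble, and the names scores, student, and Score are hypothetical.

```r
library(tidyverse)

# Hypothetical toy data for illustration only (not the lab dataset)
scores <- tibble(
  student = c("Ana", "Ben"),
  Score   = c(90, 85)
)

# names() returns the column names exactly as R stores them
names(scores)

# Capitalization matters when checking for a column
"Score" %in% names(scores)   # TRUE
"score" %in% names(scores)   # FALSE
```

If a plot or filter fails with an "object not found" error, comparing the name in your code against the output of names() is a reasonable first check.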