April 23, 2024
Goal: Estimate the democracy score (\(\hat{Y}_{i}\)) of a country given its level of GDP per capita (\(X_{i}\)).
Or: Estimate the relationship between GDP per capita and democracy.
Step 1: Specify model
Step 2: Set model fitting engine
Step 3: Fit model & estimate parameters
… using formula syntax
Step 4: Tidy things up…
\[\widehat{\text{Democracy}}_{i} = 0.13 + 0.12 \times \text{loggdppc}_{i}\]
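To read the fitted equation, plug in a value of log GDP per capita. The sketch below is a minimal Python illustration, not the code used to fit the model; the function name and the input value of 5 are hypothetical and chosen only to show the arithmetic.

```python
# Coefficients copied from the fitted equation above.
INTERCEPT = 0.13
SLOPE = 0.12


def predict_democracy(loggdppc):
    """Predicted democracy score for a given log GDP per capita."""
    return INTERCEPT + SLOPE * loggdppc


# Hypothetical input value, not taken from the data.
example_prediction = predict_democracy(5.0)
print(round(example_prediction, 2))
```

With these coefficients, a one-unit increase in loggdppc is associated with a 0.12 increase in the predicted democracy score, and the hypothetical input of 5 gives 0.13 + 0.12 × 5 = 0.73.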
How do we get the “best” values for the slope and intercept?
Residual for each point is: \(e_i = y_i - \hat{y}_i\)
Least squares regression line minimizes \(\sum_{i = 1}^n e_i^2\).
Why not take absolute value?
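The residual and sum-of-squares definitions above can be sketched in a few lines of Python. The data and the candidate line here are hypothetical, made up only to show how each residual is computed and then squared and summed.

```python
def residuals(x, y, a, b):
    """Residuals e_i = y_i - y_hat_i for the line y_hat = a + b * x."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def sum_squared_residuals(x, y, a, b):
    """Sum of e_i^2, the quantity that least squares minimizes."""
    return sum(e ** 2 for e in residuals(x, y, a, b))


# Hypothetical toy data (not from the slides).
x = [1, 2, 3, 4]
y = [2, 3, 5, 4]

# Candidate line: y_hat = 1 + 1 * x
e = residuals(x, y, a=1, b=1)
ssr = sum_squared_residuals(x, y, a=1, b=1)
print("residuals:", e)
print("sum of squared residuals:", ssr)
```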
What should the slope and intercept be?
\(\hat{Y} = 0 + 1*X\)
What is the sum of squared residuals?
What is the sum of squared residuals for \(\hat{Y} = 0 + 0*X\)?
What is the sum of squared residuals for \(\hat{Y} = 0 + 2*X\)?
What is the sum of squared residuals for \(\hat{Y} = 0 + (-1)*X\)?
Sum of Squared Residuals as function of possible values of \(b\)
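The slope comparison above can be reproduced numerically. The points plotted on the slide are not in this text, so the sketch below assumes three hypothetical points that lie exactly on the line \(\hat{Y} = 0 + 1*X\); the intercept is held at \(a = 0\), as in the example.

```python
# Hypothetical data: three points lying exactly on y = x (an assumption).
x = [1, 2, 3]
y = [1, 2, 3]


def ssr_for_slope(b, a=0):
    """Sum of squared residuals for the line y_hat = a + b * x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


# The four slopes considered in the example.
candidate_slopes = [-1, 0, 1, 2]
ssr_by_slope = {b: ssr_for_slope(b) for b in candidate_slopes}

# The slope with the smallest sum of squared residuals.
best_slope = min(ssr_by_slope, key=ssr_by_slope.get)

for b, ssr in ssr_by_slope.items():
    print(f"b = {b:>2}: SSR = {ssr}")
print("best slope:", best_slope)
```

For these assumed points the sum of squared residuals is 56 at \(b = -1\), 14 at \(b = 0\), 0 at \(b = 1\), and 14 at \(b = 2\), so the curve bottoms out at \(b = 1\).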
When we estimate a least squares regression, we are looking for the line that minimizes the sum of squared residuals.
In the simple example, I set \(a=0\) to make it easier. It is more complicated to search for the combination of \(a\) and \(b\) that minimizes the sum of squared residuals, but the basic idea is the same.
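Searching over both \(a\) and \(b\) follows the same logic, just in two dimensions. The sketch below is a brute-force grid search on hypothetical data generated from \(y = 1 + 2x\); it is meant only to illustrate the idea and is not how regression software actually finds the estimates.

```python
from itertools import product

# Hypothetical data that lie exactly on y = 1 + 2x (an assumption).
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]


def ssr(a, b):
    """Sum of squared residuals for the line y_hat = a + b * x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


# Candidate values from -3.0 to 3.0 in steps of 0.5, used for both a and b.
grid = [i / 2 for i in range(-6, 7)]

# Try every (a, b) pair and keep the one with the smallest SSR.
best_a, best_b = min(product(grid, grid), key=lambda ab: ssr(*ab))
print("best a:", best_a, "best b:", best_b, "SSR:", ssr(best_a, best_b))
```

A grid search can only find values that are on the grid, which is one reason an exact solution is preferable.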
There is a way to solve for this analytically for linear regression (i.e., by doing math…)
– They made us do this in grad school…
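For reference, the standard closed-form solution for simple linear regression is \(b = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i}(x_i - \bar{x})^2}\) and \(a = \bar{y} - b\bar{x}\). The sketch below applies these formulas to hypothetical toy data; the function name is made up for illustration.

```python
def least_squares_fit(x, y):
    """Closed-form least squares estimates (a, b) for y_hat = a + b * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of cross-products of deviations from the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: sum of squared deviations of x from its mean.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical toy data (not from the slides).
x = [1, 2, 3, 4]
y = [2, 3, 5, 4]

a_hat, b_hat = least_squares_fit(x, y)
print(f"intercept = {a_hat:.2f}, slope = {b_hat:.2f}")
```

For these four points the formulas give a slope of 0.8 and an intercept of 1.5.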
Are democracies less corrupt?
V-Dem includes a Political Corruption Index, which aggregates corruption in a number of spheres (see codebook for details).
The variable name is v2x_corr; lower values mean less corruption.
See starter code HERE