graph LR
A(Two<br>Variables) --> B(Scatterplot) --> C(Covariance) --> D(Correlation) --> E(Regression<br>Coefficient)
POP88162 Introduction to Quantitative Research Methods
Department of Political Science, Trinity College Dublin
Extra
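Use the ^ operator in R to raise a given number to any power (exponentiate).
The exp() function raises \(e\) to a given power.
Use log() to calculate a logarithm given a number and a base.
Exponentiation and the matching logarithm in R:
exp(b) and log(x, base = exp(1))
2 ^ b and log2(x)
10 ^ b and log10(x)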
\(5^2\)
Natural logarithm: \(\log_e 0.001\), where \(e \approx 2.71828\)
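As a quick check of how these functions relate, here is a minimal sketch in R. The values of `b` and `x` are made up for illustration and are not taken from the course data.

```r
# Hypothetical example values (not from the slides' data)
b <- 3
x <- exp(b)                  # e raised to the power b

log(x, base = exp(1))        # natural logarithm, recovers b
log(x)                       # same result: log() defaults to base e

log2(2 ^ b)                  # base-2 logarithm undoes 2 ^ b
log10(10 ^ b)                # base-10 logarithm undoes 10 ^ b

square <- 5 ^ 2              # 25
nat_log <- log(0.001)        # natural logarithm of 0.001, roughly -6.91
```

Each logarithm undoes the exponentiation with the same base, which is why the pairs return `b` again.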
In R, covariance and correlation are calculated with the cov() and cor() functions, respectively.
Let’s work out the test statistic for the correlation between regime longevity and GDP. \[t = \frac{r}{\sqrt{(1 - r^2)/(n - 2)}} = \frac{0.3667}{\sqrt{(1 - 0.1345)/(175 - 2)}} = \frac{0.3667}{0.0707} \approx 5.18\]
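The sketch below illustrates both steps in R. The vectors `x` and `y` are toy values invented for illustration. The values of `r` and `n` are the correlation and sample size reported in the cor.test() output in these slides (df = 173 implies n = 175).

```r
# Toy vectors, made up purely for illustration
x <- c(2, 4, 6, 8, 10)
y <- c(1, 3, 2, 5, 4)

cov_xy <- cov(x, y)                    # sample covariance: 4
r_xy <- cor(x, y)                      # Pearson correlation: 0.8
r_manual <- cov_xy / (sd(x) * sd(y))   # covariance rescaled by both standard deviations

# Test statistic from the reported correlation and sample size
r <- 0.3667133
n <- 175
t_stat <- r / sqrt((1 - r^2) / (n - 2))
t_stat                                 # roughly 5.18
```

Keeping more decimal places in the denominator matters here: rounding it to 0.07 pushes the result up to about 5.2.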
We can then find an associated two-tail \(p\)-value:
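One way to do this in R, sketched here under the assumption that the test statistic follows a \(t\)-distribution with \(n - 2\) degrees of freedom, is to use pt(). The variable names are illustrative.

```r
# Test statistic and degrees of freedom for the correlation (n = 175)
t_stat <- 5.1845
df <- 175 - 2

# Two-tail p-value: probability in the upper tail beyond |t|, doubled
p_value <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)
p_value
```

The result should be very close to the p-value that cor.test() reports for the same pair of variables.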
The same test can be run with the cor.test() function:
Pearson's product-moment correlation
data: democracy_gdp_2020$democracy_duration and democracy_gdp_2020$gdp_per_capita
t = 5.1845, df = 173, p-value = 5.995e-07
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.2309328 0.4884833
sample estimates:
cor
0.3667133
Pearson's product-moment correlation
data: log(democracy_gdp_2020$democracy_duration) and log(democracy_gdp_2020$gdp_per_capita)
t = 6.0222, df = 173, p-value = 1.005e-08
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.2855905 0.5317990
sample estimates:
cor
0.416297
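The two outputs above have the form returned by cor.test() for the raw and the log-transformed variables. The course data set is not reproduced here, so the following sketch uses simulated stand-in data. The data frame `sim` and its values are hypothetical, and only the column names mirror the output above.

```r
set.seed(42)  # for reproducibility

# Simulated stand-in data; values are made up, not the course data
n <- 175
sim <- data.frame(gdp_per_capita = rlnorm(n, meanlog = 9, sdlog = 1))
sim$democracy_duration <- rlnorm(
  n,
  meanlog = 0.3 * log(sim$gdp_per_capita),
  sdlog = 1
)

# Correlation test on the raw and on the log-transformed variables
raw_test <- cor.test(sim$democracy_duration, sim$gdp_per_capita)
log_test <- cor.test(log(sim$democracy_duration), log(sim$gdp_per_capita))

raw_test$estimate    # correlation coefficient
raw_test$statistic   # t statistic
raw_test$parameter   # degrees of freedom (n - 2)
raw_test$p.value     # two-tail p-value
log_test$conf.int    # 95 percent confidence interval
```

The numbers will differ from the slides because the data are simulated. The point is only to show which components of the result correspond to which lines of the printed output.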
While interesting, the proportion of variation given by \(r^2\) is not a very intuitive measure.
It says nothing about the substantive importance or the size of this relationship.
The slope of the regression line (aka the regression coefficient) is the most common focus when analysing relationships between quantitative variables.
The regression line minimises the sum of squared vertical distances between the data points and itself.
It can be expressed as the covariance of the two variables divided by the variance (the squared standard deviation) of the predictor \(X\): \[\beta_X = \frac{cov(X, Y)}{\sigma^2_X}\]
Intuitively, this number tells us how much \(Y\) changes, on average, as \(X\) increases by one unit.
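The formula can be checked against R's lm() function. This is a minimal sketch on simulated data; the variables and the true slope of 0.5 are assumptions made for illustration.

```r
set.seed(1)  # for reproducibility

# Simulated data with a known linear relationship (hypothetical values)
x <- rnorm(100, mean = 10, sd = 2)
y <- 3 + 0.5 * x + rnorm(100)

# Slope as covariance divided by the variance of the predictor
slope_manual <- cov(x, y) / var(x)

# Slope estimated by ordinary least squares
fit <- lm(y ~ x)
slope_lm <- unname(coef(fit)["x"])

all.equal(slope_manual, slope_lm)  # TRUE: the two approaches agree
```

Both estimates should land near 0.5, meaning \(Y\) increases by about half a unit, on average, for each one-unit increase in \(X\).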