1 Introduction

Why are some countries more democratic than others? Existing accounts assume that improvements in democracy is driven mostly by economic development (GDP), coalitions, inequality, or resource abundance which seemingly eliminates the need to study this question. I challenge this view by arguing that institutions, particularly education, plays a more meaningful role in promoting democracy.

# load libraries
library(googleVis)
library(dplyr)
library(tidyr)
library(ggplot2)
library(plotly)
library(knitr)
library(plm)
library(MASS)
library(lmtest)

df <- read.csv("https://raw.githubusercontent.com/afogarty85/replications/master/Education%20and%20Democracy/education_democracy.csv")

summary(df)

##         country          year           tyr               pri         
##  Afghanistan:  39   Min.   :1820   Min.   : 0.0049   Min.   :  0.000  
##  Albania    :  39   1st Qu.:1865   1st Qu.: 0.3850   1st Qu.:  2.746  
##  Algeria    :  39   Median :1915   Median : 2.1626   Median : 32.279  
##  Argentina  :  39   Mean   :1915   Mean   : 3.2381   Mean   : 42.654  
##  Australia  :  39   3rd Qu.:1965   3rd Qu.: 5.2344   3rd Qu.: 85.000  
##  Austria    :  39   Max.   :2010   Max.   :13.2446   Max.   :100.000  
##  (Other)    :3861                  NA's   :1050                       
##       sec                ter                democ            autoc       
##  Min.   :  0.0000   Min.   :  0.00000   Min.   : 0.000   Min.   : 0.000  
##  1st Qu.:  0.0000   1st Qu.:  0.00000   1st Qu.: 0.000   1st Qu.: 0.000  
##  Median :  0.7481   Median :  0.09498   Median : 2.000   Median : 3.000  
##  Mean   : 14.3694   Mean   :  5.06813   Mean   : 3.953   Mean   : 3.641  
##  3rd Qu.: 14.5832   3rd Qu.:  2.00000   3rd Qu.: 8.000   3rd Qu.: 7.000  
##  Max.   :100.0000   Max.   :100.00000   Max.   :10.000   Max.   :10.000  
##                                         NA's   :1720     NA's   :1720    
##       gdp              region         population             area         
##  Min.   :  231.9   Min.   : 1.000   Min.   :1.520e+05   Min.   :    2040  
##  1st Qu.: 1643.2   1st Qu.: 2.000   1st Qu.:3.077e+06   1st Qu.:   92120  
##  Median : 3251.2   Median : 4.000   Median :7.769e+06   Median :  301340  
##  Mean   : 5604.3   Mean   : 3.968   Mean   :3.226e+07   Mean   : 1275891  
##  3rd Qu.: 6497.8   3rd Qu.: 5.000   3rd Qu.:2.397e+07   3rd Qu.:  799380  
##  Max.   :80413.1   Max.   :10.000   Max.   :1.310e+09   Max.   :22100000  
##  NA's   :2341      NA's   :1720     NA's   :2025        NA's   :3038      
##     democavg         autocavg    
##  Min.   : 0.000   Min.   :0.000  
##  1st Qu.: 1.667   1st Qu.:2.381  
##  Median : 2.882   Median :3.875  
##  Mean   : 3.746   Mean   :3.739  
##  3rd Qu.: 5.545   3rd Qu.:5.118  
##  Max.   :10.000   Max.   :8.778  
##

2 Data

In this analysis, we are going to examine the effect of education on levels of democracy across 105 countries from 1820 to 2010. The dataset used in this analysis stitches together three different datasets, of which the reader can learn more about in the sources section below. A brief description of the variables is as follows:

country - country name
year - year
tyr - total years of schooling, population aged 15-64
pri - primary school adjusted enrollment ratio %
sec - secondary school adjusted enrollment ratio %
ter - tertiary school adjusted enrollment ratio %
democ - polity democracy measure; 0 (min), 10 (max)
autoc - polity autocracy measure; 0 (min), 10 (max)
gdp - GDP in year 2000 international dollars
region - region of the world; 1) Eastern Europe and post Soviet Union (including Central Asia); 2) Latin America (including Cuba, Haiti & Dominican Republic); 3) North Africa & Middle East (including Israel, Turkey and Cyprus); 4) Subsaharan Africa; 5) Western Europe & North America (including Australia & New Zealand); 6) East Asia (Including Japan & Mongolia); 7) Southeast Asia; 8) South Asia; 9; The Pacific (excluding Australia & New Zealand); 10) The Caribbean
population - population
area - size of country

2.1 Geographic Coverage of our Data

The plot below shows the geographic extent of our dataset. Specifically, it shows the average number of years of education that citizens (age 15-64) acquired from school in 2010.

htmltools::includeHTML("worldplot2.html")

Following a similar pattern, the plot below shows each country’s polity democracy score in 2010. What these two graphs tell us is that the intensity of the color green tends to covary across both variables: as citizens acquire more education, the level of a country’s democracy increases.

htmltools::includeHTML("worldplot3.html")

2.2 Temporal Coverage of our Data

Since we are evaluating time series cross-sectional data, we might also want to better understand the temporal coverage of our data.

kable(df %>%
        summarise(distinct_countries = n_distinct(country),
                  distinct_years = n_distinct(year),
                  min_year = min(year),
                  max_year = max(year)))

distinct_countries	distinct_years	min_year	max_year
105	39	1820	2010

2.3 Dependent Variable

The dependent variable is democ, or the level of democracy within a country at time \(t\). A brief look at our dependent variable shows that within our sample, the level of democracy has been increasing over time.

df_bar <- df %>% group_by(year) %>% summarize_at(vars(democ), list(mean), na.rm = TRUE)

plot_ly(data = df_bar, type = 'bar') %>%
  add_trace(x = ~year, y = ~democ, name  = "Polity Score") %>%
  layout(xaxis = list(
    title = 'Year'),
    yaxis = list(
      title = 'Sample Democracy Levels'))

2.4 Primary Independent Variable

The primary independent variable is tyr, or the total years of schooling at time \(t\). Following a similar pattern, the average adult’s level of education is also increasing over time.

df_bar <- df %>% group_by(year) %>% summarize_at(vars(tyr), list(mean), na.rm = TRUE)

plot_ly(data = df_bar, type = 'bar') %>%
  add_trace(x = ~year, y = ~tyr, name  = "Years of Education") %>%
  layout(xaxis = list(
    title = 'Year'),
    yaxis = list(
      title = 'Sample Adult Education Levels'))

2.5 Scatterplot: DV + IV

tyr_mean <- df %>% group_by(country) %>% 
  summarize_at(vars(tyr), list(mean), na.rm = TRUE) # compute average education levels

democ_mean <- df %>% group_by(country) %>% 
  summarize_at(vars(democ), list(mean), na.rm = TRUE) # compute average democracy levels

regLine <- lm(democ_mean$democ ~ tyr_mean$tyr) # generate slope

plot_ly(type = 'scatter', mode = 'markers') %>%
  add_trace(x = ~tyr_mean$tyr, y = ~democ_mean$democ, color = ~democ_mean$country, 
            showlegend = FALSE) %>%
  add_trace(x = ~tyr_mean$tyr, y = fitted(regLine), mode = "lines", name = 'Slope', 
            showlegend = FALSE) %>%
  layout(xaxis = list(
    title = 'Avg. Years of Adult Education'),
    yaxis = list(
      title = 'Avg. Country Democracy Levels'))

3 Time Series Cross-Sectional Analysis

Now that we have a better understanding of our data and we can see a clear relationship between education and democracy, let’s specify a hypothesis and test it with an appropriate model.

\(H_{1}\): In a comparison of countries, as the years of the schooling among adults increases, measures of democracy increases.

Notice that the hypothesis implicitly refers to the over time effects of education on the levels of democracy. Thus, we need to specify case fixed effects in order to realize the over time variation. Case fixed effects are apparent in our equation given the absence of \(_{t}\) from country.

\[\begin{eqnarray} \small \text{Democracy}_{it} = \alpha + \beta_{1} \text{Education}_{it} + \beta_{2} \text{GDP}_{it} + \beta_{3} \text{Population}_{it} + \beta_{4} \text{Democracy}_{it-1} + \text{Country}_{i} + \epsilon_{it} \end{eqnarray}\]

The dependent variable is included as a lagged independent variable because despite applying fixed effects to the regression, I have not accounted for any other issue associated with time series data such as auto-correlation. Lagging the dependent variable is one method to account for auto-correlation which means that I am modeling this time series as an Auto Regressive (AR) 1 process. While this makes a large assumption that all 105 countries are identical in their auto-correlation and rate of decay, there are few tools for panel models to improve our estimation currently. Additionally, the lagged dependent variable is also incorporated for an AR1 impulse response function analysis that we will incorporate later.

4 Log-Transform Data

Before running our first model, we need to transform several our variables with the natural log to improve our model’s fit.

df$population <- log(df$population) # apply log
df$gdp <- log(df$gdp) # apply log
df <- df %>% dplyr::select(tyr, democ, gdp, population, country, year) # trim to used variables

5 Case Fixed Effects: Derive Over Time Variation

We can estimate the over time effects because the fixed effects procedure is as if we created a subset of our dataset by country, performed a regression for that subset, then subtracted the cross-sectional variation leaving only the time effects, and then repeated that for every country. After running 105 regressions, we would average the slopes together and weight it by its sample size (the number of countries) and the variance of our independent variables. The effect we derive is a within case over time analysis of itself. A templated interpretation is as follows: Within an average case, one unit increase in \(X\), is associated with a \(\hat{\beta}\) change in \(Y\), on average, after controlling for all other variables.

As we can see, the primary independent variable is statistically significant. The result tells us that: within an average country, a ten year increase in the average years of schooling for citizens aged 15-64 is associated with a 1.7 increase in the polity democracy score, on average, after controlling for GDP and population. This effect is quite notable as the polity democracy score is on an 11 point scale. However, a ten year increase in the average years of schooling is also quite challenging as many developing countries are averaging below 5 years (as of 2010) while all Western nations exceed 10. It is also worth noting that GDP is not statistically significant after controlling for education which challenges the findings of previous research which purports to demonstrate GDP’s very close and significant link with democracy.¹

# Case Fixed Effects - Over Time Variation
pdata1 <- pdata.frame(df, index=c("country", "year"), row.names=FALSE) # case fixed effects
caseFE <- plm(democ ~ tyr + gdp + population + dplyr::lag(democ, k=1), model="within", data=pdata1)
summary(caseFE)

## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = democ ~ tyr + gdp + population + dplyr::lag(democ, 
##     k = 1), data = pdata1, model = "within")
## 
## Unbalanced Panel: n = 102, T = 2-28, N = 1408
## 
## Residuals:
##      Min.   1st Qu.    Median   3rd Qu.      Max. 
## -8.894454 -0.686317 -0.084028  0.536275  8.396564 
## 
## Coefficients:
##                          Estimate Std. Error t-value  Pr(>|t|)    
## tyr                      0.168844   0.048388  3.4894 0.0005002 ***
## gdp                      0.109639   0.149019  0.7357 0.4620235    
## population               0.109853   0.172494  0.6369 0.5243321    
## dplyr::lag(democ, k = 1) 0.701194   0.020695 33.8822 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    10152
## Residual Sum of Squares: 4046.2
## R-Squared:      0.60144
## Adj. R-Squared: 0.56929
## F-statistic: 491.181 on 4 and 1302 DF, p-value: < 2.22e-16

5.1 Quick Model Diagnostics

pbgtest(caseFE) # null = no serial correlation; cannot reject; model has no serial correlation of errors

## 
##  Breusch-Godfrey/Wooldridge test for serial correlation in panel models
## 
## data:  democ ~ tyr + gdp + population + dplyr::lag(democ, k = 1)
## chisq = 1.3149, df = 2, p-value = 0.5182
## alternative hypothesis: serial correlation in idiosyncratic errors

bptest(caseFE, studentize = F) # null = there is no heteroskedasticity; can reject; model has heteroskedasticity

## 
##  Breusch-Pagan test
## 
## data:  caseFE
## BP = 112.01, df = 4, p-value < 2.2e-16

6 Impulse Response Function

The impulse response function below is based on a simplified model above which depicts the marginal effect of average years of adult education over time. Substantively, this result tells us: A one year increase in average years of adult schooling for citizens aged 15-64 today (\(t_{=0}\)), is associated with an approximately 0.7 increase in the Polity democracy index one year from now (\(t_{=1}\)). The difference between the IRF and the regression results in the table above is due to the following: a regression represents average treatment effects across time while the IRF depicts average treatment effects at specified time intervals.

## Create model
reg <- plm(democ ~ tyr + dplyr::lag(democ, k=1), model="within", data=pdata1)
summary(reg)

## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = democ ~ tyr + dplyr::lag(democ, k = 1), data = pdata1, 
##     model = "within")
## 
## Unbalanced Panel: n = 105, T = 3-29, N = 1878
## 
## Residuals:
##     Min.  1st Qu.   Median  3rd Qu.     Max. 
## -9.00561 -0.59027 -0.09114  0.46019  8.54134 
## 
## Coefficients:
##                          Estimate Std. Error t-value  Pr(>|t|)    
## tyr                      0.168426   0.016428  10.252 < 2.2e-16 ***
## dplyr::lag(democ, k = 1) 0.726113   0.016760  43.325 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    13149
## Residual Sum of Squares: 4803.3
## R-Squared:      0.63469
## Adj. R-Squared: 0.61283
## F-statistic: 1538.48 on 2 and 1771 DF, p-value: < 2.22e-16

## Create loop
iterations <- 10000
results <- data.frame()

coefs <- coef(reg) #store coefs
vcov <- vcov(reg)   #store vcov
sim.irf <- mvrnorm(iterations, coefs, vcov) # simulate new coefs 1000x
head(sim.irf)

##            tyr dplyr::lag(democ, k = 1)
## [1,] 0.1768250                0.7120246
## [2,] 0.1532820                0.7595639
## [3,] 0.1439394                0.7312199
## [4,] 0.2060065                0.7043871
## [5,] 0.1855121                0.7045173
## [6,] 0.1744217                0.7171867

data.sim <- data.frame(sim.irf)


for(i in 1:iterations){
  #print(paste(c("Now working on iteration", i), collapse=" "))
  irf.funx <- c(data.sim[i,2], 
                data.sim[i,2]*data.sim[i,1],
                data.sim[i,2]*data.sim[i,1]^2,
                data.sim[i,2]*data.sim[i,1]^3,
                data.sim[i,2]*data.sim[i,1]^4,
                data.sim[i,2]*data.sim[i,1]^5,
                data.sim[i,2]*data.sim[i,1]^6,
                data.sim[i,2]*data.sim[i,1]^7,
                data.sim[i,2]*data.sim[i,1]^8,
                data.sim[i,2]*data.sim[i,1]^9) # IRF function
  
# for every iteration, pull the IRF values for each time point

# store 1000 IRF results
results <- rbind(results, irf.funx)
}

# Mean IRFs
mean.irfs <- apply(results, 2, mean)
low.ci <- apply(results, 2, FUN=function(x){
  quantile(x, .025)
})
low.ci

##    X0.712024624636414    X0.125903781819358   X0.0222629410949251 
##          6.938696e-01          1.007601e-01          1.375452e-02 
##  X0.00393664542108215 X0.000696097479001088 X0.000123087463675728 
##          1.873446e-03          2.560293e-04          3.493386e-05 
##  X2.1764945530136e.05 X3.84858733605703e.06 X6.80526604707105e.07 
##          4.747940e-06          6.483768e-07          8.834213e-08 
## X1.20334143225824e.07 
##          1.202984e-08

hi.ci <- apply(results, 2, FUN=function(x){
  quantile(x, .975)
})
hi.ci

##    X0.712024624636414    X0.125903781819358   X0.0222629410949251 
##          7.598629e-01          1.431292e-01          2.860344e-02 
##  X0.00393664542108215 X0.000696097479001088 X0.000123087463675728 
##          5.714401e-03          1.143147e-03          2.290501e-04 
##  X2.1764945530136e.05 X3.84858733605703e.06 X6.80526604707105e.07 
##          4.583837e-05          9.171275e-06          1.835012e-06 
## X1.20334143225824e.07 
##          3.671941e-07

irf.data <- data.frame(time = 0:9, irf = mean.irfs, irf.lb = low.ci, irf.ub = hi.ci)

g <- ggplot(data = irf.data, 
       aes(x = time, y = irf, 
           ymin = irf.lb, ymax = irf.ub)) + 
  geom_path() + 
  geom_ribbon(alpha=0.2) + 
  theme_bw() + 
  labs(x = "Year", y = "Marginal Effects: Education")  + 
  theme(legend.position="bottom") + 
  guides(color = guide_legend(ncol=1),fill = guide_legend(nrow=1)) +
  geom_hline(aes(yintercept=0))

g

7 Time Fixed Effects: Derive Cross-Sectional Effects

Over time analysis is not the only option available to us in time series cross-sectional analysis. We can also estimate the cross-sectional effect by choosing where we place our fixed effects. To derive the cross-sectional effect, we use time fixed effects by placing our fixed effects on year.

We can estimate the cross-sectional effects because the fixed effects procedure is as if we created a subset of our dataset by year, performed a regression for that subset, then subtracted the time variation leaving only the cross-sectional effects, and then repeated that for every year. After running 39 regressions, we would average the slopes together and weight it by its sample size (the number of years) and the variance of our independent variables. The effect we derive is a across case analysis – comparing one unit to the next. A templated interpretation is as follows: When comparing one case to another, if one case has \(X\) worth, we expect on average, its \(Y\) will be \(\hat{\beta}\).

The model’s results tells us: When comparing one country to another, if one country’s adult population has 10 years of education, we expect on average, that its polity democracy score will be 6.3 out of 11.

pdata2 <- pdata.frame(df, index=c("year", "country"), row.names=FALSE) # year fixed effects
timeFE <- plm(democ ~ tyr + gdp + population + dplyr::lag(democ, k=1), model="within", data=pdata2)
summary(timeFE)

## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = democ ~ tyr + gdp + population + dplyr::lag(democ, 
##     k = 1), data = pdata2, model = "within")
## 
## Unbalanced Panel: n = 28, T = 7-97, N = 1150
## 
## Residuals:
##     Min.  1st Qu.   Median  3rd Qu.     Max. 
## -7.64459 -2.25500  0.09491  2.34190  8.73758 
## 
## Coefficients:
##                           Estimate Std. Error t-value  Pr(>|t|)    
## tyr                      0.6398062  0.0572493 11.1758 < 2.2e-16 ***
## gdp                      1.0069573  0.1577995  6.3812 2.568e-10 ***
## population               0.0014959  0.0669478  0.0223    0.9822    
## dplyr::lag(democ, k = 1) 0.0143816  0.0236460  0.6082    0.5432    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    19123
## Residual Sum of Squares: 11489
## R-Squared:      0.3992
## Adj. R-Squared: 0.38254
## F-statistic: 185.715 on 4 and 1118 DF, p-value: < 2.22e-16

Boix, Carles. “Democracy, development, and the international system.” American Political Science Review 105, no. 4 (2011): 809-828.↩︎

Time Series Cross-Section Analysis: Democracy and Education

Andrew Fogarty

8/2/2019