Introduction

Survival models are an important quantitative tool with myriad applications in political research. Within the field of Comparative Politics, survival models are heavily used to model democratic breakdown. Typically, political scientists use survival models to derive the average number of years a democratic regime survives, given a variable of interest. While survival models incorporate many different estimators, the most common is the Cox proportional hazards model, coxph(). This document replicates an Journal of Politics article published by Maeda, titled: Two Modes of Democratic Breakdown: A Competing Risks Analysis of Democratic Durability. Since Maeda’s analysis was conducted in Stata, we will replicate his work in R. His data and Stata analysis can be found on his website.

Data

# load data
df <- read.csv('https://raw.githubusercontent.com/afogarty85/replications/master/Maeda/MaedaCSV.csv')

# review data
summary(df)

##      ccode             country           year          month      
##  Min.   :  2.0   Australia :  708   Min.   :1946   Min.   : 1.00  
##  1st Qu.:165.0   Belgium   :  708   1st Qu.:1969   1st Qu.: 4.00  
##  Median :349.0   Canada    :  708   Median :1986   Median : 7.00  
##  Mean   :383.6   Costa Rica:  708   Mean   :1982   Mean   : 6.53  
##  3rd Qu.:590.0   Denmark   :  708   3rd Qu.:1996   3rd Qu.:10.00  
##  Max.   :950.0   Finland   :  708   Max.   :2004   Max.   :12.00  
##                  (Other)   :26119                                 
##      demid            demend           endmode            demmonth     
##  Min.   :  1.00   Min.   :0.00000   Min.   :0.000000   Min.   :   1.0  
##  1st Qu.: 36.00   1st Qu.:0.00000   1st Qu.:0.000000   1st Qu.:  72.0  
##  Median : 76.00   Median :0.00000   Median :0.000000   Median : 193.0  
##  Mean   : 81.24   Mean   :0.02993   Mean   :0.002173   Mean   : 366.4  
##  3rd Qu.:121.00   3rd Qu.:0.00000   3rd Qu.:0.000000   3rd Qu.: 501.0  
##  Max.   :165.00   Max.   :9.00000   Max.   :2.000000   Max.   :2350.0  
##                                                                        
##      polity             ope               gro                urb        
##  Min.   :-88.000   Min.   :  1.867   Min.   :-41.8000   Min.   :  3.80  
##  1st Qu.:  8.000   1st Qu.: 29.876   1st Qu.:  0.1192   1st Qu.: 42.80  
##  Median : 10.000   Median : 48.381   Median :  2.2773   Median : 61.38  
##  Mean   :  8.559   Mean   : 58.969   Mean   :  2.1649   Mean   : 57.52  
##  3rd Qu.: 10.000   3rd Qu.: 82.093   3rd Qu.:  4.4593   3rd Qu.: 74.66  
##  Max.   : 10.000   Max.   :221.603   Max.   : 56.0741   Max.   :100.00  
##                    NA's   :1288      NA's   :1870       NA's   :98      
##    prereg_ind       prereg_mil       regiondem         preparmix    
##  Min.   :0.0000   Min.   :0.0000   Min.   :-6.9048   Min.   :1.000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:-0.2917   1st Qu.:1.000  
##  Median :0.0000   Median :0.0000   Median : 1.8800   Median :2.000  
##  Mean   :0.2656   Mean   :0.1602   Mean   : 2.3765   Mean   :1.849  
##  3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.: 6.2000   3rd Qu.:2.000  
##  Max.   :1.0000   Max.   :1.0000   Max.   : 8.1154   Max.   :3.000  
##                                                                     
##       enp            majgovm           presi            mixed       
##  Min.   : 1.000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.: 2.170   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median : 2.830   Median :1.0000   Median :0.0000   Median :0.0000  
##  Mean   : 3.326   Mean   :0.7353   Mean   :0.2653   Mean   :0.1142  
##  3rd Qu.: 3.970   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.0000  
##  Max.   :24.897   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##  NA's   :1508     NA's   :145                                       
##      ethnic        prevend1945          dev             lnpop       
##  Min.   :0.0000   Min.   :0.0000   Min.   : 5.815   Min.   : 5.373  
##  1st Qu.:0.1145   1st Qu.:0.0000   1st Qu.: 8.312   1st Qu.: 8.153  
##  Median :0.3200   Median :0.0000   Median : 8.994   Median : 9.103  
##  Mean   :0.3349   Mean   :0.1205   Mean   : 8.836   Mean   : 9.251  
##  3rd Qu.:0.5062   3rd Qu.:0.0000   3rd Qu.: 9.592   3rd Qu.:10.509  
##  Max.   :0.9302   Max.   :2.0000   Max.   :10.460   Max.   :13.879  
##                                    NA's   :1490     NA's   :992     
##     minpresi         majpresi         minmixed          majmixed      
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.0000   Median :0.0000   Median :0.00000   Median :0.00000  
##  Mean   :0.1051   Mean   :0.1596   Mean   :0.03203   Mean   :0.08269  
##  3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :1.00000  
##  NA's   :145      NA's   :145      NA's   :145       NA's   :145      
##     minparli          tdev            tgro               tpresi     
##  Min.   :0.000   Min.   : 0.00   Min.   :-139.2861   Min.   :0.000  
##  1st Qu.:0.000   1st Qu.:35.86   1st Qu.:   0.3969   1st Qu.:0.000  
##  Median :0.000   Median :46.27   Median :  11.7815   Median :0.000  
##  Mean   :0.127   Mean   :46.58   Mean   :  11.6359   Mean   :1.301  
##  3rd Qu.:0.000   3rd Qu.:59.31   3rd Qu.:  23.3607   3rd Qu.:2.398  
##  Max.   :1.000   Max.   :81.19   Max.   : 210.8187   Max.   :7.762  
##                  NA's   :1490    NA's   :1870                       
##      tmixed          tmajgovm        tethnic            tope       
##  Min.   :0.0000   Min.   :0.000   Min.   :0.0000   Min.   :   0.0  
##  1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.6392   1st Qu.: 146.4  
##  Median :0.0000   Median :4.575   Median :1.3214   Median : 249.9  
##  Mean   :0.5119   Mean   :3.761   Mean   :1.6464   Mean   : 300.1  
##  3rd Qu.:0.0000   3rd Qu.:5.869   3rd Qu.:2.5090   3rd Qu.: 423.4  
##  Max.   :6.5848   Max.   :7.762   Max.   :5.1599   Max.   :1289.1  
##                   NA's   :145                      NA's   :1288    
##       turb           tlnpop       tprereg_ind     tprereg_mil    
##  Min.   :  0.0   Min.   : 0.00   Min.   :0.000   Min.   :0.0000  
##  1st Qu.:173.9   1st Qu.:37.17   1st Qu.:0.000   1st Qu.:0.0000  
##  Median :302.4   Median :47.60   Median :0.000   Median :0.0000  
##  Mean   :308.6   Mean   :47.98   Mean   :1.353   Mean   :0.7157  
##  3rd Qu.:430.2   3rd Qu.:57.52   3rd Qu.:2.773   3rd Qu.:0.0000  
##  Max.   :654.9   Max.   :97.77   Max.   :7.129   Max.   :6.3456  
##  NA's   :98      NA's   :992                                     
##    tregiondem           lnt            postcw          tpostcw     
##  Min.   :-41.837   Min.   :0.000   Min.   :0.0000   Min.   :0.000  
##  1st Qu.: -1.211   1st Qu.:4.277   1st Qu.:0.0000   1st Qu.:0.000  
##  Median :  9.743   Median :5.263   Median :0.0000   Median :0.000  
##  Mean   : 12.743   Mean   :5.137   Mean   :0.3945   Mean   :2.002  
##  3rd Qu.: 26.899   3rd Qu.:6.217   3rd Qu.:1.0000   3rd Qu.:4.682  
##  Max.   : 58.810   Max.   :7.762   Max.   :1.0000   Max.   :7.762  
##                                                                    
##      impose      
##  Min.   :0.0000  
##  1st Qu.:0.0000  
##  Median :0.0000  
##  Mean   :0.2133  
##  3rd Qu.:0.0000  
##  Max.   :1.0000  
##

Since Maeda did not provide a codebook, so we have to infer some of the values by reviewing the Stata “do” file and the dataset. The dependent variable is demend, which takes on two values, 0 if the democracy did not end, and 1 if it did. The 9 value used is a censoring value of sorts, as it the last demend entry for each country over time. The table below shows that democratic breakdown is indeed rare, as our sample captures only 45 different democratic breakdowns across 108 countries from 1950 to 2004.

table(df$demend)

## 
##     0     1     9 
## 30226    45    96

We can visualize the duration of democratic regimes by plotting demmonth as a histogram. It shows that the prospects for democracy are not great since nearly half of the regimes in our sample do not survive past 16 years.

df_hist <- df %>% group_by(country) %>% summarize_at(vars(demmonth), list(max))

plot_ly(data = df_hist, type = 'histogram') %>%
  add_trace(x = ~demmonth, name  = "Duration of Survival", nbinsx = 40) %>%
  layout(xaxis = list(
    title = '# of Months'),
    yaxis = list(
      title = 'Count'))

Kaplan-Meier Curve

Since we have panel data, there are a few extra steps we must take so that it can be analyzed correctly. What we are going to do next is replicate the stset command in Stata by creating two variables, a start_time and end_time, which increases in count every observation until breakdown (failure). Once a democracy breaksdown, the counter should reset. To do this easily, we need a variable in our dataset that represents a unique number for each country during its survival spell. If a democratic regime fails multiple times within a single country, it should have multiple unique numbers. Maeda created a variable, demid, that does just this. The start_time and end_time calculation proceeds as follows:

## start time
df <- df %>%
  group_by(demid) %>%
  mutate(start_time = 0:(n() - 1))
df <- ungroup(df)

## end time
df <- df %>%
  group_by(demid) %>%
  mutate(end_time = 1:(n()))
df <- ungroup(df)

Now we can display our first survival curve. The Kaplan-Meier Curve displays the number of democracies still surviving at each month of measurement. It plots our sample’s dependent variable while taking into account no predictors. The graph is interpreted as follows: At \(x\) months, the probability an average democratic regime is still surviving is approximately \(x\)%.

km.curve <- Surv(time = df$start_time, time2 = df$end_time, event = df$demend)

## Warning in Surv(time = df$start_time, time2 = df$end_time, event = df$demend):
## Invalid status value, converted to NA

plot(km.curve, main="Kaplan-Meier Plot", xlab="Months", ylab="Probability")

The cumulative hazard function expresses the risk of democratic regime breakdown. The graph is interpreted as follows: At \(x\) months, the risk an average democratic regime will suffer from breakdown is approximately \(x\) %.

km.curve <- Surv(time = df$start_time, time2 = df$end_time, event = df$demend)

## Warning in Surv(time = df$start_time, time2 = df$end_time, event = df$demend):
## Invalid status value, converted to NA

plot(km.curve, main="Kaplan-Meier Plot", xlab="Months", ylab="Probability", fun = "cumhaz")

Cox Proportional Hazards Model

We use a Cox proportional hazard model when our dependent variable consists of observations of durations. Cox models assume a proportional hazards assumption, an assumption that we will test later, which means that the baseline hazard does not vary across observations. Before fitting our first model, we need to make a few more alterations to replicate Maeda’s results.

# make a new DV since Maeda used 9 for censors that Stata ignores but R does not
df <- df %>% mutate(demend2 = ifelse(demend == 9, 0, demend))

Next, we replicate Maeda’s first model. Our results are close, but not a perfect match, which is generally expected when comparing replications conducted across statistical platforms.

cox.model <- coxph(Surv(time = df$start_time, time2 = df$end_time, event = df$demend2) ~ dev + gro 
                   + presi + mixed + majgovm + ethnic + ope + urb + prereg_ind 
                   + prereg_mil + regiondem + postcw + impose 
                   + cluster(country), method = 'efron', robust = TRUE, data=df)
summary(cox.model)

## Call:
## coxph(formula = Surv(time = df$start_time, time2 = df$end_time, 
##     event = df$demend2) ~ dev + gro + presi + mixed + majgovm + 
##     ethnic + ope + urb + prereg_ind + prereg_mil + regiondem + 
##     postcw + impose, data = df, robust = TRUE, method = "efron", 
##     cluster = country)
## 
##   n= 28468, number of events= 42 
##    (1899 observations deleted due to missingness)
## 
##                 coef exp(coef)  se(coef) robust se      z Pr(>|z|)   
## dev        -0.795809  0.451216  0.326401  0.308732 -2.578  0.00995 **
## gro        -0.075652  0.927139  0.024925  0.031597 -2.394  0.01665 * 
## presi       0.483778  1.622191  0.515306  0.685603  0.706  0.48042   
## mixed       0.323106  1.381412  0.661309  0.963750  0.335  0.73743   
## majgovm    -0.570004  0.565523  0.365604  0.328207 -1.737  0.08244 . 
## ethnic     -0.863002  0.421894  0.817360  0.789523 -1.093  0.27436   
## ope        -0.008674  0.991363  0.004823  0.005051 -1.717  0.08592 . 
## urb        -0.013321  0.986768  0.014351  0.012816 -1.039  0.29861   
## prereg_ind  0.173090  1.188973  0.539798  0.697114  0.248  0.80391   
## prereg_mil  0.033813  1.034391  0.440014  0.410861  0.082  0.93441   
## regiondem  -0.076098  0.926726  0.073657  0.078256 -0.972  0.33084   
## postcw     -0.016099  0.984029  0.492071  0.420049 -0.038  0.96943   
## impose     -0.136393  0.872499  0.505563  0.360944 -0.378  0.70552   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##            exp(coef) exp(-coef) lower .95 upper .95
## dev           0.4512     2.2162   0.24637    0.8264
## gro           0.9271     1.0786   0.87146    0.9864
## presi         1.6222     0.6165   0.42317    6.2185
## mixed         1.3814     0.7239   0.20892    9.1342
## majgovm       0.5655     1.7683   0.29722    1.0760
## ethnic        0.4219     2.3703   0.08978    1.9827
## ope           0.9914     1.0087   0.98160    1.0012
## urb           0.9868     1.0134   0.96229    1.0119
## prereg_ind    1.1890     0.8411   0.30324    4.6618
## prereg_mil    1.0344     0.9668   0.46234    2.3142
## regiondem     0.9267     1.0791   0.79495    1.0803
## postcw        0.9840     1.0162   0.43198    2.2416
## impose        0.8725     1.1461   0.43006    1.7701
## 
## Concordance= 0.782  (se = 0.038 )
## Likelihood ratio test= 50.6  on 13 df,   p=2e-06
## Wald test            = 71.22  on 13 df,   p=5e-10
## Score (logrank) test = 53.88  on 13 df,   p=6e-07,   Robust = 28.96  p=0.007
## 
##   (Note: the likelihood ratio and score tests assume independence of
##      observations within a cluster, the Wald and robust score tests do not).

Non Parametric Step Function (NPSF) Analysis

We use a NPSF to generate myriad statistics and make comparisons between variables of interest. Since we have panel data, we need to specify id in the function below.

npsf.model <- coxed(cox.model, method="npsf", id = df$country) # fit NPSF

By using the exp.dur attribute, we can retrieve the predicted duration for each country

head(npsf.model$exp.dur)

##    exp.dur
## 1 630.0588
## 2 630.0588
## 3 630.0588
## 4 630.0588
## 5 630.0588
## 6 630.0588

We can also generate the mean or median for our sample, by specifying the proper statistic. The predicted mean duration of democratic regime survival is 543 months or 45 years.

summary(npsf.model, stat="mean")

##    mean 
## 543.205

summary(npsf.model, stat="median")

## median 
## 608.54

The NPSF also provides useful means for deriving the cumulative baseline hazard function and the baseline survivor function. We retrieve this data by specifying the baseline.functions attribute.

head(npsf.model$baseline.functions)

##   time cbh survivor
## 1    1   0        1
## 2    2   0        1
## 3    3   0        1
## 4    4   0        1
## 5    5   0        1
## 6    6   0        1

We can also readily graph this data in a few steps. Following a similar interpretation from the Kaplan-Meier Curve section, cbh tells us the risk of a democratic regime suffering from breakdown over time; assuming all covariates are 0. The next panel, survivor, tells us the probability that a democratic regime is still surviving by \(x\) months.

baseline <- gather(npsf.model$baseline.functions, cbh, survivor, key="survivefunction", value="value")
ggplot(baseline, aes(x=time, y=value)) +
     geom_line() +
     xlab("Time") +
     ylab("Function") +
     facet_wrap( ~ survivefunction, scales = "free")

Out of Sample Predictions

We can also generate out of sample predictions. Let’s say we want to know how long a hypothetical or actual democratic regime is likely to survive that was not included in our dataset. We can specify the exact value for our variable of interest, in this case dev (GDP) following Maeda, while specifying all other values remain constant at our sample’s mean.

new.country <- data.frame(dev = 9.5,
                            gro = mean(na.omit(df$gro)),
                            presi = mean(na.omit(df$presi)),
                            mixed = mean(na.omit(df$mixed)),
                            majgovm = mean(na.omit(df$majgovm)),
                            ethnic = mean(na.omit(df$ethnic)),
                            ope = mean(na.omit(df$ope)),
                            urb = mean(na.omit(df$urb)),
                            prereg_ind = mean(na.omit(df$prereg_ind)),
                            prereg_mil = mean(na.omit(df$prereg_mil)),
                            regiondem = mean(na.omit(df$regiondem)),
                            postcw = mean(na.omit(df$postcw)),
                            impose = mean(na.omit(df$impose)))

forecast <- coxed(cox.model, newdata=new.country, method="npsf", id = df$country, bootstrap=TRUE, B=30)
forecast$exp.dur

##    exp.dur bootstrap.se       lb       ub
## 1 637.6422     13.04342 612.0775 663.2068

Marginal Changes in Expected Duration

We can also compare two values within our variable of interest. Let’s say we are interested in determining the difference in democratic survival duration between a country with high GDP and a country with low GDP. We can specify two new dataframes where all cases receive either high or low GDP. We use the min and max as sufficient values for high and low. We would increase B to iterate many more times for more reliable results.

min(na.omit(df$dev))

## [1] 5.815288

max(na.omit(df$dev))

## [1] 10.45954

marginal.changes <- coxed(cox.model, method = "npsf", id = df$country, bootstrap = TRUE, B = 5,
            newdata =  dplyr::mutate(df, dev = 5.8),
            newdata2 = dplyr::mutate(df, dev = 10.5))

Our results are interpreted as follows: A democratic regime with a high GDP such that dev = 10.5 will survive for 500 more months, on average, than a democratic regime with a low GDP, such that dev = 5.8.

summary(marginal.changes, stat="mean")

##               mean bootstrap.se      lb      ub
## newdata2   667.094       11.371 644.807 689.382
## newdata    166.637       41.724  84.860 248.415
## difference 500.457       30.890 439.914 561.000

summary(marginal.changes, stat="median")

##             median bootstrap.se      lb      ub
## newdata2   674.118        9.761 654.986 693.250
## newdata    133.152       45.854  43.279 223.025
## difference 538.173       33.832 471.863 604.482

Survival Curves

We can also visualize the marginal changes over time with the help of survminer. What we are doing in the code below is comparing the difference in survival duration between two democratic regimes with high and low GDP, while holding all other variables constant at the sample mean.

# Create the new data  
new_data <- data.frame(dev = c(5.8, 10.5),
                            gro = mean(na.omit(df$gro)),
                            presi = mean(na.omit(df$presi)),
                            mixed = mean(na.omit(df$mixed)),
                            majgovm = mean(na.omit(df$majgovm)),
                            ethnic = mean(na.omit(df$ethnic)),
                            ope = mean(na.omit(df$ope)),
                            urb = mean(na.omit(df$urb)),
                            prereg_ind = mean(na.omit(df$prereg_ind)),
                            prereg_mil = mean(na.omit(df$prereg_mil)),
                            regiondem = mean(na.omit(df$regiondem)),
                            postcw = mean(na.omit(df$postcw)),
                            impose = mean(na.omit(df$impose)))
new_data

##    dev      gro     presi   mixed   majgovm    ethnic      ope      urb
## 1  5.8 2.164882 0.2653209 0.11417 0.7353253 0.3349172 58.96876 57.52368
## 2 10.5 2.164882 0.2653209 0.11417 0.7353253 0.3349172 58.96876 57.52368
##   prereg_ind prereg_mil regiondem    postcw    impose
## 1  0.2656173  0.1601739  2.376494 0.3944743 0.2133237
## 2  0.2656173  0.1601739  2.376494 0.3944743 0.2133237

# Survival curves
library(survminer)

fit <- coxph(Surv(time = df$start_time, time2 = df$end_time, event = df$demend2) ~ dev + gro 
                   + presi + mixed + majgovm + ethnic + ope + urb + prereg_ind 
                   + prereg_mil + regiondem + postcw + impose 
                   + cluster(country), method = 'efron', robust = TRUE, data=df)

fit <- survfit(fit, newdata = new_data)

ggsurvplot(fit, data = new_data, conf.int = TRUE, palette = "Dark2", 
           censor = FALSE, surv.median.line = "hv", legend.labs=c("dev=5.8", "dev=10.5"))

Check Cox Assumptions

In principle, the Schoenfeld residuals are independent of time. A plot that shows a non-random pattern against time is evidence of violation of the proportional hazard assumption. Visually, we are looking for non-straight line, indicating violation of proportional hazards assumption. Statistically, we are looking for p-values less than 0.5 in our hypothesis tests. We see violations in urb, regiondem, and postcw. Violations can generate biased coefficients and the general solution for this problem is to: (1) identify the violations, and (2) interact the violations with the natural log of time.

test.ph <- cox.zph(cox.model) # non significant relationship is ideal
test.ph

##               chisq df      p
## dev         3.94640  1 0.0470
## gro         2.81261  1 0.0935
## presi       1.55204  1 0.2128
## mixed       0.45430  1 0.5003
## majgovm     0.00152  1 0.9689
## ethnic      8.54187  1 0.0035
## ope         1.06680  1 0.3017
## urb         0.30112  1 0.5832
## prereg_ind  0.93000  1 0.3349
## prereg_mil  0.11179  1 0.7381
## regiondem   3.77238  1 0.0521
## postcw      1.95969  1 0.1615
## impose      0.83260  1 0.3615
## GLOBAL     25.04080 13 0.0228

ggcoxzph(test.ph)

Stepwise Model Selection

We can also use a stepwise algorithm to help us find the best fitting model if we wanted to completely discard the theory and qualitative nature behind Maeda’s model selection.

df2 <- na.omit(df) # a little heavy-handed, but stepAIC will not work with NAs

cox.model <- coxph(Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~ dev + gro 
                   + presi + mixed + majgovm + ethnic + ope + urb + prereg_ind 
                   + prereg_mil + regiondem + postcw + impose 
                   + cluster(country), method = 'efron', robust = TRUE, data=df2)

stepwise2 <- stepAIC(cox.model, direction = 'backward')

## Start:  AIC=289.48
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~ 
##     dev + gro + presi + mixed + majgovm + ethnic + ope + urb + 
##         prereg_ind + prereg_mil + regiondem + postcw + impose
## 
##              Df    AIC
## - postcw      1 287.51
## - mixed       1 287.52
## - impose      1 287.54
## - prereg_mil  1 287.57
## - prereg_ind  1 287.58
## - presi       1 288.80
## - urb         1 288.80
## - regiondem   1 288.80
## - majgovm     1 288.90
## - ethnic      1 288.96
## <none>          289.49
## - ope         1 291.25
## - gro         1 292.46
## - dev         1 295.35
## 
## Step:  AIC=287.51
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~ 
##     dev + gro + presi + mixed + majgovm + ethnic + ope + urb + 
##         prereg_ind + prereg_mil + regiondem + impose
## 
##              Df    AIC
## - mixed       1 285.55
## - impose      1 285.57
## - prereg_ind  1 285.58
## - prereg_mil  1 285.60
## - urb         1 286.80
## - presi       1 286.85
## - majgovm     1 286.91
## - ethnic      1 287.10
## - regiondem   1 287.46
## <none>          287.51
## - ope         1 290.02
## - gro         1 290.52
## - dev         1 293.65
## 
## Step:  AIC=285.55
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~ 
##     dev + gro + presi + majgovm + ethnic + ope + urb + prereg_ind + 
##         prereg_mil + regiondem + impose
## 
##              Df    AIC
## - prereg_ind  1 283.60
## - impose      1 283.62
## - prereg_mil  1 283.62
## - urb         1 284.86
## - majgovm     1 284.95
## - ethnic      1 285.33
## <none>          285.55
## - presi       1 285.87
## - regiondem   1 285.88
## - ope         1 288.44
## - gro         1 288.54
## - dev         1 291.66
## 
## Step:  AIC=283.6
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~ 
##     dev + gro + presi + majgovm + ethnic + ope + urb + prereg_mil + 
##         regiondem + impose
## 
##              Df    AIC
## - impose      1 281.64
## - prereg_mil  1 281.73
## - urb         1 282.87
## - majgovm     1 283.01
## - ethnic      1 283.44
## <none>          283.60
## - regiondem   1 283.88
## - presi       1 284.35
## - gro         1 286.56
## - ope         1 286.86
## - dev         1 289.68
## 
## Step:  AIC=281.64
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~ 
##     dev + gro + presi + majgovm + ethnic + ope + urb + prereg_mil + 
##         regiondem
## 
##              Df    AIC
## - prereg_mil  1 279.75
## - urb         1 280.92
## - majgovm     1 281.08
## - ethnic      1 281.45
## <none>          281.64
## - regiondem   1 282.12
## - presi       1 282.36
## - gro         1 284.56
## - ope         1 284.88
## - dev         1 287.72
## 
## Step:  AIC=279.75
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~ 
##     dev + gro + presi + majgovm + ethnic + ope + urb + regiondem
## 
##             Df    AIC
## - urb        1 278.93
## - majgovm    1 279.17
## - ethnic     1 279.46
## <none>         279.75
## - regiondem  1 280.31
## - presi      1 280.61
## - gro        1 282.72
## - ope        1 283.20
## - dev        1 285.91
## 
## Step:  AIC=278.93
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~ 
##     dev + gro + presi + majgovm + ethnic + ope + regiondem
## 
##             Df    AIC
## <none>         278.93
## - presi      1 278.94
## - ethnic     1 279.23
## - majgovm    1 279.24
## - regiondem  1 280.45
## - ope        1 281.73
## - gro        1 281.74
## - dev        1 304.10

stepwise2$anova

## Stepwise Model Path 
## Analysis of Deviance Table
## 
## Initial Model:
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~ 
##     dev + gro + presi + mixed + majgovm + ethnic + ope + urb + 
##         prereg_ind + prereg_mil + regiondem + postcw + impose
## 
## Final Model:
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~ 
##     dev + gro + presi + majgovm + ethnic + ope + regiondem
## 
## 
##           Step Df   Deviance Resid. Df Resid. Dev      AIC
## 1                                   25   263.4849 289.4849
## 2     - postcw  1 0.02542789        26   263.5104 287.5104
## 3      - mixed  1 0.04260455        27   263.5530 285.5530
## 4 - prereg_ind  1 0.04331389        28   263.5963 283.5963
## 5     - impose  1 0.04159345        29   263.6379 281.6379
## 6 - prereg_mil  1 0.10816804        30   263.7460 279.7460
## 7        - urb  1 1.18651517        31   264.9325 278.9325

Sources

Kropko, Jonathan. PLAD 8320: Generalized Linear Models. University of Virginia.
Maeda, Ko. “Two modes of democratic breakdown: A competing risks analysis of democratic durability.” The Journal of Politics 72, no. 4 (2010): 1129-1143.

Survival Analysis

Andrew Fogarty

4/25/2020