Survival models are an important quantitative tool with myriad applications in political research. Within the field of Comparative Politics, survival models are heavily used to model democratic breakdown. Typically, political scientists use survival models to derive the average number of years a democratic regime survives, given a variable of interest. While survival models incorporate many different estimators, the most common is the Cox proportional hazards model, coxph(). This document replicates an Journal of Politics article published by Maeda, titled: Two Modes of Democratic Breakdown: A Competing Risks Analysis of Democratic Durability. Since Maeda’s analysis was conducted in Stata, we will replicate his work in R. His data and Stata analysis can be found on his website.
# load data
df <- read.csv('https://raw.githubusercontent.com/afogarty85/replications/master/Maeda/MaedaCSV.csv')
# review data
summary(df)
## ccode country year month
## Min. : 2.0 Australia : 708 Min. :1946 Min. : 1.00
## 1st Qu.:165.0 Belgium : 708 1st Qu.:1969 1st Qu.: 4.00
## Median :349.0 Canada : 708 Median :1986 Median : 7.00
## Mean :383.6 Costa Rica: 708 Mean :1982 Mean : 6.53
## 3rd Qu.:590.0 Denmark : 708 3rd Qu.:1996 3rd Qu.:10.00
## Max. :950.0 Finland : 708 Max. :2004 Max. :12.00
## (Other) :26119
## demid demend endmode demmonth
## Min. : 1.00 Min. :0.00000 Min. :0.000000 Min. : 1.0
## 1st Qu.: 36.00 1st Qu.:0.00000 1st Qu.:0.000000 1st Qu.: 72.0
## Median : 76.00 Median :0.00000 Median :0.000000 Median : 193.0
## Mean : 81.24 Mean :0.02993 Mean :0.002173 Mean : 366.4
## 3rd Qu.:121.00 3rd Qu.:0.00000 3rd Qu.:0.000000 3rd Qu.: 501.0
## Max. :165.00 Max. :9.00000 Max. :2.000000 Max. :2350.0
##
## polity ope gro urb
## Min. :-88.000 Min. : 1.867 Min. :-41.8000 Min. : 3.80
## 1st Qu.: 8.000 1st Qu.: 29.876 1st Qu.: 0.1192 1st Qu.: 42.80
## Median : 10.000 Median : 48.381 Median : 2.2773 Median : 61.38
## Mean : 8.559 Mean : 58.969 Mean : 2.1649 Mean : 57.52
## 3rd Qu.: 10.000 3rd Qu.: 82.093 3rd Qu.: 4.4593 3rd Qu.: 74.66
## Max. : 10.000 Max. :221.603 Max. : 56.0741 Max. :100.00
## NA's :1288 NA's :1870 NA's :98
## prereg_ind prereg_mil regiondem preparmix
## Min. :0.0000 Min. :0.0000 Min. :-6.9048 Min. :1.000
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:-0.2917 1st Qu.:1.000
## Median :0.0000 Median :0.0000 Median : 1.8800 Median :2.000
## Mean :0.2656 Mean :0.1602 Mean : 2.3765 Mean :1.849
## 3rd Qu.:1.0000 3rd Qu.:0.0000 3rd Qu.: 6.2000 3rd Qu.:2.000
## Max. :1.0000 Max. :1.0000 Max. : 8.1154 Max. :3.000
##
## enp majgovm presi mixed
## Min. : 1.000 Min. :0.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.: 2.170 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000
## Median : 2.830 Median :1.0000 Median :0.0000 Median :0.0000
## Mean : 3.326 Mean :0.7353 Mean :0.2653 Mean :0.1142
## 3rd Qu.: 3.970 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.0000
## Max. :24.897 Max. :1.0000 Max. :1.0000 Max. :1.0000
## NA's :1508 NA's :145
## ethnic prevend1945 dev lnpop
## Min. :0.0000 Min. :0.0000 Min. : 5.815 Min. : 5.373
## 1st Qu.:0.1145 1st Qu.:0.0000 1st Qu.: 8.312 1st Qu.: 8.153
## Median :0.3200 Median :0.0000 Median : 8.994 Median : 9.103
## Mean :0.3349 Mean :0.1205 Mean : 8.836 Mean : 9.251
## 3rd Qu.:0.5062 3rd Qu.:0.0000 3rd Qu.: 9.592 3rd Qu.:10.509
## Max. :0.9302 Max. :2.0000 Max. :10.460 Max. :13.879
## NA's :1490 NA's :992
## minpresi majpresi minmixed majmixed
## Min. :0.0000 Min. :0.0000 Min. :0.00000 Min. :0.00000
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.00000
## Median :0.0000 Median :0.0000 Median :0.00000 Median :0.00000
## Mean :0.1051 Mean :0.1596 Mean :0.03203 Mean :0.08269
## 3rd Qu.:0.0000 3rd Qu.:0.0000 3rd Qu.:0.00000 3rd Qu.:0.00000
## Max. :1.0000 Max. :1.0000 Max. :1.00000 Max. :1.00000
## NA's :145 NA's :145 NA's :145 NA's :145
## minparli tdev tgro tpresi
## Min. :0.000 Min. : 0.00 Min. :-139.2861 Min. :0.000
## 1st Qu.:0.000 1st Qu.:35.86 1st Qu.: 0.3969 1st Qu.:0.000
## Median :0.000 Median :46.27 Median : 11.7815 Median :0.000
## Mean :0.127 Mean :46.58 Mean : 11.6359 Mean :1.301
## 3rd Qu.:0.000 3rd Qu.:59.31 3rd Qu.: 23.3607 3rd Qu.:2.398
## Max. :1.000 Max. :81.19 Max. : 210.8187 Max. :7.762
## NA's :1490 NA's :1870
## tmixed tmajgovm tethnic tope
## Min. :0.0000 Min. :0.000 Min. :0.0000 Min. : 0.0
## 1st Qu.:0.0000 1st Qu.:0.000 1st Qu.:0.6392 1st Qu.: 146.4
## Median :0.0000 Median :4.575 Median :1.3214 Median : 249.9
## Mean :0.5119 Mean :3.761 Mean :1.6464 Mean : 300.1
## 3rd Qu.:0.0000 3rd Qu.:5.869 3rd Qu.:2.5090 3rd Qu.: 423.4
## Max. :6.5848 Max. :7.762 Max. :5.1599 Max. :1289.1
## NA's :145 NA's :1288
## turb tlnpop tprereg_ind tprereg_mil
## Min. : 0.0 Min. : 0.00 Min. :0.000 Min. :0.0000
## 1st Qu.:173.9 1st Qu.:37.17 1st Qu.:0.000 1st Qu.:0.0000
## Median :302.4 Median :47.60 Median :0.000 Median :0.0000
## Mean :308.6 Mean :47.98 Mean :1.353 Mean :0.7157
## 3rd Qu.:430.2 3rd Qu.:57.52 3rd Qu.:2.773 3rd Qu.:0.0000
## Max. :654.9 Max. :97.77 Max. :7.129 Max. :6.3456
## NA's :98 NA's :992
## tregiondem lnt postcw tpostcw
## Min. :-41.837 Min. :0.000 Min. :0.0000 Min. :0.000
## 1st Qu.: -1.211 1st Qu.:4.277 1st Qu.:0.0000 1st Qu.:0.000
## Median : 9.743 Median :5.263 Median :0.0000 Median :0.000
## Mean : 12.743 Mean :5.137 Mean :0.3945 Mean :2.002
## 3rd Qu.: 26.899 3rd Qu.:6.217 3rd Qu.:1.0000 3rd Qu.:4.682
## Max. : 58.810 Max. :7.762 Max. :1.0000 Max. :7.762
##
## impose
## Min. :0.0000
## 1st Qu.:0.0000
## Median :0.0000
## Mean :0.2133
## 3rd Qu.:0.0000
## Max. :1.0000
##
Since Maeda did not provide a codebook, so we have to infer some of the values by reviewing the Stata “do” file and the dataset. The dependent variable is demend, which takes on two values, 0 if the democracy did not end, and 1 if it did. The 9 value used is a censoring value of sorts, as it the last demend entry for each country over time. The table below shows that democratic breakdown is indeed rare, as our sample captures only 45 different democratic breakdowns across 108 countries from 1950 to 2004.
table(df$demend)
##
## 0 1 9
## 30226 45 96
We can visualize the duration of democratic regimes by plotting demmonth as a histogram. It shows that the prospects for democracy are not great since nearly half of the regimes in our sample do not survive past 16 years.
df_hist <- df %>% group_by(country) %>% summarize_at(vars(demmonth), list(max))
plot_ly(data = df_hist, type = 'histogram') %>%
add_trace(x = ~demmonth, name = "Duration of Survival", nbinsx = 40) %>%
layout(xaxis = list(
title = '# of Months'),
yaxis = list(
title = 'Count'))
Since we have panel data, there are a few extra steps we must take so that it can be analyzed correctly. What we are going to do next is replicate the stset
command in Stata by creating two variables, a start_time
and end_time
, which increases in count every observation until breakdown (failure). Once a democracy breaksdown, the counter should reset. To do this easily, we need a variable in our dataset that represents a unique number for each country during its survival spell. If a democratic regime fails multiple times within a single country, it should have multiple unique numbers. Maeda created a variable, demid, that does just this. The start_time
and end_time
calculation proceeds as follows:
## start time
df <- df %>%
group_by(demid) %>%
mutate(start_time = 0:(n() - 1))
df <- ungroup(df)
## end time
df <- df %>%
group_by(demid) %>%
mutate(end_time = 1:(n()))
df <- ungroup(df)
Now we can display our first survival curve. The Kaplan-Meier Curve displays the number of democracies still surviving at each month of measurement. It plots our sample’s dependent variable while taking into account no predictors. The graph is interpreted as follows: At \(x\) months, the probability an average democratic regime is still surviving is approximately \(x\)%.
km.curve <- Surv(time = df$start_time, time2 = df$end_time, event = df$demend)
## Warning in Surv(time = df$start_time, time2 = df$end_time, event = df$demend):
## Invalid status value, converted to NA
plot(km.curve, main="Kaplan-Meier Plot", xlab="Months", ylab="Probability")
The cumulative hazard function expresses the risk of democratic regime breakdown. The graph is interpreted as follows: At \(x\) months, the risk an average democratic regime will suffer from breakdown is approximately \(x\) %.
km.curve <- Surv(time = df$start_time, time2 = df$end_time, event = df$demend)
## Warning in Surv(time = df$start_time, time2 = df$end_time, event = df$demend):
## Invalid status value, converted to NA
plot(km.curve, main="Kaplan-Meier Plot", xlab="Months", ylab="Probability", fun = "cumhaz")
We use a Cox proportional hazard model when our dependent variable consists of observations of durations. Cox models assume a proportional hazards assumption, an assumption that we will test later, which means that the baseline hazard does not vary across observations. Before fitting our first model, we need to make a few more alterations to replicate Maeda’s results.
# make a new DV since Maeda used 9 for censors that Stata ignores but R does not
df <- df %>% mutate(demend2 = ifelse(demend == 9, 0, demend))
Next, we replicate Maeda’s first model. Our results are close, but not a perfect match, which is generally expected when comparing replications conducted across statistical platforms.
cox.model <- coxph(Surv(time = df$start_time, time2 = df$end_time, event = df$demend2) ~ dev + gro
+ presi + mixed + majgovm + ethnic + ope + urb + prereg_ind
+ prereg_mil + regiondem + postcw + impose
+ cluster(country), method = 'efron', robust = TRUE, data=df)
summary(cox.model)
## Call:
## coxph(formula = Surv(time = df$start_time, time2 = df$end_time,
## event = df$demend2) ~ dev + gro + presi + mixed + majgovm +
## ethnic + ope + urb + prereg_ind + prereg_mil + regiondem +
## postcw + impose, data = df, robust = TRUE, method = "efron",
## cluster = country)
##
## n= 28468, number of events= 42
## (1899 observations deleted due to missingness)
##
## coef exp(coef) se(coef) robust se z Pr(>|z|)
## dev -0.795809 0.451216 0.326401 0.308732 -2.578 0.00995 **
## gro -0.075652 0.927139 0.024925 0.031597 -2.394 0.01665 *
## presi 0.483778 1.622191 0.515306 0.685603 0.706 0.48042
## mixed 0.323106 1.381412 0.661309 0.963750 0.335 0.73743
## majgovm -0.570004 0.565523 0.365604 0.328207 -1.737 0.08244 .
## ethnic -0.863002 0.421894 0.817360 0.789523 -1.093 0.27436
## ope -0.008674 0.991363 0.004823 0.005051 -1.717 0.08592 .
## urb -0.013321 0.986768 0.014351 0.012816 -1.039 0.29861
## prereg_ind 0.173090 1.188973 0.539798 0.697114 0.248 0.80391
## prereg_mil 0.033813 1.034391 0.440014 0.410861 0.082 0.93441
## regiondem -0.076098 0.926726 0.073657 0.078256 -0.972 0.33084
## postcw -0.016099 0.984029 0.492071 0.420049 -0.038 0.96943
## impose -0.136393 0.872499 0.505563 0.360944 -0.378 0.70552
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## dev 0.4512 2.2162 0.24637 0.8264
## gro 0.9271 1.0786 0.87146 0.9864
## presi 1.6222 0.6165 0.42317 6.2185
## mixed 1.3814 0.7239 0.20892 9.1342
## majgovm 0.5655 1.7683 0.29722 1.0760
## ethnic 0.4219 2.3703 0.08978 1.9827
## ope 0.9914 1.0087 0.98160 1.0012
## urb 0.9868 1.0134 0.96229 1.0119
## prereg_ind 1.1890 0.8411 0.30324 4.6618
## prereg_mil 1.0344 0.9668 0.46234 2.3142
## regiondem 0.9267 1.0791 0.79495 1.0803
## postcw 0.9840 1.0162 0.43198 2.2416
## impose 0.8725 1.1461 0.43006 1.7701
##
## Concordance= 0.782 (se = 0.038 )
## Likelihood ratio test= 50.6 on 13 df, p=2e-06
## Wald test = 71.22 on 13 df, p=5e-10
## Score (logrank) test = 53.88 on 13 df, p=6e-07, Robust = 28.96 p=0.007
##
## (Note: the likelihood ratio and score tests assume independence of
## observations within a cluster, the Wald and robust score tests do not).
We use a NPSF to generate myriad statistics and make comparisons between variables of interest. Since we have panel data, we need to specify id in the function below.
npsf.model <- coxed(cox.model, method="npsf", id = df$country) # fit NPSF
By using the exp.dur
attribute, we can retrieve the predicted duration for each country
head(npsf.model$exp.dur)
## exp.dur
## 1 630.0588
## 2 630.0588
## 3 630.0588
## 4 630.0588
## 5 630.0588
## 6 630.0588
We can also generate the mean or median for our sample, by specifying the proper statistic. The predicted mean duration of democratic regime survival is 543 months or 45 years.
summary(npsf.model, stat="mean")
## mean
## 543.205
summary(npsf.model, stat="median")
## median
## 608.54
The NPSF also provides useful means for deriving the cumulative baseline hazard function and the baseline survivor function. We retrieve this data by specifying the baseline.functions
attribute.
head(npsf.model$baseline.functions)
## time cbh survivor
## 1 1 0 1
## 2 2 0 1
## 3 3 0 1
## 4 4 0 1
## 5 5 0 1
## 6 6 0 1
We can also readily graph this data in a few steps. Following a similar interpretation from the Kaplan-Meier Curve section, cbh
tells us the risk of a democratic regime suffering from breakdown over time; assuming all covariates are 0. The next panel, survivor
, tells us the probability that a democratic regime is still surviving by \(x\) months.
baseline <- gather(npsf.model$baseline.functions, cbh, survivor, key="survivefunction", value="value")
ggplot(baseline, aes(x=time, y=value)) +
geom_line() +
xlab("Time") +
ylab("Function") +
facet_wrap( ~ survivefunction, scales = "free")
We can also generate out of sample predictions. Let’s say we want to know how long a hypothetical or actual democratic regime is likely to survive that was not included in our dataset. We can specify the exact value for our variable of interest, in this case dev
(GDP) following Maeda, while specifying all other values remain constant at our sample’s mean.
new.country <- data.frame(dev = 9.5,
gro = mean(na.omit(df$gro)),
presi = mean(na.omit(df$presi)),
mixed = mean(na.omit(df$mixed)),
majgovm = mean(na.omit(df$majgovm)),
ethnic = mean(na.omit(df$ethnic)),
ope = mean(na.omit(df$ope)),
urb = mean(na.omit(df$urb)),
prereg_ind = mean(na.omit(df$prereg_ind)),
prereg_mil = mean(na.omit(df$prereg_mil)),
regiondem = mean(na.omit(df$regiondem)),
postcw = mean(na.omit(df$postcw)),
impose = mean(na.omit(df$impose)))
forecast <- coxed(cox.model, newdata=new.country, method="npsf", id = df$country, bootstrap=TRUE, B=30)
forecast$exp.dur
## exp.dur bootstrap.se lb ub
## 1 637.6422 13.04342 612.0775 663.2068
We can also compare two values within our variable of interest. Let’s say we are interested in determining the difference in democratic survival duration between a country with high GDP and a country with low GDP. We can specify two new dataframes where all cases receive either high or low GDP. We use the min and max as sufficient values for high and low. We would increase B to iterate many more times for more reliable results.
min(na.omit(df$dev))
## [1] 5.815288
max(na.omit(df$dev))
## [1] 10.45954
marginal.changes <- coxed(cox.model, method = "npsf", id = df$country, bootstrap = TRUE, B = 5,
newdata = dplyr::mutate(df, dev = 5.8),
newdata2 = dplyr::mutate(df, dev = 10.5))
Our results are interpreted as follows: A democratic regime with a high GDP such that dev
= 10.5 will survive for 500 more months, on average, than a democratic regime with a low GDP, such that dev
= 5.8.
summary(marginal.changes, stat="mean")
## mean bootstrap.se lb ub
## newdata2 667.094 11.371 644.807 689.382
## newdata 166.637 41.724 84.860 248.415
## difference 500.457 30.890 439.914 561.000
summary(marginal.changes, stat="median")
## median bootstrap.se lb ub
## newdata2 674.118 9.761 654.986 693.250
## newdata 133.152 45.854 43.279 223.025
## difference 538.173 33.832 471.863 604.482
We can also visualize the marginal changes over time with the help of survminer
. What we are doing in the code below is comparing the difference in survival duration between two democratic regimes with high and low GDP, while holding all other variables constant at the sample mean.
# Create the new data
new_data <- data.frame(dev = c(5.8, 10.5),
gro = mean(na.omit(df$gro)),
presi = mean(na.omit(df$presi)),
mixed = mean(na.omit(df$mixed)),
majgovm = mean(na.omit(df$majgovm)),
ethnic = mean(na.omit(df$ethnic)),
ope = mean(na.omit(df$ope)),
urb = mean(na.omit(df$urb)),
prereg_ind = mean(na.omit(df$prereg_ind)),
prereg_mil = mean(na.omit(df$prereg_mil)),
regiondem = mean(na.omit(df$regiondem)),
postcw = mean(na.omit(df$postcw)),
impose = mean(na.omit(df$impose)))
new_data
## dev gro presi mixed majgovm ethnic ope urb
## 1 5.8 2.164882 0.2653209 0.11417 0.7353253 0.3349172 58.96876 57.52368
## 2 10.5 2.164882 0.2653209 0.11417 0.7353253 0.3349172 58.96876 57.52368
## prereg_ind prereg_mil regiondem postcw impose
## 1 0.2656173 0.1601739 2.376494 0.3944743 0.2133237
## 2 0.2656173 0.1601739 2.376494 0.3944743 0.2133237
# Survival curves
library(survminer)
fit <- coxph(Surv(time = df$start_time, time2 = df$end_time, event = df$demend2) ~ dev + gro
+ presi + mixed + majgovm + ethnic + ope + urb + prereg_ind
+ prereg_mil + regiondem + postcw + impose
+ cluster(country), method = 'efron', robust = TRUE, data=df)
fit <- survfit(fit, newdata = new_data)
ggsurvplot(fit, data = new_data, conf.int = TRUE, palette = "Dark2",
censor = FALSE, surv.median.line = "hv", legend.labs=c("dev=5.8", "dev=10.5"))
In principle, the Schoenfeld residuals are independent of time. A plot that shows a non-random pattern against time is evidence of violation of the proportional hazard assumption. Visually, we are looking for non-straight line, indicating violation of proportional hazards assumption. Statistically, we are looking for p-values less than 0.5 in our hypothesis tests. We see violations in urb
, regiondem
, and postcw
. Violations can generate biased coefficients and the general solution for this problem is to: (1) identify the violations, and (2) interact the violations with the natural log of time.
test.ph <- cox.zph(cox.model) # non significant relationship is ideal
test.ph
## chisq df p
## dev 3.94640 1 0.0470
## gro 2.81261 1 0.0935
## presi 1.55204 1 0.2128
## mixed 0.45430 1 0.5003
## majgovm 0.00152 1 0.9689
## ethnic 8.54187 1 0.0035
## ope 1.06680 1 0.3017
## urb 0.30112 1 0.5832
## prereg_ind 0.93000 1 0.3349
## prereg_mil 0.11179 1 0.7381
## regiondem 3.77238 1 0.0521
## postcw 1.95969 1 0.1615
## impose 0.83260 1 0.3615
## GLOBAL 25.04080 13 0.0228
ggcoxzph(test.ph)
We can also use a stepwise algorithm to help us find the best fitting model if we wanted to completely discard the theory and qualitative nature behind Maeda’s model selection.
df2 <- na.omit(df) # a little heavy-handed, but stepAIC will not work with NAs
cox.model <- coxph(Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~ dev + gro
+ presi + mixed + majgovm + ethnic + ope + urb + prereg_ind
+ prereg_mil + regiondem + postcw + impose
+ cluster(country), method = 'efron', robust = TRUE, data=df2)
stepwise2 <- stepAIC(cox.model, direction = 'backward')
## Start: AIC=289.48
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~
## dev + gro + presi + mixed + majgovm + ethnic + ope + urb +
## prereg_ind + prereg_mil + regiondem + postcw + impose
##
## Df AIC
## - postcw 1 287.51
## - mixed 1 287.52
## - impose 1 287.54
## - prereg_mil 1 287.57
## - prereg_ind 1 287.58
## - presi 1 288.80
## - urb 1 288.80
## - regiondem 1 288.80
## - majgovm 1 288.90
## - ethnic 1 288.96
## <none> 289.49
## - ope 1 291.25
## - gro 1 292.46
## - dev 1 295.35
##
## Step: AIC=287.51
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~
## dev + gro + presi + mixed + majgovm + ethnic + ope + urb +
## prereg_ind + prereg_mil + regiondem + impose
##
## Df AIC
## - mixed 1 285.55
## - impose 1 285.57
## - prereg_ind 1 285.58
## - prereg_mil 1 285.60
## - urb 1 286.80
## - presi 1 286.85
## - majgovm 1 286.91
## - ethnic 1 287.10
## - regiondem 1 287.46
## <none> 287.51
## - ope 1 290.02
## - gro 1 290.52
## - dev 1 293.65
##
## Step: AIC=285.55
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~
## dev + gro + presi + majgovm + ethnic + ope + urb + prereg_ind +
## prereg_mil + regiondem + impose
##
## Df AIC
## - prereg_ind 1 283.60
## - impose 1 283.62
## - prereg_mil 1 283.62
## - urb 1 284.86
## - majgovm 1 284.95
## - ethnic 1 285.33
## <none> 285.55
## - presi 1 285.87
## - regiondem 1 285.88
## - ope 1 288.44
## - gro 1 288.54
## - dev 1 291.66
##
## Step: AIC=283.6
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~
## dev + gro + presi + majgovm + ethnic + ope + urb + prereg_mil +
## regiondem + impose
##
## Df AIC
## - impose 1 281.64
## - prereg_mil 1 281.73
## - urb 1 282.87
## - majgovm 1 283.01
## - ethnic 1 283.44
## <none> 283.60
## - regiondem 1 283.88
## - presi 1 284.35
## - gro 1 286.56
## - ope 1 286.86
## - dev 1 289.68
##
## Step: AIC=281.64
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~
## dev + gro + presi + majgovm + ethnic + ope + urb + prereg_mil +
## regiondem
##
## Df AIC
## - prereg_mil 1 279.75
## - urb 1 280.92
## - majgovm 1 281.08
## - ethnic 1 281.45
## <none> 281.64
## - regiondem 1 282.12
## - presi 1 282.36
## - gro 1 284.56
## - ope 1 284.88
## - dev 1 287.72
##
## Step: AIC=279.75
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~
## dev + gro + presi + majgovm + ethnic + ope + urb + regiondem
##
## Df AIC
## - urb 1 278.93
## - majgovm 1 279.17
## - ethnic 1 279.46
## <none> 279.75
## - regiondem 1 280.31
## - presi 1 280.61
## - gro 1 282.72
## - ope 1 283.20
## - dev 1 285.91
##
## Step: AIC=278.93
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~
## dev + gro + presi + majgovm + ethnic + ope + regiondem
##
## Df AIC
## <none> 278.93
## - presi 1 278.94
## - ethnic 1 279.23
## - majgovm 1 279.24
## - regiondem 1 280.45
## - ope 1 281.73
## - gro 1 281.74
## - dev 1 304.10
stepwise2$anova
## Stepwise Model Path
## Analysis of Deviance Table
##
## Initial Model:
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~
## dev + gro + presi + mixed + majgovm + ethnic + ope + urb +
## prereg_ind + prereg_mil + regiondem + postcw + impose
##
## Final Model:
## Surv(time = df2$start_time, time2 = df2$end_time, event = df2$demend2) ~
## dev + gro + presi + majgovm + ethnic + ope + regiondem
##
##
## Step Df Deviance Resid. Df Resid. Dev AIC
## 1 25 263.4849 289.4849
## 2 - postcw 1 0.02542789 26 263.5104 287.5104
## 3 - mixed 1 0.04260455 27 263.5530 285.5530
## 4 - prereg_ind 1 0.04331389 28 263.5963 283.5963
## 5 - impose 1 0.04159345 29 263.6379 281.6379
## 6 - prereg_mil 1 0.10816804 30 263.7460 279.7460
## 7 - urb 1 1.18651517 31 264.9325 278.9325
Kropko, Jonathan. PLAD 8320: Generalized Linear Models. University of Virginia.
Maeda, Ko. “Two modes of democratic breakdown: A competing risks analysis of democratic durability.” The Journal of Politics 72, no. 4 (2010): 1129-1143.