Time Series Analysis

Andrew Fogarty

8/17/2019

# load necessary packages
library(dplyr)
library(plotly)
library(urca)
library(vars)
library(tseries)
set.seed(123)

1 Introduction

In this analysis, we replicate portions of an article by Kwon (2015): Does Radical Partisan Politics Affect National Income Distributions? Congressional Polarization and Income Inequality in the United States, 1913–2008. In his research, Kwon investigates the determinants of political polarization and its impact on income inequality. Kwon’s research is interesting because it offers multiple long-run time series variables and theorized explanations for us to investigate.

2 Data

# load data
df <- read.csv('https://raw.githubusercontent.com/afogarty85/replications/master/Kwon/kwon.csv')

We can see that we have a case-year time series dataset, with only a few NAs, spanning the years 1913-2008.

summary(df)
##       year           time       toptenpercent   topfivepercent  toponepercent  
##  Min.   :1913   Min.   : 1.00   Min.   :31.38   Min.   :20.37   Min.   : 7.74  
##  1st Qu.:1937   1st Qu.:24.75   1st Qu.:32.29   1st Qu.:21.14   1st Qu.: 8.71  
##  Median :1960   Median :48.50   Median :37.29   Median :25.77   Median :12.84  
##  Mean   :1960   Mean   :48.50   Mean   :37.54   Mean   :26.12   Mean   :12.62  
##  3rd Qu.:1984   3rd Qu.:72.25   3rd Qu.:42.96   3rd Qu.:30.94   3rd Qu.:15.87  
##  Max.   :2008   Max.   :96.00   Max.   :46.30   Max.   :34.77   Max.   :19.60  
##                                 NA's   :4       NA's   :4                      
##  toppointonepercent     gdppc          labinag           secedu     
##  Min.   :1.890      Min.   : 4777   Min.   : 1.400   Min.   :21.23  
##  1st Qu.:2.342      1st Qu.: 6575   1st Qu.: 3.205   1st Qu.:65.69  
##  Median :4.710      Median :11807   Median : 8.250   Median :88.62  
##  Mean   :4.555      Mean   :14351   Mean   :11.817   Mean   :76.55  
##  3rd Qu.:6.162      3rd Qu.:20271   3rd Qu.:21.900   3rd Qu.:92.41  
##  Max.   :9.870      Max.   :31357   Max.   :30.200   Max.   :97.21  
##                                                                     
##  Manufacturing    femalelabor     importsbygdp      housepolar    
##  Min.   :19.90   Min.   :22.11   Min.   : 1.760   Min.   :0.4210  
##  1st Qu.:29.27   1st Qu.:25.09   1st Qu.: 2.953   1st Qu.:0.4730  
##  Median :37.60   Median :34.77   Median : 4.715   Median :0.5410  
##  Mean   :34.62   Mean   :38.43   Mean   : 6.712   Mean   :0.6196  
##  3rd Qu.:39.80   3rd Qu.:53.42   3rd Qu.:10.700   3rd Qu.:0.7762  
##  Max.   :41.10   Max.   :59.50   Max.   :17.880   Max.   :0.9820  
##                                                                   
##   senatepolar    
##  Min.   :0.3000  
##  1st Qu.:0.4547  
##  Median :0.5640  
##  Mean   :0.5488  
##  3rd Qu.:0.6640  
##  Max.   :0.7870  
## 

The NA values are from 1913-1916, and since there is little we can do about that, we will remove those rows.

df <- na.omit(df)
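As a quick sanity check (ours, not part of the original replication), we can confirm the drop worked: no NAs should remain, and the sample should now begin in 1917.

# confirm the drop: no NAs remain and the sample now starts in 1917
sum(is.na(df))  # expect 0
range(df$year)  # expect 1917 2008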

In his article, Kwon notes that political polarization and income inequality are reaching new highs in the United States. To make his case, he produces two graphs: one showing changes in political polarization over time and one showing changes in income inequality over time. We replicate his line graphs as follows:

# polarization
p1 <- plot_ly(data = df, type = 'scatter', mode = 'lines') %>%
  add_trace(x =~year, y=~housepolar, name = 'House Polarization') %>%
  add_trace(x =~year, y=~senatepolar, name = 'Senate Polarization')
# inequality
p2 <- plot_ly(data = df, type = 'scatter', mode = 'lines') %>%
  add_trace(x =~year, y=~toponepercent, name = 'Top 1%') %>%
  add_trace(x =~year, y=~toppointonepercent, name = 'Top 0.1%')
subplot(p1, p2) %>% 
  layout(title = 'Congressional Polarization and Income Shares of Top Earners in the U.S., 1913-2008',
         font = list(size = 10))

3 Understanding Autocorrelation

When we have time series variables, the general way to think about our data’s time dependence is in terms of autocorrelation. A time series is autocorrelated if it is correlated with its own lags. This means that the value of the series at a given time point depends on its value at previous time points.

By graphing our time series variables, we can visually inspect, but not definitively prove, whether or not our data is autocorrelated. We can think about two broad terms when considering autocorrelation: stationary and non-stationary.

AR, MA, and ARMA are all (stationary) processes in which shocks are eventually forgotten and the series returns to its long-run mean.

3.1 Understanding Autocorrelation: Autoregressive AR(p) processes

All four series seem to follow a pattern whereby they peak before 1930 and after 1990 and are lower in the years between. Lending credence to our suspicion, Kwon notes in a footnote that his Dickey-Fuller tests found that all variables were non-stationary when analyzed.

Before following Kwon’s lead, let’s first examine our time series data and see if we can find an AR(p) process ongoing within the data, where \(p\) = the number of lagged dependent variables we include. The model for this process is the following: \(y_{t} = \text{c} + \delta{y_{t-1}} + \epsilon_{t}\). \(\delta\) here is our measure of memory such that: if \(\delta = 0\), the series has no memory and is simply white noise around \(c\); if \(0 < |\delta| < 1\), shocks decay over time and the series is stationary; and if \(\delta = 1\), shocks never decay and the series is a non-stationary random walk.
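To build intuition, here is a short simulation (ours, not Kwon’s) contrasting a stationary AR(1), where shocks decay, with a random walk (\(\delta = 1\)), where shocks accumulate forever.

# simulate 200 observations of a stationary AR(1) with delta = 0.8
ar_stationary <- arima.sim(model = list(ar = 0.8), n = 200)
# simulate a random walk (delta = 1); arima.sim requires |ar| < 1, so build it directly
random_walk <- cumsum(rnorm(200))
par(mfrow = c(1, 2))
plot.ts(ar_stationary, main = "AR(1), delta = 0.8")
plot.ts(random_walk, main = "Random walk, delta = 1")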

3.2 Understanding Autocorrelation: Moving average MA(q) processes

We can also examine our time series data for moving average processes, in which \(Y_{t}\) is a rolling average of the last \(q\) shocks; \(q\) determines the number of lagged error terms we include. The model for this process is the following:

\[\begin{equation} y_{t} = c + \epsilon_{t},\\ \epsilon_{t} = \gamma{\epsilon_{t-1}} + v_{t} \end{equation}\]
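As with the AR case, a short simulation (not in the original) makes the MA(1) pattern concrete: each shock affects the series for exactly one additional period and is then forgotten.

# simulate an MA(1) process with gamma = 0.5
ma1 <- arima.sim(model = list(ma = 0.5), n = 200)
acf(ma1)   # expect one significant spike, then a cutoff
pacf(ma1)  # expect exponential decay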

3.3 Combining ARMA

Given data where AR and MA processes both exist, we combine our two components above to produce the ARMA model as follows:

\[\begin{equation} y_{t} = c + \delta{y_{t-1}} + \epsilon_{t},\\ \epsilon_{t} = \gamma{\epsilon_{t-1}} + v_{t} \end{equation}\]

4 Autocorrelation Analysis - Dependent Variable Lags

Now that we have a few of the details out of the way, what we are trying to do here is to find the correct ARMA model that fits our dependent variable \(Y_{t}\), such that it removes the time trends so we can see the true relationships between our variables. A good ARMA model produces errors that are white noise, meaning \(Y\) has a mean \(c\) and any departures from it appear random.
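To make the white-noise criterion concrete, here is a minimal sketch (ours, not from the original) that fits a candidate AR(1) to housepolar and applies a Ljung-Box test to its residuals; a large p-value is consistent with white-noise errors.

# fit a candidate AR(1) -- order = c(p, d, q) -- and test the residuals
fit <- arima(df$housepolar, order = c(1, 0, 0))
Box.test(residuals(fit), lag = 10, type = "Ljung-Box")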

To choose the right number of lags on the dependent variable, we do a bit of guessing and checking. Again, we are setting aside the fact that we probably do not have stationary time series here, but the exercise is worthwhile, as we will see.

We begin by plotting the autocorrelation function, acf, of one of our time series variables. The acf shows the correlation between \(Y_{t}\) and some lag \(Y_{t-p}\). AR processes show persistent autocorrelation across lags, but the autocorrelation decays exponentially.

4.1 Autocorrelation Analysis - Dependent Variable: housepolar

acf(df$housepolar)

The pacf shows the remaining partial autocorrelation after controlling for all closer lags. For an AR(1) process (one lagged dependent variable), only the first partial autocorrelation should be different from 0.

pacf(df$housepolar)

What we see here visually follows the autocorrelation expectation for AR(p) processes as shown in the table below:

Lags   ACF                             PACF
1      Exponential decay               1 significant lag
2      Slower exponential decay        2 significant lags
3      Even slower exponential decay   3 significant lags

For comparative purposes, the expectation for MA(q) processes is the opposite, as seen in the table below:

Lags   ACF                  PACF
1      1 significant lag    Exponential decay
2      2 significant lags   Exponential decay
3      3 significant lags   Exponential decay

The acf for housepolar shows exponential decay while the pacf for housepolar shows one significant value, suggesting an AR(1) process.

With the details out of the way, we can proceed with examining the other time series dependent variables quickly.

4.2 Autocorrelation Analysis - Dependent Variable: senatepolar

acf(df$senatepolar)

pacf(df$senatepolar)

The acf for senatepolar shows exponential decay while the pacf for senatepolar shows one significant value, suggesting an AR(1) process.

4.3 Autocorrelation Analysis - Dependent Variable: toponepercent

acf(df$toponepercent)

pacf(df$toponepercent)

The acf for toponepercent shows exponential decay while the pacf for toponepercent shows one significant value, suggesting an AR(1) process.

4.4 Autocorrelation Analysis - Dependent Variable: toppointonepercent

acf(df$toppointonepercent)

pacf(df$toppointonepercent)

The acf for toppointonepercent shows exponential decay while the pacf for toppointonepercent shows one significant value, suggesting an AR(1) process.

4.5 Autocorrelation Analysis - Integration

Looping back to where we were a bit earlier, we know from reading Kwon’s article that his time series variables are non-stationary, meaning we cannot directly use ARMA, VAR, or Granger tests. We cannot use ARMA because a non-stationary variable never forgets a shock and thus never returns to its true mean. We cannot use VAR or Granger tests because we cannot separate the effect of X on Y, or Y on X, from what both variables would have done absent the shock.
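The spurious-regression problem is easy to demonstrate with a quick simulation (not part of the replication): two independent random walks will often appear strongly correlated even though neither has any effect on the other.

# two independent random walks; any correlation between them is spurious
rw1 <- cumsum(rnorm(100))
rw2 <- cumsum(rnorm(100))
cor(rw1, rw2)              # often large in magnitude despite independence
cor(diff(rw1), diff(rw2))  # differencing removes the spurious relationship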

Once we realize we have non-stationary time series, the general approach is to take the first (or, if needed, a higher-order) difference of the non-stationary (integrated) variable. Let’s test for integration.

The ur.df command from urca tests for integration. Our call includes trend and drift terms, allows for up to 5 lags, and lets AIC select the best number of lags.

summary(ur.df(df$housepolar, type="trend", lags=5, selectlags="AIC"))
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.037137 -0.007257 -0.000763  0.006737  0.054079 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.0010328  0.0077372  -0.133 0.894150    
## z.lag.1     -0.0402363  0.0125576  -3.204 0.001954 ** 
## tt           0.0005481  0.0001313   4.175 7.61e-05 ***
## z.diff.lag1 -0.1820656  0.0994862  -1.830 0.071013 .  
## z.diff.lag2  0.1442395  0.0980404   1.471 0.145203    
## z.diff.lag3 -0.1111293  0.0971973  -1.143 0.256351    
## z.diff.lag4  0.3475601  0.0974527   3.566 0.000618 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.01583 on 79 degrees of freedom
## Multiple R-squared:  0.5907, Adjusted R-squared:  0.5596 
## F-statistic:    19 on 6 and 79 DF,  p-value: 1.439e-13
## 
## 
## Value of test-statistic is: -3.2041 6.8637 10.2256 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -4.04 -3.45 -3.15
## phi2  6.50  4.88  4.16
## phi3  8.73  6.49  5.47

The test-statistic is -3.20 while the 5% critical value is -3.45. Since our test-statistic is not less than the critical value, we cannot reject the presence of a unit root.

-3.20 < -3.45
## [1] FALSE
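Rather than hard-coding the comparison, we can pull the statistic and critical value from the slots of the object that ur.df returns; the helper below is our own sketch, not part of Kwon’s replication.

# TRUE if the tau3 statistic falls below its 5% critical value,
# i.e. if we can reject the unit-root null
adf_rejects_unit_root <- function(series) {
  test <- ur.df(series, type = "trend", lags = 5, selectlags = "AIC")
  unname(test@teststat[1, "tau3"] < test@cval["tau3", "5pct"])
}
adf_rejects_unit_root(df$housepolar)  # FALSE: cannot reject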
summary(ur.df(df$senatepolar, type="trend", lags=5, selectlags="AIC"))
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.070903 -0.008059 -0.002153  0.006828  0.067572 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.0161525  0.0094536   1.709 0.091449 .  
## z.lag.1     -0.0833662  0.0225124  -3.703 0.000393 ***
## tt           0.0005975  0.0001475   4.050 0.000119 ***
## z.diff.lag1 -0.1049644  0.1009902  -1.039 0.301812    
## z.diff.lag2  0.1390117  0.1016012   1.368 0.175125    
## z.diff.lag3 -0.0657459  0.1002743  -0.656 0.513947    
## z.diff.lag4  0.2574454  0.1003665   2.565 0.012209 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02154 on 79 degrees of freedom
## Multiple R-squared:  0.3607, Adjusted R-squared:  0.3122 
## F-statistic:  7.43 on 6 and 79 DF,  p-value: 2.52e-06
## 
## 
## Value of test-statistic is: -3.7031 6.1855 9.2422 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -4.04 -3.45 -3.15
## phi2  6.50  4.88  4.16
## phi3  8.73  6.49  5.47

The test-statistic is -3.70 while the 5% critical value is -3.45. Since our test-statistic is less than the 5% critical value, we can reject a unit root at the 5% level, though not at the 1% level (-4.04). Given this marginal evidence, and following Kwon, we continue to treat senatepolar as non-stationary.

-3.7031 < -3.45
## [1] TRUE
summary(ur.df(df$toponepercent, type="trend", lags=5, selectlags="AIC"))
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.7896 -0.3938 -0.0660  0.3829  2.2788 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.073955   0.388521  -0.190    0.850
## z.lag.1     -0.016900   0.024444  -0.691    0.491
## tt           0.005956   0.003607   1.651    0.103
## z.diff.lag   0.182883   0.111105   1.646    0.104
## 
## Residual standard error: 0.7796 on 82 degrees of freedom
## Multiple R-squared:  0.09113,    Adjusted R-squared:  0.05788 
## F-statistic: 2.741 on 3 and 82 DF,  p-value: 0.0485
## 
## 
## Value of test-statistic is: -0.6914 1.3523 2.0227 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -4.04 -3.45 -3.15
## phi2  6.50  4.88  4.16
## phi3  8.73  6.49  5.47

The test-statistic is -0.6914 while the 5% critical value is -3.45. Since our test-statistic is not less than the critical value, we cannot reject the presence of a unit root.

-0.6914 < -3.45
## [1] FALSE
summary(ur.df(df$toppointonepercent, type="trend", lags=5, selectlags="AIC"))
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.04497 -0.16714 -0.05504  0.17805  1.23009 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -0.089863   0.157090  -0.572   0.5689  
## z.lag.1     -0.017314   0.024913  -0.695   0.4890  
## tt           0.003626   0.001972   1.838   0.0696 .
## z.diff.lag   0.170845   0.111738   1.529   0.1301  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4358 on 82 degrees of freedom
## Multiple R-squared:  0.08969,    Adjusted R-squared:  0.05639 
## F-statistic: 2.693 on 3 and 82 DF,  p-value: 0.05143
## 
## 
## Value of test-statistic is: -0.695 1.4541 2.1469 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -4.04 -3.45 -3.15
## phi2  6.50  4.88  4.16
## phi3  8.73  6.49  5.47

The test-statistic is -0.695 while the 5% critical value is -3.45. Since our test-statistic is not less than the critical value, we cannot reject the presence of a unit root.

-0.695 < -3.45
## [1] FALSE

4.6 Autocorrelation Analysis - Remove Integration

Kwon notes in his article that, in order to remove integration, he takes the first difference of each variable after transforming it by the natural log. Here, we follow his lead and check whether or not a unit root still exists.

# transform variables
df_transformed <- data.frame(year = df$year[-1], 
                             housepolar = diff(log(df$housepolar)),
                             senatepolar=diff(log(df$senatepolar)),
                             toponepercent=diff(log(df$toponepercent)),
                             toppointonepercent=diff(log(df$toppointonepercent)))
summary(ur.df(df_transformed$housepolar, type="trend", lags=5, selectlags="AIC"))
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.079681 -0.012152 -0.000658  0.007332  0.078211 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.0187661  0.0103268  -1.817  0.07297 .  
## z.lag.1     -0.6455027  0.2244172  -2.876  0.00517 ** 
## tt           0.0004397  0.0002114   2.079  0.04082 *  
## z.diff.lag1 -0.4504694  0.1865352  -2.415  0.01805 *  
## z.diff.lag2 -0.3467573  0.1526704  -2.271  0.02585 *  
## z.diff.lag3 -0.4386704  0.0989781  -4.432 2.97e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0296 on 79 degrees of freedom
## Multiple R-squared:  0.7296, Adjusted R-squared:  0.7125 
## F-statistic: 42.63 on 5 and 79 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic is: -2.8764 2.8252 4.1788 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -4.04 -3.45 -3.15
## phi2  6.50  4.88  4.16
## phi3  8.73  6.49  5.47

The test-statistic is -2.87 while the 5% critical value is -3.45. Since our test-statistic is not less than the critical value, we still cannot reject the presence of a unit root.

-2.8764 < -3.45
## [1] FALSE
summary(ur.df(df_transformed$senatepolar, type="trend", lags=5, selectlags="AIC"))
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.177213 -0.007777 -0.002078  0.005970  0.188282 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)   
## (Intercept) -0.0104903  0.0123884  -0.847  0.39967   
## z.lag.1     -0.5274828  0.1847101  -2.856  0.00548 **
## tt           0.0002637  0.0002370   1.113  0.26921   
## z.diff.lag1 -0.4986855  0.1682364  -2.964  0.00401 **
## z.diff.lag2 -0.2508139  0.1534500  -1.634  0.10613   
## z.diff.lag3 -0.2766557  0.1064506  -2.599  0.01116 * 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04687 on 79 degrees of freedom
## Multiple R-squared:  0.6342, Adjusted R-squared:  0.611 
## F-statistic: 27.39 on 5 and 79 DF,  p-value: 5.671e-16
## 
## 
## Value of test-statistic is: -2.8557 2.7883 4.1265 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -4.04 -3.45 -3.15
## phi2  6.50  4.88  4.16
## phi3  8.73  6.49  5.47

The test-statistic is -2.85 while the 5% critical value is -3.45. Since our test-statistic is not less than the critical value, we still cannot reject the presence of a unit root.

-2.8557 < -3.45
## [1] FALSE
summary(ur.df(df_transformed$toponepercent, type="trend", lags=5, selectlags="AIC"))
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.129027 -0.035057 -0.001507  0.031757  0.156936 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.0215842  0.0141088  -1.530   0.1300    
## z.lag.1     -0.8863569  0.1412722  -6.274 1.63e-08 ***
## tt           0.0004876  0.0002662   1.832   0.0707 .  
## z.diff.lag   0.0943206  0.1090943   0.865   0.3898    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.05659 on 81 degrees of freedom
## Multiple R-squared:  0.4099, Adjusted R-squared:  0.388 
## F-statistic: 18.75 on 3 and 81 DF,  p-value: 2.499e-09
## 
## 
## Value of test-statistic is: -6.2741 13.1587 19.7356 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -4.04 -3.45 -3.15
## phi2  6.50  4.88  4.16
## phi3  8.73  6.49  5.47

The test-statistic is -6.27 while the 5% critical value is -3.45. Since our test-statistic is less than the critical value, we can reject the presence of a unit root.

-6.2741 < -3.45
## [1] TRUE
summary(ur.df(df_transformed$toppointonepercent, type="trend", lags=5, selectlags="AIC"))
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.170419 -0.037480 -0.002025  0.038548  0.259664 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.0345148  0.0215122  -1.604   0.1125    
## z.lag.1     -0.8809930  0.1422115  -6.195 2.29e-08 ***
## tt           0.0007966  0.0004077   1.954   0.0542 .  
## z.diff.lag   0.0799741  0.1091563   0.733   0.4659    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.086 on 81 degrees of freedom
## Multiple R-squared:  0.4097, Adjusted R-squared:  0.3878 
## F-statistic: 18.74 on 3 and 81 DF,  p-value: 2.525e-09
## 
## 
## Value of test-statistic is: -6.1949 12.823 19.2335 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -4.04 -3.45 -3.15
## phi2  6.50  4.88  4.16
## phi3  8.73  6.49  5.47

The test-statistic is -6.19 while the 5% critical value is -3.45. Since our test-statistic is less than the critical value, we can reject the presence of a unit root.

-6.1949 < -3.45
## [1] TRUE

5 Analysis - Vector Autoregression (VAR)

VAR is a model of cross-correlations, which are correlations of one variable with the lags of another. A VAR shows us first whether an effect from one variable to the other exists, and then how this effect works through the affected variable over time before it returns to its long-run mean.
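For intuition, a bivariate VAR(1) in two variables \(y\) and \(x\) simply stacks two regressions, each including one lag of both variables:

\[\begin{equation} y_{t} = c_{1} + a_{11}{y_{t-1}} + a_{12}{x_{t-1}} + e_{1t},\\ x_{t} = c_{2} + a_{21}{y_{t-1}} + a_{22}{x_{t-1}} + e_{2t} \end{equation}\]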

We can view cross-correlations by using the ccf function. While not definitive, it is exploratory in the sense that we can look for patterns in the correlations. If there were no pattern, we could probably rule out temporal dependence between the two variables. The graph below suggests a temporal relationship exists between congressional house polarization and the income share of the top one percent.

ccf(df$toponepercent, df$housepolar)

Granger models ask whether or not the recent history of \(X\) affects the current value of \(Y\) beyond what is expected from the recent history of \(Y\). The null hypothesis is that none of the lags of \(X\) have an effect on current \(Y\). If we can reject that null, then we can say that \(X\) Granger-causes \(Y\). However, we must remember that we might still have omitted variable bias, that our ARMA model is wrong, or that we have drawn an extraordinary sample.

A VAR is an extension of Granger causality because it allows us to incorporate many time series variables to better assess causality and address omitted variable bias.

VARs offer a nice time series analogue to a marginal effect, which is the impulse response function (IRF). An IRF traces the effect of a one-unit shock (impulse) to one variable at time \(t\) on the expected value of another variable in each subsequent period.
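Formally (our notation, not Kwon’s), the response at horizon \(s\) is the change in the expected future value of \(y\) caused by a one-unit shock to \(x\)’s error term today:

\[\begin{equation} \text{IRF}(s) = \frac{\partial\, E[y_{t+s}]}{\partial \epsilon_{x,t}} \end{equation}\]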

Kwon then incorporates four VARs to analyze the association between each independent variable’s past values and the dependent variable. Before replicating his models, we will first use VARselect to help determine the number of lags we should include in our model. We begin by selecting two of the variables used in the models and then conclude by running VARselect.

df1 <- dplyr::select(df, housepolar, toponepercent)
VARselect(df1) # 5 lags
## $selection
## AIC(n)  HQ(n)  SC(n) FPE(n) 
##      5      5      3      5 
## 
## $criteria
##                    1             2             3             4             5
## AIC(n) -8.2142380121 -8.2511961240 -8.4797238873 -8.4267094298 -8.6043950384
## HQ(n)  -8.1435359910 -8.1333594221 -8.3147525046 -8.2146033664 -8.3451542943
## SC(n)  -8.0381366038 -7.9576937768 -8.0688206012 -7.8984052048 -7.9586898745
## FPE(n)  0.0002707885  0.0002610253  0.0002078091  0.0002193299  0.0001838958
##                    6             7             8             9            10
## AIC(n) -8.5922084498 -8.5522397476 -8.4880242426 -8.4392356062 -8.4197000017
## HQ(n)  -8.2858330249 -8.1987296420 -8.0873794561 -7.9914561390 -7.9247858538
## SC(n)  -7.8291023470 -7.6717327060 -7.4901162620 -7.3239266867 -7.1869901433
## FPE(n)  0.0001865495  0.0001947273  0.0002084467  0.0002199537  0.0002256828

Using VARselect, the function suggests we use 5 lags.

df2 <- dplyr::select(df, housepolar, toppointonepercent)
VARselect(df2) # 6 lags
## $selection
## AIC(n)  HQ(n)  SC(n) FPE(n) 
##      6      5      3      5 
## 
## $criteria
##                    1             2             3             4             5
## AIC(n) -9.365798e+00 -9.408495e+00 -9.639702e+00 -9.588119e+00 -9.766228e+00
## HQ(n)  -9.295096e+00 -9.290658e+00 -9.474730e+00 -9.376013e+00 -9.506987e+00
## SC(n)  -9.189696e+00 -9.114992e+00 -9.228798e+00 -9.059815e+00 -9.120523e+00
## FPE(n)  8.560794e-05  8.204919e-05  6.514672e-05  6.866004e-05  5.754322e-05
##                    6             7             8             9            10
## AIC(n) -9.767245e+00 -9.755548e+00 -9.686305e+00 -9.659928e+00 -9.681492e+00
## HQ(n)  -9.460869e+00 -9.402037e+00 -9.285660e+00 -9.212148e+00 -9.186578e+00
## SC(n)  -9.004139e+00 -8.875041e+00 -8.688397e+00 -8.544619e+00 -8.448782e+00
## FPE(n)  5.760794e-05  5.845706e-05  6.289098e-05  6.489205e-05  6.390122e-05

Using VARselect, the function suggests we use 6 lags.

df3 <- dplyr::select(df, senatepolar, toppointonepercent)
VARselect(df3) # 3 lags
## $selection
## AIC(n)  HQ(n)  SC(n) FPE(n) 
##      3      3      1      3 
## 
## $criteria
##                    1             2             3             4             5
## AIC(n) -9.0365484711 -9.1071195087 -9.1798646083 -9.1360191536 -9.1535366677
## HQ(n)  -8.9658464499 -8.9892828069 -9.0148932257 -8.9239130902 -8.8942959235
## SC(n)  -8.8604470627 -8.8136171615 -8.7689613222 -8.6077149286 -8.5078315038
## FPE(n)  0.0001189886  0.0001109072  0.0001031804  0.0001079067  0.0001061898
##                    6             7             8             9            10
## AIC(n) -9.1580195328 -9.1184422562 -9.0776256472 -9.0763777547 -9.0403627987
## HQ(n)  -8.8516441079 -8.7649321505 -8.6769808607 -8.6285982875 -8.5454486508
## SC(n)  -8.3949134300 -8.2379352145 -8.0797176666 -7.9610688352 -7.8076529404
## FPE(n)  0.0001059413  0.0001105422  0.0001155937  0.0001163119  0.0001213244

Using VARselect, the function suggests we use 3 lags.

6 Analysis - Vector Autoregression Models

Taking the AIC recommendations, we fit our first VAR like so:

var1 <- VAR(df1, type="const", p=5)
summary(var1)
## 
## VAR Estimation Results:
## ========================= 
## Endogenous variables: housepolar, toponepercent 
## Deterministic variables: const 
## Sample size: 87 
## Log Likelihood: 143.849 
## Roots of the characteristic polynomial:
## 0.9779 0.9779 0.9226 0.8966 0.7618 0.7618 0.5784 0.5784 0.3712 0.3712
## Call:
## VAR(y = df1, p = 5, type = "const")
## 
## 
## Estimation results for equation housepolar: 
## =========================================== 
## housepolar = housepolar.l1 + toponepercent.l1 + housepolar.l2 + toponepercent.l2 + housepolar.l3 + toponepercent.l3 + housepolar.l4 + toponepercent.l4 + housepolar.l5 + toponepercent.l5 + const 
## 
##                   Estimate Std. Error t value Pr(>|t|)    
## housepolar.l1     1.021751   0.099455  10.273 5.05e-16 ***
## toponepercent.l1  0.003283   0.002631   1.248 0.215872    
## housepolar.l2     0.282914   0.139347   2.030 0.045826 *  
## toponepercent.l2 -0.008057   0.003701  -2.177 0.032566 *  
## housepolar.l3    -0.323298   0.137290  -2.355 0.021113 *  
## toponepercent.l3  0.009094   0.003760   2.419 0.017971 *  
## housepolar.l4     0.500006   0.137848   3.627 0.000516 ***
## toponepercent.l4 -0.005863   0.003700  -1.585 0.117218    
## housepolar.l5    -0.503742   0.105925  -4.756 9.20e-06 ***
## toponepercent.l5  0.001577   0.002390   0.660 0.511498    
## const             0.013285   0.008420   1.578 0.118770    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 0.01709 on 76 degrees of freedom
## Multiple R-Squared: 0.9907,  Adjusted R-squared: 0.9895 
## F-statistic: 809.4 on 10 and 76 DF,  p-value: < 2.2e-16 
## 
## 
## Estimation results for equation toponepercent: 
## ============================================== 
## toponepercent = housepolar.l1 + toponepercent.l1 + housepolar.l2 + toponepercent.l2 + housepolar.l3 + toponepercent.l3 + housepolar.l4 + toponepercent.l4 + housepolar.l5 + toponepercent.l5 + const 
## 
##                   Estimate Std. Error t value Pr(>|t|)    
## housepolar.l1    -1.583265   4.374335  -0.362   0.7184    
## toponepercent.l1  1.061939   0.115716   9.177 6.12e-14 ***
## housepolar.l2     0.502417   6.128889   0.082   0.9349    
## toponepercent.l2 -0.329066   0.162767  -2.022   0.0467 *  
## housepolar.l3     1.704360   6.038447   0.282   0.7785    
## toponepercent.l3  0.110972   0.165366   0.671   0.5042    
## housepolar.l4     3.439693   6.062969   0.567   0.5722    
## toponepercent.l4  0.001501   0.162728   0.009   0.9927    
## housepolar.l5    -0.936478   4.658901  -0.201   0.8412    
## toponepercent.l5  0.044934   0.105140   0.427   0.6703    
## const            -0.491778   0.370349  -1.328   0.1882    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 0.7515 on 76 degrees of freedom
## Multiple R-Squared: 0.9626,  Adjusted R-squared: 0.9577 
## F-statistic: 195.7 on 10 and 76 DF,  p-value: < 2.2e-16 
## 
## 
## 
## Covariance matrix of residuals:
##               housepolar toponepercent
## housepolar     0.0002919     0.0005681
## toponepercent  0.0005681     0.5647591
## 
## Correlation matrix of residuals:
##               housepolar toponepercent
## housepolar       1.00000       0.04424
## toponepercent    0.04424       1.00000

While we could comb through the results to inspect the sign and significance of each coefficient, what we really want from a VAR is the ability to: (1) run Granger causality tests, and (2) run impulse response functions.

Since Kwon began his article by questioning the direction of causality between income inequality and political polarization, we can examine that relationship more closely with Granger causality tests. Kwon notes that in the literature, the assumed direction is: Income Inequality \(\rightarrow\) Political Polarization. Let’s find out from the data:

Political Polarization \(\rightarrow\) Income Inequality

causality(var1, cause="housepolar")
## $Granger
## 
##  Granger causality H0: housepolar do not Granger-cause toponepercent
## 
## data:  VAR object var1
## F-Test = 2.902, df1 = 5, df2 = 152, p-value = 0.01565
## 
## 
## $Instant
## 
##  H0: No instantaneous causality between: housepolar and toponepercent
## 
## data:  VAR object var1
## Chi-squared = 0.16994, df = 1, p-value = 0.6802

We can reject the null that house polarization does not Granger-cause the income share of the top 1% of earners in the United States. We make this judgment by looking at $Granger and noticing that the p-value (0.016) is less than our conventional 5% threshold.
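If we want the number itself rather than the printed table, the p-value can be pulled directly from the returned list (a small usage sketch):

# extract the Granger test p-value from the htest object
causality(var1, cause = "housepolar")$Granger$p.value  # about 0.016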

Income Inequality \(\rightarrow\) Political Polarization

causality(var1, cause="toponepercent")
## $Granger
## 
##  Granger causality H0: toponepercent do not Granger-cause housepolar
## 
## data:  VAR object var1
## F-Test = 1.3616, df1 = 5, df2 = 152, p-value = 0.2419
## 
## 
## $Instant
## 
##  H0: No instantaneous causality between: toponepercent and housepolar
## 
## data:  VAR object var1
## Chi-squared = 0.16994, df = 1, p-value = 0.6802

We cannot reject the null that the income share of the top 1% of earners does not Granger-cause house polarization. We make this judgment by looking at $Granger and noticing that the p-value (0.242) is greater than our conventional 5% threshold. Thus, housepolar Granger-causes toponepercent but not the reverse, which means the assumed direction Income Inequality \(\rightarrow\) Political Polarization should be revisited.
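These conclusions assume the VAR residuals are well behaved. A quick diagnostic we can add (not in the original) is the multivariate Portmanteau test from vars, where a large p-value is consistent with white-noise residuals:

# Portmanteau test for remaining serial correlation in the VAR residuals
serial.test(var1, lags.pt = 16, type = "PT.asymptotic")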

7 Analysis - Impulse Response Function

irf1 <- irf(var1, 
            n.ahead=30, # number of time periods (leads) ahead
            impulse="housepolar", # cause
            response="toponepercent", # effect
            boot = TRUE, # bootstrap
            runs = 1000) # number of runs

irf2 <- irf(var1, 
            n.ahead=30, 
            impulse="toponepercent", 
            response="housepolar",
            boot = TRUE,
            runs = 1000)
# prepare data for graphing
x_sequence <- seq(from=0, to=30) # create x-axis

irf_df1 <- data.frame(x1 = x_sequence, 
                      x2 = irf1$irf,
                      x3 = irf1$Lower,
                      x4 = irf1$Upper)

# rename columns since the names pulled from the irf object are unwieldy
colnames(irf_df1) <- c('x', 
                       'irf', 
                       'irf_low', 
                       'irf_high') 


irf_df2 <- data.frame(x1 = x_sequence, 
                      x2 = irf2$irf,
                      x3 = irf2$Lower,
                      x4 = irf2$Upper)

colnames(irf_df2) <- c('x', 
                       'irf', 
                       'irf_low', 
                       'irf_high')
x <- list(title = "Year's Ahead", zeroline = FALSE)
y <- list(title = "Marginal Change in Y at time t")

plot_ly(data = irf_df1, type = 'scatter', mode = 'lines') %>%
  add_trace(x =~x, y =~irf_high, line = list(color = 'transparent'), 
            name = 'Upper CI', showlegend = FALSE) %>%
  add_trace(x =~x, y =~irf_low, fill = 'tonexty', fillcolor='rgba(0,100,80,0.2)', 
            line = list(color = 'transparent'), name = 'Lower CI', showlegend = FALSE) %>%
  add_trace(x =~x, y =~irf, line = list(color='rgb(0,100,80)'),
            name = 'Effect') %>%
  layout(xaxis = x,
         yaxis = y)

We interpret our results in the following way: a one-unit increase in house polarization this year is associated with a 1.1 percentage point increase, on average, in the share of national income held by the top 1% of earners in the United States 25 years later. It takes about 9 years for the effect to become distinguishable from 0, and the effect persists for at least 30 years.
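The nine-year figure can be read directly off the bootstrap interval; a small sketch using the data frame built above:

# years ahead at which the lower bootstrap CI rises above zero
irf_df1$x[irf_df1$irf_low > 0]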

x <- list(title = "Year's Ahead", zeroline = FALSE)
y <- list(title = "Marginal Change in Y at time t")

plot_ly(data = irf_df2, type = 'scatter', mode = 'lines') %>%
  add_trace(x =~x, y =~irf_high, line = list(color = 'transparent'), 
            name = 'Upper CI', showlegend = FALSE) %>%
  add_trace(x =~x, y =~irf_low, fill = 'tonexty', fillcolor='rgba(0,100,80,0.2)', 
            line = list(color = 'transparent'), name = 'Lower CI', showlegend = FALSE) %>%
  add_trace(x =~x, y =~irf, line = list(color='rgb(0,100,80)'),
            name = 'Effect') %>%
  layout(xaxis = x,
         yaxis = y)

In contrast, a one percentage point increase in the income share of the top 1% this year does not produce a significant change in house polarization at any horizon.

8 Sources