9  Visualizing statistical models

A more accurate title, of course, would be “visualizing outputs from statistical models”.

We’ll be using this package to help us out:

Code
library(broom)

(And in the second section will use ggeffects to make our life even easier.)

9.1 Nationscape data

Code
# Read in Nationscape survey data:
library(tidyverse)
library(labelled)
a <- readRDS("data/nationscape/Nationscape_first10waves.rds")

9.1.1 Association between sexism and AOC favorability over time

We estimate 10 separate model, aka “rolling regressions”.

We run one model for each value of week, and create a tibble thanks to the tidy()function:

Code
a %>% filter(
         !is.na(aoc_Favorable),
         !is.na(gender_att3_by1SD)) %>%
  group_by(week) %>%
  do(broom::tidy(lm(aoc_Favorable ~
               gender_att3_by1SD +
               age + college_grad +
               White + Black + Hispanic, data = .))) %>%
    # Which coefficient we wish to pull:
    filter(term == "gender_att3_by1SD") %>%
    ggplot(aes(x=week,y=estimate,
             ymax = estimate + 1.96*std.error,
             ymin = estimate - 1.96*std.error)) +
  geom_pointrange(position = position_dodge(width = .45), size=.6) + labs(x="",y="Coefficient estimate",
  color="Subset of respondents") +
  ggtitle("Association between sexism and AOC favorability") +
  geom_hline(yintercept=0, linetype=2) 

It seems that the correlation between sexism and (lower) favorability of AOC is quite stable.

Code
aoc_Favorable <- a %>% filter(
         !is.na(aoc_Favorable),
         !is.na(gender_att3_by1SD)) %>%
  group_by(pid3) %>%
  do(broom::tidy(lm(aoc_Favorable ~
               gender_att3_by1SD +
               age + college_grad +
               White + Black + Hispanic, data = .)))

9.1.2 Demographics

We can now plot any set of coefficients, of course.

Let us try to see whether age correlates negatively with AOC’s favorability, as we might perhaps suspect:

Code
aoc_Favorable %>% filter(pid3 <= 3 & term == "age") %>%
  ggplot(aes(x=as_factor(pid3),y=estimate,
             ymax = estimate + 1.96*std.error,
             ymin = estimate - 1.96*std.error,
             shape = factor(pid3))) +
  geom_pointrange() + labs(x="",y="Coefficient estimate",color="")

Let’s improve the design a little bit:

Code
aoc_Favorable %>% filter(pid3 <= 3,term == "age") %>%
  ggplot(aes(x=as_factor(pid3),y=estimate,
             ymax = estimate + 1.96*std.error,
             ymin = estimate - 1.96*std.error,color=as_factor(pid3))) +
  geom_pointrange(position = position_dodge(width = .45), size=.6) + labs(x="Model estimate among respondents who identified as a...",y="Coefficient estimate",
  color="Subset of respondents") +
  ggtitle("Association between age and AOC favorability") +
  geom_hline(yintercept=0, linetype=2) 

Need to fix the party colors:

Code
aoc_Favorable %>% filter(pid3 <= 3,term == "age") %>%
  ggplot(aes(x=as_factor(pid3),y=estimate,
             ymax = estimate + 1.96*std.error,
             ymin = estimate - 1.96*std.error,color=as_factor(pid3))) +
  geom_pointrange(position = position_dodge(width = .45), size=.6) + labs(x="Model estimate among respondents who identified as a...",y="Coefficient estimate",
  color="Subset of respondents") +
  ggtitle("Association between age and AOC favorability") +
  geom_hline(yintercept=0, linetype=2) +
  scale_color_manual(values = c("blue3","darkred","grey40"))

9.2 ggeffects if your friend

9.2.1 AJPS (2021) data

Code
library(tidyverse)
library(haven)
library(labelled)

source("data/ajps2021/0_ajps_recode.R")

We now turn again to the dataset posted by Uscinski et al. (2021). We now turn again to the dataset posted by Uscinski et al. (2021).
The code below estimates four regression models to examine how party identification (PID) interacts with populism and conspiracy thinking in predicting attitudes about climate and Trump favorability.

First, we fit models using the original 7-point party ID scale:

Code
int1 <- lm(cexaggerate ~ pid*pop_Index, data=d3)
int2 <- lm(trumpft ~ pid*pop_Index, data=d3)
int3 <- lm(trumpft ~ pid*consp_Index, data=d3)
int4 <- lm(trumpft ~ pid*goodevil + consp_Index + pop_Index, data=d3)

We use ggeffects to plot predicted values from the first model, showing how the effect of populism on the belief that climate change is exaggerated varies across the 7-point party-ID scale:

Code
ggeffects::ggeffect(int1, terms=c("pop_Index","pid")) %>% plot() +
  labs(y="Climate change is exaggerated",
       color="PID (7=Strong Republican)",
       x="Populism",
       title = "Predicted beliefs (7-point PID scale)")

To make the visualization clearer, we can also recode the 7-point party ID scale into three categories:

  • Values 1-3: Democrats
  • Value 4: Independents
  • Values 5-7: Republicans
Code
# Recode PID into 3 categories
d3 <- d3 %>%
  mutate(pid3cat = case_when(
    pid >= 1 & pid <= 3 ~ "Democrat",
    pid == 4 ~ "Independent",
    pid >= 5 & pid <= 7 ~ "Republican"
  ) %>%
  factor(levels = c("Democrat", "Independent", "Republican")))

int1_3cat <- lm(cexaggerate ~ pid3cat*pop_Index, data=d3)
int2_3cat <- lm(trumpft ~ pid3cat*pop_Index, data=d3)
int3_3cat <- lm(trumpft ~ pid3cat*consp_Index, data=d3)
int4_3cat <- lm(trumpft ~ pid3cat*goodevil + consp_Index + pop_Index, data=d3)

Now we plot the same relationship using the 3-category party ID variable:

Code
ggeffects::ggeffect(int1_3cat, terms=c("pop_Index","pid3cat")) %>% plot() +
  labs(y="Climate change is exaggerated",
       color="Party ID",
       x="Populism",
       title = "Predicted beliefs (3-category PID)")

Customize the colors of the 3 lines: Democrats = blue, Independents = gray, Republicans = red

Code
ggeffects::ggeffect(int1_3cat, terms=c("pop_Index","pid3cat")) %>%
  plot() +
  scale_color_manual(values = c(
    "Democrat" = "#0072B2",     # blue
    "Independent" = "gray40",   # gray
    "Republican" = "#D55E00"    # red
  )) +
  labs(y="Climate change is exaggerated",
       color="Party ID",
       x="Populism",
       title = "Predicted beliefs (3-category PID, custom colors)")

To ensure that both the lines and the confidence interval ribbons convey party ID with the same custom colors, we should also specify scale_fill_manual() in addition to scale_color_manual(). The fill controls the color of the confidence interval bands, which default to a lighter version of the respective line color. By matching the color and fill, the plot becomes more consistent and clear.

Here’s how to do it:

Code
ggeffects::ggeffect(int1_3cat, terms=c("pop_Index","pid3cat")) %>%
  plot() +
  scale_color_manual(values = c(
    "Democrat" = "#0072B2",     # blue
    "Independent" = "gray40",   # gray
    "Republican" = "#FF0000"    # red
  )) +
  scale_fill_manual(values = c(
    "Democrat" = "#0072B2",     
    "Independent" = "gray70",   
    "Republican" = "#FF0000"
  )) +
  labs(y="Climate change is exaggerated",
       color="Party ID",
       fill="Party ID",
       x="Populism",
       title = "Predicted beliefs",
       subtitle = "3-category PID, custom colors including CI ribbons")