4 Standard charts

4.1 Who should hold more power?

When we map rows (individual observations) to numbers, a bar chart can be a good choice.

Consider this dataset, a simple table from this research paper.

library(tidyverse)

hib <- read_csv("https://raw.githubusercontent.com/zilinskyjan/datasets/master/politics/hibbing2023_who_should_govern.csv")

The survey data has already been aggregated, so you can think of these cell entries as toplines. The rows are actors (groups) and the columns are the percentage of respondents who think that the influence of each group should increase or decrease:

head(hib)

# A tibble: 6 × 5
  group                            influence_incr influence_decr political type 
  <chr>                                     <dbl>          <dbl>     <dbl> <chr>
1 People who experienced real pro…           59.9            3.5         0 Peop…
2 Politicians who can’t beneﬁt th…           49.4           11.6         0 Peop…
3 Scientists                                 48.3           10.4         0 Expe…
4 Medical doctors                            38              7.6         0 Expe…
5 State and local governments                31.9           13.3         1 Stan…
6 Unelected experts                          31             18.3         0 Expe…

Although we could simply run this code to create a bar chart, we ought to treat this as a draft zero:

hib %>%
 ggplot(aes(y=group,
            x=influence_incr)) +
 geom_col() +
 ggtitle("Whose influence should increase?")

Looks like I already added a type column:

hib %>% count(type)

# A tibble: 4 × 2
  type                    n
  <chr>               <int>
1 Experts                 4
2 People                  5
3 Specialized experts     4
4 Standard                5

I will now relabel one of the categories

hib$type[hib$type=="Standard"] <- "Establishment" 

unique(hib$type)

[1] "People"              "Experts"             "Establishment"      
[4] "Specialized experts"

I will also store some color codes, using the rcartocolor package:

H_pal <- rcartocolor::carto_pal(n = 4, name = "ag_Sunset")

And now I can make an enhanced version of the bar chart:

hib %>%
  ggplot(aes(y=fct_reorder(group,influence_incr),
             x=influence_incr,
             fill=type)) +
  geom_col() +
  scale_fill_manual(values=H_pal[c(4,1,2,3)]) +
  labs(x="Percent",y="",
       title="Whose political power should be increased?",
       caption = "Data: Hibbing et al. (2023)",
       fill="") + 
  theme_gray() +
  theme(text = element_text(size=13)) +
  theme(plot.caption.position = "plot",
        plot.title.position = "plot",
        plot.title = element_text(hjust = 0),
        axis.title.x = element_text(hjust = 1))

In Version 2, I first transformed the ordering of the bars by using fct_reorder(group, influence_incr). Rather than displaying the groups in their original factor order, we now sort them by the value of influence_incr, which makes the largest to smallest progression visually intuitive. This reordering helps readers immediately grasp which groups have relatively low or high suggested increases in influence.

Next, we introduced a new aesthetic mapping by filling the bars according to a type variable. Instead of the uniform color of the first version (draft zero), each bar now visually encodes its category, allowing the reader to distinguish subgroups at a glance. This addition only enhances the chart’s informational depth and also leverages color to communicate an extra dimension of our data.

To maintain control over the appearance of those colored bars, we have applied scale_fill_manual(values = H_pal[c(4,1,2,3)]). We customized the look of the plot by explicitly selecting colors from our H_pal palette in the order 4, 1, 2, 3. This manual scale replaces ggplot2’s default palette, giving the plot a custom look.

We also enriched the chart’s descriptive elements by refining titles, axis labels, and captions.

In contrast to the default styling, the second version adopts theme_gray() and globally increases text size to 13 points. These changes create a cleaner stylistic foundation and improve legibility, particularly for presentations or printed formats.

Lastly, we fine-tuned the alignment of both the title and caption by setting plot.title.position = "plot" and plot.caption.position = "plot", then aligning the title all the way to the left with hjust = 0. Similarly, right-aligning the x-axis title via axis.title.x = element_text(hjust = 1) anchors it neatly under the panel’s right edge. These precise adjustments guarantee that textual elements hug the plot margins consistently, resulting in a balanced and intentional layout.

4.2 Two Dimensions of American Politics

We’ll now make bar charts, scatterplots, stacked overlapping distributions (joyplots), and some other charts, using data on which the paper American Politics in Two Dimensions is based:

library(tidyverse)
library(haven)
library(labelled)

theme_set(theme_minimal())
theme_update(text = element_text(size=13),
             #text = element_text(family="Source Sans Pro")
)
             
# READ IN RECODED DATA
source("data_AJPS2021/0_ajps_recode.R")

As always, take a quick look at the structure of the datasets.

In this case, we have data from 3 surveys, which are stored in d1, d2, and d3.

head(d2)

# A tibble: 6 × 77
  caseid  female   edu black hispanic   age income pid      ideo interest attend
  <chr>    <dbl> <dbl> <dbl>    <dbl> <dbl>  <dbl> <dbl+l> <dbl>    <dbl>  <dbl>
1 R_2zZv…      0     5     0        0    30      4 1 [Str…     6        5      4
2 R_1oF5…      0     6     0        0    44      5 5 [Str…     3        4      2
3 R_3PoE…      1     6     0        0    28      2 3 [Ind…     2        2      3
4 R_dh73…      1     2     0        0    51      1 1 [Str…     1        5      4
5 R_2TTP…      1     3     1        0    29      1 1 [Str…     1        5      5
6 R_3ls3…      1     4     1        0    36      2 1 [Str…     4        4      4
# ℹ 66 more variables: youtube <dbl>, con1 <dbl+lbl>, con2 <dbl>, con3 <dbl>,
#   con4 <dbl+lbl>, goodevil <dbl+lbl>, pop1 <dbl+lbl>, pop2 <dbl+lbl>,
#   official <dbl+lbl>, cexaggerate <dbl>, climatechange <dbl>,
#   collusion <dbl+lbl>, trumpasset <dbl>, clintonnuke <dbl>, repsteal <dbl>,
#   birther <dbl>, trumpft <dbl>, bidenft <dbl>, qanonft <dbl>,
#   reppartyft <dbl>, dempartyft <dbl>, sandersft <dbl>, rep <dbl>, pid2 <dbl>,
#   ideo2 <dbl>, edu2 <dbl>, income2 <dbl>, youtube2 <dbl>, …

We’ll start by looking at the distribution of a binary (and moralizing) view of politics; how do people respond to the prompt “Politics is a battle between good and evil”?

d2 %>% 
  count(goodevil) %>%
  mutate(percent = n/sum(n)*100)

# A tibble: 5 × 3
  goodevil                            n percent
  <dbl+lbl>                       <int>   <dbl>
1 0 [Strongly disagree]             155    7.66
2 1 [Disagree]                      317   15.7 
3 2 [Neither agree, nor disagree]   572   28.3 
4 3 [Agree]                         600   29.7 
5 4 [Strongly agree]                379   18.7

Now let’s create a cross-tab, breaking down the responses by party ID:

A binary (and moralizing) view of politics

d2 %>% group_by(pid) %>%
  count(goodevil) %>%
  mutate(percent = n/sum(n)*100)

# A tibble: 25 × 4
# Groups:   pid [5]
   pid                 goodevil                            n percent
   <dbl+lbl>           <dbl+lbl>                       <int>   <dbl>
 1 1 [Strong Democrat] 0 [Strongly disagree]              33    6.10
 2 1 [Strong Democrat] 1 [Disagree]                       72   13.3 
 3 1 [Strong Democrat] 2 [Neither agree, nor disagree]   139   25.7 
 4 1 [Strong Democrat] 3 [Agree]                         162   29.9 
 5 1 [Strong Democrat] 4 [Strongly agree]                135   25.0 
 6 2 [Democrat]        0 [Strongly disagree]              16    5.44
 7 2 [Democrat]        1 [Disagree]                       62   21.1 
 8 2 [Democrat]        2 [Neither agree, nor disagree]    87   29.6 
 9 2 [Democrat]        3 [Agree]                          92   31.3 
10 2 [Democrat]        4 [Strongly agree]                 37   12.6 
# ℹ 15 more rows

Let’s plot the frequency of this Manichean perspective:

d2 %>% 
  count(goodevil) %>%
  mutate(percent = n/sum(n)*100) %>%
  ggplot(aes(y=as_factor(goodevil),x=percent)) +
  geom_bar(position="dodge", stat="identity") + xlab("Percent") + ylab(NULL) +
  ggtitle("Opinion: Politics is a battle between good and evil")

Would it be useful to plot bars in different colors? Potetnially, but not like this…

d2 %>% 
  count(goodevil) %>%
  mutate(percent = n/sum(n)*100) %>%
  ggplot(aes(y=as_factor(goodevil),x=percent,fill=as_factor(goodevil))) +
  geom_col()

Let’s:

Change/improve the colors
Clean up the labels as appropriate

This scale might work - here darker colors indicate greater disagreement:

d2 %>% 
  count(goodevil) %>%
  mutate(percent = n/sum(n)*100) %>%
  ggplot(aes(y=as_factor(goodevil),x=percent,fill=as_factor(goodevil))) +
  geom_col() +
  scale_fill_viridis_d() +
  labs(x="Proportion of respondents", y= "", fill = "",
       subtitle = "Opinion: Politics is a battle between good and evil", caption= "Data: Uscinski et al. (AJPS, 2021).")

4.3 Faceting by party ID

Let’s try to:

add facet_wrap(~pid)
which means we also have to use group_by(pid) before running count()

d2 %>% group_by(pid) %>%
  count(goodevil) %>%
  mutate(percent = n/sum(n)*100) %>%
  ggplot(aes(y=as_factor(goodevil),x=percent,fill=as_factor(goodevil))) +
  geom_col() + scale_fill_viridis_d() +
  labs(x="Proportion of respondents", y= "", fill = "", subtitle = "Opinion: Politics is a battle between good and evil", caption= "Data: Uscinski et al. (AJPS, 2021).") +
  facet_wrap(~pid)

Also, we really need to show the party labels, not their numbers.

Here as_factor(variable) will work as long as variable is indeed labelled:

d2 %>% group_by(pid) %>%
  count(goodevil) %>%
  mutate(percent = n/sum(n)*100) %>%
  ggplot(aes(y=as_factor(goodevil),x=percent,fill=as_factor(goodevil))) +
  geom_col() + scale_fill_viridis_d() +
  labs(x="Proportion of respondents", y= "", fill = "", subtitle = "Opinion: Politics is a battle between good and evil", caption= "Data: Uscinski et al. (AJPS, 2021).") +
  facet_wrap(~as_factor(pid)) +
    theme(text=element_text(size=9))

Note that we often want to increase the size of the text elements (theme(text=element_text(size=...))) but in this case I’m actually making the text smaller so that facet labels fit on the page.

Are stronger partisans are more Manichean on average?

d2 %>% group_by(pid) %>%
  count(goodevil) %>%
  mutate(percent = n/sum(n)*100) %>%
  ggplot(aes(y=as_factor(pid),x=percent,fill=as_factor(goodevil))) +
  geom_col() +
  scale_fill_viridis_d(alpha=.885) +
  labs(x="Proportion of respondents", y= "", fill = "",
      title = "Opinion: Politics is a battle between good and evil", caption= "Data: Uscinski et al. (AJPS, 2021).")

4.4 Showing more variables together

Let’s create several data objects: each of them will contain responses to the components of the conspiracy thinking scale:

pop2share <- d2 %>% 
              count(pop2) %>%
              mutate(percent = n/sum(n)*100) %>%
              mutate(categories = as_factor(pop2)) %>%
              mutate(q = "People who have studied for a long time\nand have many diplomas do not really know\nwhat makes the world go round.")

officialshare <- d2 %>% 
              count(official) %>%
              mutate(percent = n/sum(n)*100) %>%
              mutate(categories = as_factor(official)) %>%
              mutate(q = "Official government accounts of events\ncannot be trusted.")

con1share <- d2 %>% 
              count(con1) %>%
              mutate(percent = n/sum(n)*100) %>%
              mutate(categories = as_factor(con1)) %>%
              mutate(q = "Even though we live in a democracy,\na few people will always run things anyway")

con4share <- d2 %>% 
              count(con4) %>%
              mutate(percent = n/sum(n)*100) %>%
              mutate(categories = as_factor(con4)) %>%
              mutate(q = "Much of our lives are being controlled\nby plots hatched in secret places.")

Create one larger data objhect:

shareShow <-
  bind_rows(
    pop2share,
    officialshare,
    con1share,
    con4share
  ) %>%
  filter(!is.na(categories))

head(shareShow)

# A tibble: 6 × 8
  pop2                           n percent categories q     official con1  con4 
  <dbl+lbl>                  <int>   <dbl> <fct>      <chr> <dbl+lb> <dbl> <dbl>
1  1 [Strongly disagree]        72    3.56 Strongly … "Peo… NA       NA    NA   
2  2 [Disagree]                226   11.2  Disagree   "Peo… NA       NA    NA   
3  3 [Neither agree, nor di…   709   35.0  Neither a… "Peo… NA       NA    NA   
4  4 [Agree]                   633   31.3  Agree      "Peo… NA       NA    NA   
5  5 [Strongly agree]          383   18.9  Strongly … "Peo… NA       NA    NA   
6 NA                           107    5.29 Strongly … "Off…  1 [Str… NA    NA

Make a plot:

shareShow %>%
  ggplot(aes(y=as_factor(q),x=percent,fill=as_factor(categories))) +
  geom_bar(position="stack", stat="identity", width = .5) +
  jcolors::scale_fill_jcolors(palette = "pal4") +
  theme_minimal() + theme(text = element_text(size=15)) +
  labs(y = "",x = "Percent", fill = "",title = "Uscinski et al. (AJPS, 2021) survey data\non anti-establishment sentiment")

Look at the components of jcolors.

jcolors::display_all_jcolors()

4.5 Putting anti-establishment thinking on the 2nd axis

d1 %>% ggplot(aes(x = leftright2, y = suspicion2)) +
  geom_point() +
  labs(x = "Left vs. Right dimension\n(party ID, symbolic ideology, and party thermometer ratings)",
       y = "Anti-establishment thinking") + theme_gray()

d1 %>% ggplot(aes(x = leftright2, y = suspicion2)) +
  geom_point() +
  labs(x = "Left vs. Right dimension\n(party ID, symbolic ideology, and party thermometer ratings)",
       y = "Anti-establishment thinking") +
  theme_bw()

d1 %>% ggplot(aes(x = leftright2, y = suspicion2)) +
  geom_point() +
  labs(x = "Left vs. Right dimension\n(party ID, symbolic ideology, and party thermometer ratings)",
       y = "Anti-establishment thinking") +
  theme_minimal()

d1 %>% ggplot(aes(x = leftright2, y = suspicion2)) +
  geom_point(color = "purple", alpha=.55) +
  labs(x = "Left vs. Right dimension\n(party ID, symbolic ideology, and party thermometer ratings)",
       y = "Anti-establishment thinking", title = "Conspiracy, populist, and Manichean orientations\nare orthogonal to the standard partisan divide", caption = "Data: Uscinski et al. (AJPS, 2021).") +
  theme_minimal()

4.6 Conspiratorial orientation and PID

In principle, a box plot might make sense in this context…

d2 %>% 
  ggplot(aes(x=as_factor(pid), y=consp_Index)) +
  geom_boxplot(size=.4) +
  theme_minimal()

You could simultaneously display all respondets (jittered):

d2 %>% 
  ggplot(aes(x=as_factor(pid), y=consp_Index)) +
  geom_boxplot(size=.4) +
  geom_jitter(color="purple",alpha=.4,width = .1) +
  theme_minimal()

Make small edits:

d2 %>% 
  ggplot(aes(x=as_factor(pid), y=consp_Index)) +
  geom_boxplot(size=.4) + 
  geom_jitter(color="purple",alpha=.4,width = .1) + theme_minimal() +
  labs(y="Conspiratorial thinking\n(Average agreement with 4 questions)",x = "")

Perhaps even better:

d2 %>% 
  ggplot(aes(x=as_factor(pid), y=consp_Index)) +
  geom_boxplot(size=.4) + geom_jitter(color="purple",alpha=.4,width = .1) + theme_minimal() +
  labs(y="Conspiratorial thinking\n(Average agreement with 4 questions)",x = "",subtitle = "Conspiratorial thinking is uncorrelated with partisanship",
       caption = "Q1: Even though we live in a democracy, a few people will always run things anyway.
Q2: The people who really run the country, are not known to the voters.
Q3: Big events like wars, the recent recession, and the outcomes of elections are controlled\nby small groups of people who are working in secret against the rest of us.
Q4: Much of our lives are being controlled by plots hatched in secret places.")

4.7 All 3 components of the anti-est. orienation

d2 %>% 
  ggplot(aes(x=as_factor(pid), y=suspicion2)) +
  geom_boxplot(fill="purple", alpha=.2) +
  geom_jitter(alpha=.2,width = .11,color="purple4") +
  theme_minimal() +
  labs(y="Conspiratorial thinking + Populism + Manichean outlook",x = "")

4.8 Narcissism as a correlate for conspiracy thinking?

d1 %>% filter(!is.na(attent1)) %>%
  ggplot(aes(y = as_factor(attent1), x= suspicion2)) +
  geom_density_ridges()

d1 %>% filter(!is.na(attent1)) %>%
  ggplot(aes(y = as_factor(attent1), x= suspicion2)) +
  geom_density_ridges(
    fill = "#00AFBB",
    quantile_lines = TRUE, quantiles = 2,alpha = .9,color = "white") +
  labs(y= "I tend to want others to admire me",x="Conspiracy thinking")

d1 %>% filter(!is.na(attent1)) %>%
  ggplot(aes(y = as_factor(attent1), x= suspicion2)) +
  geom_density_ridges(
    fill = "#00AFBB",
    quantile_lines = TRUE, quantiles = 2,alpha = .9,color = "white") +
    xlim(0,1) + labs(y= "I tend to want others to admire me",x="Conspiracy thinking")

d1 %>% filter(!is.na(attent1)) %>%
  ggplot(aes(y = as_factor(attent1), x= suspicion2)) +
  geom_density_ridges(
    fill = "#00AFBB",
    quantile_lines = TRUE, quantiles = 2,alpha = .9,color = "white") +
  xlim(0,1) + labs(y= "I tend to want others to admire me",
       x = "The horizontal dimension measures how strongly respondents exhibit 3 traits:
1. Conspiratorial thinking (e.g. \"Our lives are controlled by secret plots\")
2. Populist beliefs
3. Manichean political views",caption = "Data: Uscinski et al. (AJPS, 2021).")

d1 %>%
  ggplot(aes(x=narcissism,y=suspicion2)) +
  geom_smooth() + labs(x = "Narcissism", y="Anti-establishment orientation")

d1 %>%
  ggplot(aes(x=narcissism,y=suspicion2)) +
  geom_smooth() + labs(x = "Narcissism", y="Anti-establishment orientation") +
  ggside::geom_xsidehistogram() +
  ggside::ggside(x.pos = "bottom")

4.9 Mainstream news

ggpubr::ggarrange(
  d1 %>%
  ggplot(aes(x=suspicion2,y=msm)) +
  geom_smooth() +
  labs(x = "Anti-establihment orientation", y="Much of the mainstream news is deliberately slanted to mislead us") +
  ggside::geom_xsidehistogram() +
  ggside::ggside(x.pos = "bottom") ,
  
d1 %>%
  ggplot(aes(x=narcissism,y=msm)) +
  geom_smooth() +
  labs(x = "Narcissism", y="Much of the mainstream news is deliberately slanted to mislead us") +
  ggside::geom_xsidehistogram() +
  ggside::ggside(x.pos = "bottom") 
)

collmod <- lm(msm ~ clintonft*suspicion2, data=d1)
ggeffects::ggeffect(collmod, terms=c("suspicion2","clintonft")) %>% plot() +
  labs(color="Rating of Hillary Clinton",y="Much of the mainstream news is deliberately slanted to mislead us",
       x="Anti-establishment orientation",title="")

“I often disagree with conventional views about the world”

ggpubr::ggarrange(
d1 %>%
  ggplot(aes(x=suspicion2,y=conwis)) +
  geom_smooth() +
  labs(x = "Anti-establihment orientation", y="I often disagree with conventional views about the world") +
  ylim(c(1,5)),
d1 %>%
  ggplot(aes(x=narcissism,y=conwis)) +
  geom_smooth() +
  labs(x = "Narcissism", y="I often disagree with conventional views about the world") +
  ylim(c(1,5))
)

4.10 Denial of climate change

table(d2$climatechange)


  1   2   3   4   5 
733 454 395 233 206

table(d2$climatechangeBIN)


   0    1 
1582  439

d2 %>% count(climatechangeBIN)

# A tibble: 3 × 2
  climatechangeBIN     n
             <dbl> <int>
1                0  1582
2                1   439
3               NA     2

Are the missing observations the same for the original and the recoded variable? (If not, we would want to check whether earlier code did something unintended.)

d2 %>% count(climatechangeBIN,climatechange)

# A tibble: 6 × 3
  climatechangeBIN climatechange     n
             <dbl>         <dbl> <int>
1                0             1   733
2                0             2   454
3                0             3   395
4                1             4   233
5                1             5   206
6               NA            NA     2