T-tests

Author

Steven

Published

September 23, 2025

Getting everything set up

setwd("~/Desktop/Coding-Boot-Camp/Data-viz-basics")

Set Working Directory

Change to your own working directory (WD) to save things like plots. You can do that by modifying the file path or by going to Session (on the upper bar) -> Set Working Directory. Working directories are important in R because they tell the computer where to look to grab information and where to save things like results. This can vary by project, script, etc., so it's important to consistently have the appropriate WD. If you are unsure what your current WD is, you can run the getwd() command in the console (usually the lower left-hand pane) to get your WD.
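For example, you can quickly confirm where R is pointed and what it can see from there (a minimal sketch; the output will depend on your own machine):

getwd()       # prints your current working directory
list.files()  # lists the files visible from that directory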

Load Packages

if (!require("pacman")) install.packages("pacman") #run this if you don't have pacman 
library(pacman)
pacman::p_load(tidyverse, ggpubr, rstatix, caret, broom, kableExtra, reactable, Hmisc, datarium, car, DT, install = TRUE) #use pacman to load packages quickly

One of the great things about R is its flexibility, which comes from R’s ability to use different packages. You can load packages into your current work environment by using the library(PACKAGE) function. It is important to note that in order to library a package you must first have it installed. To install a package you can use the install.packages("PACKAGE") command. You can learn more about the different types of packages hosted on the Comprehensive R Archive Network (CRAN) here! One other important thing is that some packages have similar commands (e.g., plyr and Hmisc both have a summarize function) that get masked, meaning that you may call a function and not get the version you expect. To get around this you can use PACKAGE::FUNCTION to call a package-specific function.
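As a quick illustration of masking, here is a minimal sketch using the built-in mtcars data (assuming dplyr is attached via tidyverse and Hmisc is installed): prefixing the function with its package name guarantees you get the version you intend.

# if Hmisc is attached after dplyr, a bare summarize() call may not be dplyr's version;
# the double-colon prefix removes the ambiguity
mtcars %>% dplyr::summarize(mean_mpg = mean(mpg))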

For this script, and here forward, we use pacman to load in all of our packages rather than the iterative if (!require("PACKAGE")) install.packages("PACKAGE") set-up. There’s still some merit to that approach if loading packages in a certain order creates issues (e.g., tidyverse and brms).

Get our plot aesthetics set-up

This is a super quick and easy way to style our plots without introducing a huge number of code lines in each chunk!

palette_map = c("#3B9AB2", "#EBCC2A", "#F21A00")
palette_condition = c("#ee9b00", "#bb3e03", "#005f73")

plot_aes = theme_classic() +
  theme(legend.position = "top",
        legend.text = element_text(size = 12),
        text = element_text(size = 16, family = "Futura Medium"),
        axis.text = element_text(color = "black"),
        axis.line = element_line(colour = "black"),
        axis.ticks.y = element_blank())

Build Relevant Functions

Using things like summary functions allows us to present results in a clean, organized manner. For example, we can trim superfluous information from model output when sharing with collaborators, among other things.

mystats <- function(x, na.omit=FALSE){
  if (na.omit)
    x <- x[!is.na(x)] # drop missing values if requested
  m <- mean(x)
  n <- length(x)
  s <- sd(x)
  skew <- sum((x-m)^3/s^3)/n     # sample skewness
  kurt <- sum((x-m)^4/s^4)/n - 3 # excess kurtosis
  return(c(n=n, mean=m, stdev=s, skew=skew, kurtosis=kurt))
}
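As a quick sanity check, here is a hedged usage example on simulated data (the values are arbitrary and will only match if you use the same seed):

set.seed(123)                            # make the simulated draw reproducible
mystats(rnorm(100, mean = 50, sd = 10))  # returns n, mean, stdev, skew, and kurtosis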

Load data

Since we are using an existing dataset in R, we don’t need to do anything fancy here. However, when you normally load in data you can use a few different approaches. In most reproducible scripts you’ll see people use nomenclature similar to df, data, dataframe, etc. to denote a dataframe. If you are working with multiple datasets, it’s advisable to call things by an intuitive name that tells you what the data actually are. For example, if I am working with two different corpora (e.g., Atlantic and NYT Best-Sellers) I will probably call the Atlantic dataframe atlantic and the NYT Best-Sellers dataframe NYT for simplicity and so I don’t accidentally write over files.

For example, if your WD is already set and the data exists within said directory you can use: df <- read_csv("MY_CSV.csv")

If the data is on something like Github you can use: df <- read_csv('https://raw.githubusercontent.com/scm1210/Language_Lab_Repro/main/Atlantic_Cleaned_all_vars.csv') #read in the data.

If you are working in one directory and need to call something for another directory you can do something like: Atlantic_FK <- read_csv("~/Desktop/working-with-lyle/Atlantic/Atlantic_flesch_kinkaid_scores.csv")

There are also other packages/functions that allow you to read in files with different extensions, such as haven::read_sav() to read in a file from SPSS or rjson::fromJSON(file = "data.json") to read in a JSON file. If you want to learn more about reading in different files you can take a peek at this site.
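For instance, hedged sketches of those two cases (the file names are placeholders, and the haven and rjson packages would need to be installed):

df_spss <- haven::read_sav("my_data.sav")        # read an SPSS .sav file into a tibble
json_list <- rjson::fromJSON(file = "data.json") # read a JSON file (returns a list, not a data frame)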

# Load the data
data("genderweight", package = "datarium")
genderweight <- as.data.frame(genderweight)

# Show a sample of the data by group
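# one hedged way to peek at a few rows per group (dplyr comes in with tidyverse);
# slice_sample() draws rows at random, so your sample will differ from run to run
genderweight %>%
  group_by(group) %>%
  slice_sample(n = 3) %>%
  ungroup()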

Brief Description

Now that we have everything set up, we are going to get into our analyses. For the first part we are going to be looking at gender differences in weight between males and females using the genderweight dataset from the datarium package in R! To quantify this, we’ll use an independent t-test. An independent t-test is used when we have one IV with only two levels or categories (e.g., gender: male and female) and one DV that is continuous (e.g., weight ranging from 0 to 250). If the IV has more than two levels we need to use another type of test called an ANOVA; more on that later :).

Some other things:

  • Two means are calculated from two samples.

  • Null hypothesis: the means from the two samples are similar.

  • Alternative hypothesis: the means from the two samples are different.

  • The bigger the observed difference between the sample means, the more likely the sample means truly differ (we capture this with an effect size).

Statistical Assumptions

Assumptions are also important. That is, the data need to possess certain qualities for us to be able to use this type of test. For a t-test these are:

  • The DV data are continuous (not ordinal or nominal).

  • The sample data have been randomly sampled from a population.

  • There is homogeneity of variance (i.e., the variability of the data in each group is similar).

  • The distribution is approximately normal.

Click through the tabs to see how to check each assumption.

Continuous

We can check this by looking at the structure of our data using the class function (for one variable) or the str function (for all the variables in our dataset). We can see that weight is numeric and therefore continuous, so we can move forward with our analyses.

class(genderweight$weight)
[1] "numeric"
str(genderweight)
'data.frame':   40 obs. of  3 variables:
 $ id    : Factor w/ 40 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ group : Factor w/ 2 levels "F","M": 1 1 1 1 1 1 1 1 1 1 ...
 $ weight: num  61.6 64.6 66.2 59.3 64.9 ...

Randomly Sampled

This is something you do when you design the study–we can’t do anything in R to check this.

Homogeneity of Variance

We need to make sure the variability of the data in each group is similar. We can use something called Levene’s Test for equality of error variances to do this. If we violate this assumption (p < .05 in our test) we will have to use a Welch’s t-test. We do violate this assumption, so if we were actually doing a meaningful project we would need to use a different statistical test. For the sake of brevity we’ll pretend we are ok for now.

leveneTest(genderweight$weight ~ genderweight$group)
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)  
group  1    6.12  0.018 *
      38                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
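Since Levene’s test is significant here, a Welch’s t-test (which does not assume equal variances) would be the safer choice. A minimal sketch in base R (note that rstatix::t_test, used in the main analyses below, applies the Welch correction by default, which is why the reported degrees of freedom are not a whole number):

t.test(weight ~ group, data = genderweight, var.equal = FALSE)  # Welch's t-test; var.equal = FALSE is the default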

The distribution is approximately normal.

To check the distribution of the data we can use density plots, via ggplot within tidyverse, to visualize this. It’s also important to get some statistics behind this, and to do that we can look at skewness and kurtosis via the mystats function that we wrote earlier. You can also use psych::describe to get similar information. For skewness and kurtosis, we want skewness values to fall between -3 and +3, and kurtosis is acceptable within a range of -10 to +10.

# Basic density
p <- ggplot(genderweight, aes(x=weight)) + 
  geom_density(color="dodgerblue4", fill="dodgerblue3", alpha=0.2) + plot_aes +
  geom_vline(aes(xintercept=mean(weight)),
            color="dodgerblue3", linetype="dashed", size=1) 
annotate_figure(p,
                top = text_grob("Density Plots for both genders",  color = "black", face = "bold", size = 20),
                bottom = text_grob("Verical line represents mean value."
                                   , color = "Black",
                                   hjust = 1.1, x = 1, face = "italic", size = 12))

We can also plot the densities by gender. Looks like we should have some interesting results!

p<-ggplot(genderweight, aes(x=weight, color=group, fill=group, alpha=0.1)) +
  geom_density()+geom_vline(aes(xintercept=mean(weight)),
            color="blue", linetype="dashed", size=1) + plot_aes 

annotate_figure(p,
                top = text_grob("Density Plots for both genders",  color = "black", face = "bold", size = 20),
                bottom = text_grob("Verical line represents mean value."
                                   , color = "Black",
                                   hjust = 1.1, x = 1, face = "italic", size = 12))

mystats(genderweight$weight)
       n     mean    stdev     skew kurtosis 
 40.0000  74.6624  11.7924   0.1486  -1.7419 
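If you prefer psych::describe, a hedged example (the psych package is not in the p_load() call above, so it would need to be installed separately):

psych::describe(genderweight$weight)  # returns n, mean, sd, skew, kurtosis, se, and more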

Summary Statistics

It’s also important for us to get some summary statistics for our data (e.g., N’s, Means, SDs).

Sample Size, Means, and Standard Deviations

genderweight %>%
  group_by(group) %>%
  get_summary_stats(weight, type = "mean_sd") %>% 
  reactable::reactable(striped = TRUE)

Build a simple graph to visualize

We can also visualize our data with a box plot while overlaying the jittered data points!

 # Create a box plot with jittered data points
ggplot(genderweight, aes(x = group, y = weight,color = group)) +
  geom_boxplot() +
  geom_jitter(width = 0.2, size = 2,alpha=0.2) +
  # Add axis labels
  xlab("Groups") +
  ylab("Weight") +
  plot_aes +
  # Add plot title
  ggtitle("Weight by Groups") + theme(plot.title = element_text(hjust = 0.5))

Main Analyses

Now, we’ll conduct our independent t-test to see if there are gender differences in weight, as well as get the magnitude of the effect (Cohen’s d).

Independent T-test

We can see that we have a significant effect of gender on weight. That is, when conducting our independent t-test we observed a large effect [t(26.87) = -20.79, p < .001, d = -6.575], such that men (M = 85.826, SD = 4.354) had significantly greater body weight than their female counterparts (M = 63.499, SD = 2.028).

stat.test <- genderweight %>% 
  t_test(weight ~ group) %>%
  add_significance()
stat.test
# A tibble: 1 × 9
  .y.    group1 group2    n1    n2 statistic    df        p p.signif
  <chr>  <chr>  <chr>  <int> <int>     <dbl> <dbl>    <dbl> <chr>   
1 weight F      M         20    20     -20.8  26.9 4.30e-18 ****    

Cohen’s D

From inspecting our output, we can see that we have a large effect of gender.

genderweight %>%  cohens_d(weight ~ group, var.equal = TRUE)
# A tibble: 1 × 7
  .y.    group1 group2 effsize    n1    n2 magnitude
* <chr>  <chr>  <chr>    <dbl> <int> <int> <ord>    
1 weight F      M        -6.57    20    20 large    

Visualize our results using ggplot2 with stats

plot <- ggplot(genderweight, aes(x = group, y = weight, color = group)) +
  geom_boxplot() +
  geom_jitter(width = 0.2, size = 2) +
  # Add axis labels
  xlab("Gender") +
  ylab("Weight") +
  plot_aes +
  # Add plot title
  ggtitle("Weight by Gender") + theme(plot.title = element_text(hjust = 0.5))

stat.test <- stat.test %>% add_xy_position(x = "group")
plot <- plot +  stat_pvalue_manual(stat.test, tip.length = 0) +
  labs(subtitle = get_test_label(stat.test, detailed = TRUE)) 

annotate_figure(plot,
                bottom = text_grob("D = -6.575"
                                   , color = "Black",
                                   hjust = 1.1, x = 1, face = "italic", size = 16))

High Level Example

Next we will see what this looks like at the highest level.

This part of the tutorial uses data and reproduces a subset of analyses reported in the following manuscript:

Mesquiti & Seraj. (Preprint) The Psychological Impacts of the COVID-19 Pandemic on Corporate Leadership

You can find this project’s github here.

Abstract

The COVID-19 pandemic sent shockwaves across the fabric of our society. Examining the impact of the pandemic on business leadership is particularly important to understanding how this event affected their decision-making. The present study documents the psychological effects of the COVID-19 pandemic on chief executive officers (CEOs). This was accomplished by analyzing CEOs’ language from quarterly earnings calls (N = 19,536) for a year before and after lockdown. CEOs had large shifts in language in the months immediately following the start of the pandemic lockdowns. Analytic thinking plummeted after the world went into lockdown, with CEOs’ language becoming less technical and more personal and intuitive. In parallel, CEOs’ language showed signs of increased cognitive load, as they were processing the effect of the pandemic on their business practices. Business leaders’ use of collective-focused language (we-usage) dropped substantially after the pandemic began, perhaps suggesting CEOs felt disconnected from their companies. Self-focused (I-usage) language increased, showing the increased preoccupation of business leaders. The size of the observed shifts in language during the pandemic also dwarfed responses to other events that occurred dating back to 2010, with the effect lasting around seven months.

Prep data

Load necessary packages and set Working Directory

if (!require("pacman")) install.packages("pacman")
pacman::p_load(tidyverse,zoo,lubridate,plotrix,ggpubr, caret, broom, kableExtra, reactable, effsize, install = T)

Define aesthetics

palette_map = c("#3B9AB2", "#EBCC2A", "#F21A00")
palette_condition = c("#ee9b00", "#bb3e03", "#005f73")

plot_aes = theme_classic() +
  theme(text = element_text(size = 16, family = "Futura Medium"),
        axis.text.x = element_text(angle = 45, hjust = 1),
        plot.title.position = 'plot',
        plot.title = element_text(hjust = 0.5, face = "bold", size = 20),
        axis.text = element_text(size = 14),
        axis.title = element_text(size = 20, face = "bold"))

Write our Table Functions

baseline_ttest <- function(ttest_list) {
  # Extract relevant information from each test and store in a data frame
  ttest_df <- data.frame(
    Group1 = seq(0,0,1),
    Group2 = seq(1,24,1),
    t = sapply(ttest_list, function(x) x$statistic),
    df = sapply(ttest_list, function(x) x$parameter),
    p_value = sapply(ttest_list, function(x) x$p.value)
  )
  
  # Format p-values as scientific notation
  ttest_df$p_value <- format(ttest_df$p_value, scientific = T)
  
  # Rename columns
  colnames(ttest_df) <- c("t", "t + 1 ", "t-value", "Degrees of Freedom", "p-value")
  
  # Create table using kableExtra
  kable(ttest_df, caption = "Summary of Welch's t-Tests", booktabs = TRUE) %>%
   kableExtra::kable_styling()
}

post_pandemic_summary <- function(ttest_list) {
  # Extract relevant information from each test and store in a data frame
  ttest_df <- data.frame(
    Group1 = seq(12,23,1),
    Group2 = seq(13,24,1),
    t = sapply(ttest_list, function(x) x$statistic),
    df = sapply(ttest_list, function(x) x$parameter),
    p_value = sapply(ttest_list, function(x) x$p.value)
  )
  
  # Format p-values as scientific notation
  ttest_df$p_value <- format(ttest_df$p_value, scientific = T)
  
  # Rename columns
  colnames(ttest_df) <- c("t", "t + 1 ", "t-value", "Degrees of Freedom", "p-value")
  
  # Create table using kableExtra
  kable(ttest_df, caption = "Summary of Welch's t-Tests", booktabs = TRUE) %>%
   kableExtra::kable_styling()
}



baseline_cohen_d <- function(cohen_d_list) {
  # Extract relevant information from each test and store in a data frame
  cohen_d_df <- data.frame(
    Group1 = seq(0,0,1),
    Group2 = seq(1,24,1),
    Cohen_d = sapply(cohen_d_list, function(x) x$estimate)
  )
  
  # Rename columns
  colnames(cohen_d_df) <- c("t", "t + 1", "Cohen's d")
  
  # Create table using kableExtra
  kable(cohen_d_df, caption = "Summary of Cohen's D", booktabs = TRUE) %>%
   kableExtra::kable_styling()
}

post_cohen_d <- function(cohen_d_list) {
  # Extract relevant information from each test and store in a data frame
  cohen_d_df <- data.frame(
    Group1 = seq(12,23,1),
    Group2 = seq(13,24,1),
    Cohen_d = sapply(cohen_d_list, function(x) x$estimate)
  )
  
  # Rename columns
  colnames(cohen_d_df) <- c("t", "t+1", "Cohen's d")
  
  # Create table using kableExtra
  kable(cohen_d_df, caption = "Summary of Cohen's D", booktabs = TRUE) %>%
   kableExtra::kable_styling()
}

baseline_mean_diff <- function(mean_diff_list) {
  # Extract relevant information from each mean difference calculation and store in a data frame
  mean_diff_df <- data.frame(
    Group1 = seq(0,0,1),
    Group2 = seq(1,24,1),
    mean_diff = mean_diff_list
  )
  
  # Rename columns
  colnames(mean_diff_df) <- c("t", "t+1", "Mean Difference")
  
  # Create table using kableExtra
  kable(mean_diff_df, caption = "Summary of Mean Differences", booktabs = TRUE) %>%
   kableExtra::kable_styling()
}


post_mean_diff <- function(mean_diff_list) {
  # Extract relevant information from each mean difference calculation and store in a data frame
  mean_diff_df <- data.frame(
    Group1 = seq(12,23,1),
    Group2 = seq(13,24,1),
    mean_diff = mean_diff_list
  )
  
  # Rename columns
  colnames(mean_diff_df) <- c("t", "t+1", "Mean Difference")
  
  # Create table using kableExtra
  kable(mean_diff_df, caption = "Summary of Mean Differences", booktabs = TRUE) %>%
   kableExtra::kable_styling()
}

Load in the Data

data  <-  read_csv("https://raw.githubusercontent.com/scm1210/Summer-Coding/main/data/Big_CEO.csv") #read in the data from github 

data <- data["2019-03-01"<= data$Date & data$Date <= "2021-04-01",] #subsetting covid dates 

data <- data %>% filter(WC<=5400) %>% #filter out based on our exclusion criteria
  filter(WC>=25)

data$month_year <- format(as.Date(data$Date), "%Y-%m") #reformat 

data_tidy <- data %>% dplyr::select(Date, Speaker, Analytic, cogproc,allnone,we,i,emo_anx) %>%
  mutate(Date = lubridate::ymd(Date),
         time_month = as.numeric(Date - ymd("2019-03-01")) / 30, #centering at start of march
         time_month_quad = time_month * time_month) #making our quadratic term

data_tidy$Date_off <- floor(data_tidy$time_month) #rounding dates down to whole months using the floor function (0 = 2019-03, 24 = 2021-04)
data_tidy$Date_covid <- as.factor(data_tidy$Date_off) #factorize

Create Tidy Data for Graphs

df <- read_csv("https://raw.githubusercontent.com/scm1210/Language_Lab_Repro/main/Big_CEO.csv")#put code here to read in Big CEO data
df <- df %>% filter(WC<=5400)   %>% 
  filter(WC>=25)

df$month_year <- format(as.Date(df$Date), "%Y-%m") #extracting month and year to build the monthly graphs; we need a new variable so we don't overwrite Date

df2 <- df %>% #collapsing to monthly averages
  group_by(month_year) %>% #grouping by month and year
  summarise_at(vars("Date","WC","Analytic","cogproc",'we','i'), list(mean = mean, std.error = std.error)) #pulling the means and SEs for our variables of interest

df2 <- df2["2019-01"<= df2$month_year & df2$month_year <= "2021-03",] #covid dates 

Write our Stats Functions

We were interested in how language changed relative to baseline (one year pre-pandemic), as well as how language changed after the pandemic began.

As a result, we ran two separate sets of analyses: one comparing t (time zero) to each t[i], and one comparing each month t to the following month t + 1, starting 12 months after our centered data point. The month indices you see are centered on 03/2019; that is, 12 = 03/2020, 13 = 04/2020, and so on.
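To make the pairing concrete, here is a hedged sketch that simply prints the month-index pairs each set of mapply() calls below will compare (0 = 03/2019, 12 = 03/2020, 24 = 04/2021):

# baseline comparisons: month 0 (03/2019) against each of months 1 through 24
cbind(t = 0, t_plus_i = seq(1, 24, 1))
# post-pandemic comparisons: each month from 12 (03/2020) onward against the following month
cbind(t = seq(12, 23, 1), t_plus_1 = seq(13, 24, 1))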

Analytic Thinking

analytic_my.t = function(fac1, fac2){
  t.test(data_tidy$Analytic[data_tidy$Date_covid==fac1], 
         data_tidy$Analytic[data_tidy$Date_covid==fac2])
} #writing our t-test function to compare t to t[i] 

analytic_my.d = function(fac1, fac2){
  cohen.d(data_tidy$Analytic[data_tidy$Date_covid==fac1], 
          data_tidy$Analytic[data_tidy$Date_covid==fac2])
} #function for cohen's d

analytic_mean <-  function(fac1, fac2){
  mean(data_tidy$Analytic[data_tidy$Date_covid==fac1])- 
    mean(data_tidy$Analytic[data_tidy$Date_covid==fac2])
} #function to do mean differences

Cognitive Processing

cogproc_my.t = function(fac1, fac2){
  t.test(data_tidy$cogproc[data_tidy$Date_covid==fac1], 
         data_tidy$cogproc[data_tidy$Date_covid==fac2])
} #writing our t-test function to compare t to t[i] 


cogproc_my.d = function(fac1, fac2){
  cohen.d(data_tidy$cogproc[data_tidy$Date_covid==fac1], 
          data_tidy$cogproc[data_tidy$Date_covid==fac2])
} #function for cohen's d

cogproc_mean <-  function(fac1, fac2){
  mean(data_tidy$cogproc[data_tidy$Date_covid==fac1])- 
    mean(data_tidy$cogproc[data_tidy$Date_covid==fac2])
} #function to do mean differences

I-words

i_my.t = function(fac1, fac2){
  t.test(data_tidy$i[data_tidy$Date_covid==fac1], 
         data_tidy$i[data_tidy$Date_covid==fac2])
} #writing our t-test function to compare t to t + 1 

i_my.d = function(fac1, fac2){
  cohen.d(data_tidy$i[data_tidy$Date_covid==fac1], 
          data_tidy$i[data_tidy$Date_covid==fac2])
} #function for cohen's d


i_mean <-  function(fac1, fac2){
  mean(data_tidy$i[data_tidy$Date_covid==fac1])- 
    mean(data_tidy$i[data_tidy$Date_covid==fac2])
} #function to do mean differences

We-words

we_my.t = function(fac1, fac2){
  t.test(data_tidy$we[data_tidy$Date_covid==fac1], 
         data_tidy$we[data_tidy$Date_covid==fac2])
} 

we_my.d = function(fac1, fac2){
  cohen.d(data_tidy$we[data_tidy$Date_covid==fac1], 
          data_tidy$we[data_tidy$Date_covid==fac2])
} #function for cohen's d

we_mean <-  function(fac1, fac2){
  mean(data_tidy$we[data_tidy$Date_covid==fac1])- 
    mean(data_tidy$we[data_tidy$Date_covid==fac2])
} #function to do mean differences

Tidy data

Data transformations

  • None

Exclusions

  • Excluded texts that were shorter than **25 words** or longer than **5,400 words**!

Summary of the Data

Range of Dates

range(data$Date)
[1] "2019-03-01" "2021-04-01"

Number of Speakers

speakers <- data %>%
  select(Speaker) %>%
  unique() %>%
  dplyr::summarize(n = n()) %>%
  DT::datatable()
speakers

Number of Transcripts

transcripts <- data %>%
  select(1) %>%
  dplyr::summarize(n = n()) %>%
  DT::datatable()

transcripts

Mean Word Count

word_count <- data %>%
  select(WC) %>%
  dplyr::summarize(mean = mean(WC)) %>%
  DT::datatable()
word_count

How did language change after the Pandemic?

Analytic Thinking

T-test

analytic_ttest <- mapply(analytic_my.t, seq(12,23,1), seq(13,24,1), SIMPLIFY=FALSE) #compare t (first sequence) to t+1 (second sequence), increasing by 1
post_pandemic_summary(analytic_ttest)
Summary of Welch's t-Tests
t t + 1 t-value Degrees of Freedom p-value
12 13 5.0849 525.79 5.124e-07
13 14 -2.5948 373.06 9.839e-03
14 15 -1.6726 252.04 9.565e-02
15 16 1.9242 377.62 5.508e-02
16 17 -2.2122 200.57 2.808e-02
17 18 -1.6872 218.93 9.298e-02
18 19 0.6199 262.61 5.358e-01
19 20 0.8738 128.22 3.839e-01
20 21 -1.5398 230.76 1.250e-01
21 22 1.9533 94.32 5.374e-02
22 23 -1.1498 55.55 2.552e-01
23 24 -1.7179 2141.37 8.596e-02

Cohen’s D

analytic_d <- mapply(analytic_my.d,seq(12,23,1), seq(13,24,1),SIMPLIFY=FALSE) 
post_cohen_d(analytic_d)
Summary of Cohen's D
t t+1 Cohen's d
12 13 0.3275
13 14 -0.1598
14 15 -0.1320
15 16 0.1936
16 17 -0.1617
17 18 -0.1481
18 19 0.0710
19 20 0.0899
20 21 -0.1246
21 22 0.2682
22 23 -0.1598
23 24 -0.0739

Mean Differences

analytic_meandiff <- mapply(analytic_mean, seq(12,23,1), seq(13,24,1)) #mean differences for each consecutive pair of months
post_mean_diff(analytic_meandiff)
Summary of Mean Differences
t t+1 Mean Difference
12 13 4.7346
13 14 -2.1905
14 15 -1.8443
15 16 2.7483
16 17 -2.2318
17 18 -2.1013
18 19 1.1589
19 20 1.2765
20 21 -1.7791
21 22 4.0651
22 23 -2.0756
23 24 -0.9941

Cogproc

T-test

cogproc_ttest <- mapply(cogproc_my.t, seq(12,23,1), seq(13,24,1), SIMPLIFY=FALSE) #compare t (first sequence) to t+1 (second sequence), increasing by 1
post_pandemic_summary(cogproc_ttest)
Summary of Welch's t-Tests
t t + 1 t-value Degrees of Freedom p-value
12 13 -4.3161 534.57 1.893e-05
13 14 1.4046 366.54 1.610e-01
14 15 4.0193 257.87 7.665e-05
15 16 -3.1317 367.30 1.877e-03
16 17 0.9868 199.24 3.249e-01
17 18 4.1804 223.61 4.178e-05
18 19 -1.1984 285.88 2.318e-01
19 20 -1.4930 133.62 1.378e-01
20 21 3.2109 234.85 1.508e-03
21 22 -1.7045 87.35 9.183e-02
22 23 0.9968 55.38 3.232e-01
23 24 -0.9994 2145.13 3.177e-01

Cohen’s D

cogproc_d <-mapply(cogproc_my.d, seq(12,23,1), seq(13,24,1),SIMPLIFY=FALSE)
post_cohen_d(cogproc_d)
Summary of Cohen's D
t t+1 Cohen's d
12 13 -0.2755
13 14 0.0887
14 15 0.3007
15 16 -0.3205
16 17 0.0733
17 18 0.3436
18 19 -0.1329
19 20 -0.1294
20 21 0.2477
21 22 -0.2453
22 23 0.1405
23 24 -0.0430

Mean Differences

cogproc_meandiff <- mapply(cogproc_mean, seq(12,23,1), seq(13,24,1)) #mean differences for each consecutive pair of months
post_mean_diff(cogproc_meandiff)
Summary of Mean Differences
t t+1 Mean Difference
12 13 -0.6107
13 14 0.1785
14 15 0.6095
15 16 -0.6540
16 17 0.1560
17 18 0.7442
18 19 -0.2962
19 20 -0.2746
20 21 0.5305
21 22 -0.5358
22 23 0.2776
23 24 -0.0887

I-words

T-test

i_ttest <- mapply(i_my.t, seq(12,23,1), seq(13,24,1), SIMPLIFY=FALSE) #compare t (first sequence) to t+1 (second sequence), increasing by 1
post_pandemic_summary(i_ttest)
Summary of Welch's t-Tests
t t + 1 t-value Degrees of Freedom p-value
12 13 -5.1026 477.85 4.842e-07
13 14 2.9683 362.97 3.194e-03
14 15 2.7352 261.20 6.661e-03
15 16 -3.5895 336.98 3.805e-04
16 17 1.7614 191.52 7.976e-02
17 18 3.4394 240.73 6.870e-04
18 19 -2.6019 255.11 9.813e-03
19 20 0.4503 134.91 6.532e-01
20 21 1.5059 248.77 1.334e-01
21 22 2.0159 84.28 4.700e-02
22 23 -3.8068 57.56 3.437e-04
23 24 4.4095 2135.84 1.088e-05

Cohen’s D

i_d <- mapply(i_my.d,seq(12,23,1), seq(13,24,1),SIMPLIFY=FALSE)
post_cohen_d(i_d)
Summary of Cohen's D
t t+1 Cohen's d
12 13 -0.3468
13 14 0.1902
14 15 0.1991
15 16 -0.3758
16 17 0.1452
17 18 0.2370
18 19 -0.3007
19 20 0.0378
20 21 0.1020
21 22 0.2972
22 23 -0.4622
23 24 0.1900

Mean Differences

i_meandiff <- mapply(i_mean, seq(12,23,1), seq(13,24,1)) #mean differences for each consecutive pair of months
post_mean_diff(i_meandiff)
Summary of Mean Differences
t t+1 Mean Difference
12 13 -0.2878
13 14 0.1551
14 15 0.1625
15 16 -0.3242
16 17 0.1289
17 18 0.2083
18 19 -0.2364
19 20 0.0329
20 21 0.0886
21 22 0.2293
22 23 -0.3912
23 24 0.1657

We-words

T-test

we_ttest <- mapply(we_my.t, seq(12,23,1), seq(13,24,1), SIMPLIFY=FALSE) #compare t (first sequence) to t+1 (second sequence), increasing by 1
post_pandemic_summary(we_ttest)
Summary of Welch's t-Tests
t t + 1 t-value Degrees of Freedom p-value
12 13 4.1038 527.08 4.709e-05
13 14 0.9117 378.82 3.625e-01
14 15 -3.3226 253.14 1.023e-03
15 16 2.4647 373.96 1.416e-02
16 17 -0.3375 197.52 7.361e-01
17 18 -4.2759 229.50 2.794e-05
18 19 2.5510 262.60 1.131e-02
19 20 -0.1422 131.79 8.871e-01
20 21 -1.9395 238.21 5.362e-02
21 22 -0.2952 84.06 7.685e-01
22 23 0.8557 55.76 3.958e-01
23 24 -0.3495 2137.77 7.267e-01

Cohen’s D

we_d <- mapply(we_my.d, seq(12,23,1), seq(13,24,1),SIMPLIFY=FALSE)
post_cohen_d(we_d)
Summary of Cohen's D
t t+1 Cohen's d
12 13 0.2639
13 14 0.0550
14 15 -0.2595
15 16 0.2501
16 17 -0.0256
17 18 -0.3276
18 19 0.2920
19 20 -0.0130
20 21 -0.1444
21 22 -0.0436
22 23 0.1170
23 24 -0.0151

Mean Differences

we_meandiff <- mapply(we_mean, seq(12,23,1), seq(13,24,1)) #mean differences for each consecutive pair of months
post_mean_diff(we_meandiff)
Summary of Mean Differences
t t+1 Mean Difference
12 13 0.3778
13 14 0.0763
14 15 -0.3676
15 16 0.3649
16 17 -0.0365
17 18 -0.4711
18 19 0.4169
19 20 -0.0183
20 21 -0.2042
21 22 -0.0609
22 23 0.1583
23 24 -0.0210

How did language change relative to baseline (one year before the pandemic; 03/2019)?

Analytic Thinking

T-test

analytic_ttest_baseline <- mapply(analytic_my.t, 0, seq(1,24,1), SIMPLIFY=FALSE) #compare time zero (03/2019) to each month t[i]
baseline_ttest(analytic_ttest_baseline)
Summary of Welch's t-Tests
t t + 1 t-value Degrees of Freedom p-value
0 1 1.5025 1161.5 1.332e-01
0 2 0.6860 1036.8 4.929e-01
0 3 0.2508 245.1 8.022e-01
0 4 2.6728 1120.1 7.631e-03
0 5 0.4785 1004.8 6.324e-01
0 6 1.0343 280.4 3.019e-01
0 7 2.6675 1049.9 7.760e-03
0 8 1.4046 993.4 1.605e-01
0 9 1.0147 328.1 3.110e-01
0 10 1.5505 286.2 1.221e-01
0 11 1.9738 1061.6 4.867e-02
0 12 1.3054 1272.1 1.920e-01
0 13 5.7770 623.9 1.201e-08
0 14 5.1516 929.5 3.153e-07
0 15 1.4219 370.2 1.559e-01
0 16 3.9258 316.9 1.061e-04
0 17 3.2572 918.1 1.166e-03
0 18 0.1171 302.2 9.068e-01
0 19 0.8463 164.4 3.986e-01
0 20 3.7364 920.4 1.981e-04
0 21 0.6393 331.8 5.231e-01
0 22 2.6168 63.2 1.109e-02
0 23 3.7687 1112.0 1.727e-04
0 24 2.4326 1125.2 1.515e-02

Cohen’s D

analytic_D_baseline <- mapply(analytic_my.d,0, seq(1,24,1),SIMPLIFY=FALSE) 
baseline_cohen_d(analytic_D_baseline)
Summary of Cohen's D
t t + 1 Cohen's d
0 1 0.0880
0 2 0.0330
0 3 0.0206
0 4 0.1587
0 5 0.0235
0 6 0.0867
0 7 0.1621
0 8 0.0687
0 9 0.0806
0 10 0.1283
0 11 0.1024
0 12 0.0694
0 13 0.3954
0 14 0.2534
0 15 0.1138
0 16 0.3057
0 17 0.1588
0 18 0.0102
0 19 0.0861
0 20 0.1803
0 21 0.0530
0 22 0.3237
0 23 0.2019
0 24 0.1263

Mean Differences

analytic_mean_baseline <- mapply(analytic_mean, 0, seq(1,24,1)) #across all of the months comparing to time zero
baseline_mean_diff(analytic_mean_baseline)
Summary of Mean Differences
t t+1 Mean Difference
0 1 1.3114
0 2 0.4935
0 3 0.3040
0 4 2.3251
0 5 0.3412
0 6 1.3028
0 7 2.3954
0 8 0.9976
0 9 1.1987
0 10 1.9189
0 11 1.4369
0 12 1.0438
0 13 5.7785
0 14 3.5880
0 15 1.7437
0 16 4.4920
0 17 2.2602
0 18 0.1590
0 19 1.3178
0 20 2.5943
0 21 0.8152
0 22 4.8803
0 23 2.8046
0 24 1.8106

Cogproc

T-test

cogproc_ttest_baseline <- mapply(cogproc_my.t, 0, seq(1,24,1), SIMPLIFY=FALSE) #compare time zero (03/2019) to each month t[i]
baseline_ttest(cogproc_ttest_baseline)
Summary of Welch's t-Tests
t t + 1 t-value Degrees of Freedom p-value
0 1 -0.5097 1156.51 6.103e-01
0 2 -0.7179 1035.97 4.730e-01
0 3 -0.2391 218.72 8.112e-01
0 4 -1.8417 1119.70 6.579e-02
0 5 -0.3764 1051.94 7.067e-01
0 6 0.2442 282.79 8.072e-01
0 7 -1.7142 1029.21 8.680e-02
0 8 -0.9538 1076.64 3.404e-01
0 9 1.0446 320.31 2.970e-01
0 10 -0.8169 255.26 4.148e-01
0 11 -0.7245 1147.57 4.689e-01
0 12 -2.0280 1307.90 4.276e-02
0 13 -5.7012 609.25 1.855e-08
0 14 -6.5911 924.04 7.329e-11
0 15 -0.3856 395.99 7.000e-01
0 16 -4.0812 298.22 5.758e-05
0 17 -5.4650 949.00 5.916e-08
0 18 0.9265 310.67 3.549e-01
0 19 -0.5797 184.74 5.628e-01
0 20 -3.7994 936.81 1.544e-04
0 21 0.7639 341.61 4.455e-01
0 22 -1.3820 61.97 1.719e-01
0 23 -1.0691 1140.02 2.853e-01
0 24 -1.8593 1172.33 6.323e-02

Cohen’s D

cogproc_D_baseline <- mapply(cogproc_my.d, 0, seq(1,24,1),SIMPLIFY=FALSE)
baseline_cohen_d(cogproc_D_baseline)
Summary of Cohen's D
t t + 1 Cohen's d
0 1 -0.0299
0 2 -0.0345
0 3 -0.0213
0 4 -0.1094
0 5 -0.0180
0 6 0.0204
0 7 -0.1048
0 8 -0.0446
0 9 0.0841
0 10 -0.0732
0 11 -0.0364
0 12 -0.1070
0 13 -0.3939
0 14 -0.3256
0 15 -0.0298
0 16 -0.3292
0 17 -0.2601
0 18 0.0789
0 19 -0.0527
0 20 -0.1809
0 21 0.0622
0 22 -0.1778
0 23 -0.0568
0 24 -0.0951

Mean Differences

cogproc_mean_baseline <- mapply(cogproc_mean, 0, seq(1,24,1)) #comparing time zero (03/2019) to each month
baseline_mean_diff(cogproc_mean_baseline)
Summary of Mean Differences
t t+1 Mean Difference
0 1 -0.6107
0 2 0.1785
0 3 0.6095
0 4 -0.6540
0 5 0.1560
0 6 0.7442
0 7 -0.2962
0 8 -0.2746
0 9 0.5305
0 10 -0.5358
0 11 0.2776
0 12 -0.0887
0 13 -0.6107
0 14 0.1785
0 15 0.6095
0 16 -0.6540
0 17 0.1560
0 18 0.7442
0 19 -0.2962
0 20 -0.2746
0 21 0.5305
0 22 -0.5358
0 23 0.2776
0 24 -0.0887

I-words

T-test

i_ttest_baseline <- mapply(i_my.t, 0, seq(1,24,1), SIMPLIFY=FALSE) #compare time zero (03/2019) to each month t[i]
baseline_ttest(i_ttest_baseline)
Summary of Welch's t-Tests
t t + 1 t-value Degrees of Freedom p-value
0 1 -3.3450 1143.82 8.495e-04
0 2 -1.1963 1155.18 2.318e-01
0 3 -0.1911 213.55 8.486e-01
0 4 -4.1439 1114.31 3.672e-05
0 5 -0.6477 1056.56 5.173e-01
0 6 -1.6111 278.03 1.083e-01
0 7 -3.3533 1035.23 8.274e-04
0 8 -2.0582 1066.96 3.981e-02
0 9 -1.4168 265.19 1.577e-01
0 10 -2.7747 284.30 5.891e-03
0 11 -1.9849 1154.30 4.739e-02
0 12 -0.3320 1263.50 7.399e-01
0 13 -5.0280 571.49 6.644e-07
0 14 -3.7093 958.88 2.198e-04
0 15 0.2214 390.58 8.249e-01
0 16 -3.9255 253.44 1.116e-04
0 17 -4.4733 1005.42 8.580e-06
0 18 0.4135 350.62 6.795e-01
0 19 -2.6460 180.60 8.864e-03
0 20 -4.3779 986.11 1.326e-05
0 21 -1.3222 371.13 1.869e-01
0 22 1.3508 63.34 1.816e-01
0 23 -5.6223 1250.84 2.322e-08
0 24 -1.8931 1254.80 5.858e-02

Cohen’s D

i_D_baseline <- mapply(i_my.d, 0, seq(1,24,1),SIMPLIFY=FALSE)
baseline_cohen_d(i_D_baseline)
Summary of Cohen's D
t t + 1 Cohen's d
0 1 -0.1966
0 2 -0.0544
0 3 -0.0174
0 4 -0.2467
0 5 -0.0310
0 6 -0.1358
0 7 -0.2047
0 8 -0.0967
0 9 -0.1296
0 10 -0.2305
0 11 -0.0996
0 12 -0.0177
0 13 -0.3562
0 14 -0.1786
0 15 0.0172
0 16 -0.3537
0 17 -0.2047
0 18 0.0327
0 19 -0.2453
0 20 -0.2011
0 21 -0.1028
0 22 0.1664
0 23 -0.2904
0 24 -0.0945

Mean Differences

i_mean_baseline <- mapply(i_mean, 0, seq(1,24,1)) #comparing time zero (03/2019) to each month
baseline_mean_diff(i_mean_baseline)
Summary of Mean Differences
t t+1 Mean Difference
0 1 -0.1748
0 2 -0.0504
0 3 -0.0149
0 4 -0.2082
0 5 -0.0266
0 6 -0.1159
0 7 -0.1744
0 8 -0.0846
0 9 -0.1162
0 10 -0.1958
0 11 -0.0843
0 12 -0.0150
0 13 -0.3028
0 14 -0.1477
0 15 0.0147
0 16 -0.3094
0 17 -0.1805
0 18 0.0278
0 19 -0.2086
0 20 -0.1757
0 21 -0.0871
0 22 0.1422
0 23 -0.2490
0 24 -0.0833

We-words

T-test

we_ttest_baseline <- mapply(we_my.t, 0, seq(1,24,1), SIMPLIFY=FALSE) #compare time zero (03/2019) to each month t[i]
baseline_ttest(we_ttest_baseline)
Summary of Welch's t-Tests
t t + 1 t-value Degrees of Freedom p-value
0 1 0.5718 1161.88 5.676e-01
0 2 1.5919 1008.45 1.117e-01
0 3 -1.0685 214.75 2.865e-01
0 4 0.6154 1116.23 5.384e-01
0 5 0.9396 979.10 3.476e-01
0 6 -1.1796 280.32 2.392e-01
0 7 -0.2036 1067.88 8.387e-01
0 8 0.6497 972.54 5.160e-01
0 9 -0.6307 351.29 5.286e-01
0 10 -0.9677 309.04 3.340e-01
0 11 -0.9268 1073.79 3.543e-01
0 12 -0.4002 1197.17 6.891e-01
0 13 3.3604 676.59 8.220e-04
0 14 5.6601 890.34 2.040e-08
0 15 0.4232 395.82 6.724e-01
0 16 3.3898 317.82 7.876e-04
0 17 5.1356 889.20 3.457e-07
0 18 -0.7164 361.98 4.742e-01
0 19 2.3094 191.38 2.199e-02
0 20 4.1802 873.54 3.205e-05
0 21 0.8667 390.06 3.866e-01
0 22 0.2288 64.77 8.198e-01
0 23 2.5427 1081.13 1.114e-02
0 24 2.2873 1080.95 2.237e-02

Cohen’s D

we_D_baseline <- mapply(we_my.d, 0, seq(1,24,1),SIMPLIFY=FALSE)
baseline_cohen_d(we_D_baseline)
Summary of Cohen's D
t t + 1 Cohen's d
0 1 0.0334
0 2 0.0778
0 3 -0.0967
0 4 0.0362
0 5 0.0469
0 6 -0.0989
0 7 -0.0123
0 8 0.0322
0 9 -0.0483
0 10 -0.0764
0 11 -0.0479
0 12 -0.0216
0 13 0.2229
0 14 0.2874
0 15 0.0327
0 16 0.2636
0 17 0.2567
0 18 -0.0557
0 19 0.2040
0 20 0.2103
0 21 0.0657
0 22 0.0271
0 23 0.1374
0 24 0.1205

Mean Differences

we_mean_baseline <- mapply(we_mean, 0, seq(1,24,1)) #comparing time zero (03/2019) to each month
baseline_mean_diff(we_mean_baseline)
Summary of Mean Differences
t t+1 Mean Difference
0 1 0.0531
0 2 0.1227
0 3 -0.1575
0 4 0.0545
0 5 0.0718
0 6 -0.1605
0 7 -0.0191
0 8 0.0495
0 9 -0.0766
0 10 -0.1218
0 11 -0.0731
0 12 -0.0335
0 13 0.3443
0 14 0.4207
0 15 0.0531
0 16 0.4180
0 17 0.3815
0 18 -0.0896
0 19 0.3273
0 20 0.3090
0 21 0.1048
0 22 0.0439
0 23 0.2022
0 24 0.1813

Build our Graphs

Analytic Thinking

Analytic <- ggplot(data=df2, aes(x=Date_mean, y=Analytic_mean, group=1)) +
  geom_line(colour = "dodgerblue3") +
  scale_x_date(date_breaks = "3 month", date_labels = "%Y-%m") +
  geom_ribbon(aes(ymin=Analytic_mean-Analytic_std.error, ymax=Analytic_mean+Analytic_std.error), alpha=0.2) +
  ggtitle("Analytic Thinking") +
  labs(x = "Month", y = 'Standardized score') +
  plot_aes + #here's our plot aes object
  geom_vline(xintercept = as.numeric(as.Date("2020-03-01")), linetype = 1) +
  geom_rect(data = df2, #summer surge
            aes(xmin = as.Date("2020-06-15", "%Y-%m-%d"), 
                xmax = as.Date("2020-07-20",  "%Y-%m-%d"),
                ymin = -Inf, 
                ymax = Inf),
            fill = "gray", 
            alpha = 0.009) +
  geom_rect(data = df2, #winter surge
            aes(xmin = as.Date("2020-11-15", "%Y-%m-%d"), 
                xmax = as.Date("2021-01-01",  "%Y-%m-%d"),
                ymin = -Inf, 
                ymax = Inf),
            fill = "gray", 
            alpha = 0.009)
Analytic <- Analytic + annotate(geom="text",x=as.Date("2020-07-01"),
                                y=43,label="Summer 2020 surge", size = 3) + 
  annotate(geom="text",x=as.Date("2020-12-03"),
           y=43,label="Winter 2020 surge", size = 3)
Analytic

Cogproc

Cogproc <- ggplot(data=df2, aes(x=Date_mean, y=cogproc_mean, group=1)) +
  geom_line(colour = "dodgerblue3") +
  scale_x_date(date_breaks = "3 month", date_labels = "%Y-%m") +
  geom_ribbon(aes(ymin=cogproc_mean-cogproc_std.error, ymax=cogproc_mean+cogproc_std.error), alpha=0.2) +
  ggtitle("Cognitive Processing") +
  labs(x = "Month", y = '% Total Words') +
  plot_aes + #here's our plot aes object
  geom_vline(xintercept = as.numeric(as.Date("2020-03-01")), linetype = 1) +
  geom_rect(data = df2, #summer surge
            aes(xmin = as.Date("2020-06-15", "%Y-%m-%d"), 
                xmax = as.Date("2020-07-20",  "%Y-%m-%d"),
                ymin = -Inf, 
                ymax = Inf),
            fill = "gray", 
            alpha = 0.009) +
  geom_rect(data = df2, #winter surge
            aes(xmin = as.Date("2020-11-15", "%Y-%m-%d"), 
                xmax = as.Date("2021-01-01",  "%Y-%m-%d"),
                ymin = -Inf, 
                ymax = Inf),
            fill = "gray", 
            alpha = 0.009)
Cogproc <- Cogproc + annotate(geom="text",x=as.Date("2020-07-01"),
                                y=12.5,label="Summer 2020 surge", size = 3) + 
  annotate(geom="text",x=as.Date("2020-12-03"),
           y=12.5,label="Winter 2020 surge", size = 3)
Cogproc

I-words

i <- ggplot(data=df2, aes(x=Date_mean, y=i_mean, group=1)) +
  geom_line(colour = "dodgerblue3") +
  scale_x_date(date_breaks = "3 month", date_labels = "%Y-%m") +
  geom_ribbon(aes(ymin=i_mean-i_std.error, ymax=i_mean+i_std.error), alpha=0.2) +
  ggtitle("I-usage") +
  labs(x = "Month", y = '% Total Words') +
  plot_aes + #here's our plot aes object
  geom_vline(xintercept = as.numeric(as.Date("2020-03-01")), linetype = 1) +
  geom_rect(data = df2, #summer surge
            aes(xmin = as.Date("2020-06-15", "%Y-%m-%d"), 
                xmax = as.Date("2020-07-20",  "%Y-%m-%d"),
                ymin = -Inf, 
                ymax = Inf),
            fill = "gray", 
            alpha = 0.009) +
  geom_rect(data = df2, #winter surge
            aes(xmin = as.Date("2020-11-15", "%Y-%m-%d"), 
                xmax = as.Date("2021-01-01",  "%Y-%m-%d"),
                ymin = -Inf, 
                ymax = Inf),
            fill = "gray", 
            alpha = 0.009)
i <- i + annotate(geom="text",x=as.Date("2020-07-01"),
                                y=1.95,label="Summer 2020 surge", size = 3) + 
  annotate(geom="text",x=as.Date("2020-12-03"),
           y=1.95,label="Winter 2020 surge", size = 3)
i

We-words

we <- ggplot(data=df2, aes(x=Date_mean, y=we_mean, group=1)) +
  geom_line(colour = "dodgerblue3") +
  scale_x_date(date_breaks = "3 month", date_labels = "%Y-%m") +
  geom_ribbon(aes(ymin=we_mean-we_std.error, ymax=we_mean+we_std.error), alpha=0.2) +
  ggtitle("We-usage") +
  labs(x = "Month", y = '% Total Words') +
  plot_aes + #here's our plot aes object
  geom_vline(xintercept = as.numeric(as.Date("2020-03-01")), linetype = 1) +
  geom_rect(data = df2, #summer surge
            aes(xmin = as.Date("2020-06-15", "%Y-%m-%d"), 
                xmax = as.Date("2020-07-20",  "%Y-%m-%d"),
                ymin = -Inf, 
                ymax = Inf),
            fill = "gray", 
            alpha = 0.009) +
  geom_rect(data = df2, #winter surge
            aes(xmin = as.Date("2020-11-15", "%Y-%m-%d"), 
                xmax = as.Date("2021-01-01",  "%Y-%m-%d"),
                ymin = -Inf, 
                ymax = Inf),
            fill = "gray", 
            alpha = 0.009)
we <- we + annotate(geom="text",x=as.Date("2020-07-01"),
                                y=6.5,label="Summer 2020 surge", size = 3) + 
  annotate(geom="text",x=as.Date("2020-12-03"),
           y=6.5,label="Winter 2020 surge", size = 3)
we

Tie them all together

graphs <- ggpubr::ggarrange(Analytic,Cogproc,i,we,ncol=2, nrow=2, common.legend = TRUE, legend = "bottom")
annotate_figure(graphs,
                top = text_grob("CEOs' Language Change",  color = "black", face = "bold", size = 20),
                bottom = text_grob("Note. Vertical Line Represents the onset of the pandemic. \n\ Horizontal shading represents Standard Error. Vertical bars represent virus surges."
                                   , color = "Black",
                                   hjust = 1.1, x = 1, face = "italic", size = 16))

Package Citations

report::cite_packages()
  - Fox J, Weisberg S (2019). _An R Companion to Applied Regression_, Third edition. Sage, Thousand Oaks CA. <https://www.john-fox.ca/Companion/>.
  - Fox J, Weisberg S, Price B (2022). _carData: Companion to Applied Regression Data Sets_. doi:10.32614/CRAN.package.carData <https://doi.org/10.32614/CRAN.package.carData>, R package version 3.0-5, <https://CRAN.R-project.org/package=carData>.
  - Grolemund G, Wickham H (2011). "Dates and Times Made Easy with lubridate." _Journal of Statistical Software_, *40*(3), 1-25. <https://www.jstatsoft.org/v40/i03/>.
  - Harrell Jr F (2025). _Hmisc: Harrell Miscellaneous_. doi:10.32614/CRAN.package.Hmisc <https://doi.org/10.32614/CRAN.package.Hmisc>, R package version 5.2-3, <https://CRAN.R-project.org/package=Hmisc>.
  - J L (2006). "Plotrix: a package in the red light district of R." _R-News_, *6*(4), 8-12.
  - Kassambara A (2019). _datarium: Data Bank for Statistical Analysis and Visualization_. doi:10.32614/CRAN.package.datarium <https://doi.org/10.32614/CRAN.package.datarium>, R package version 0.1.0, <https://CRAN.R-project.org/package=datarium>.
  - Kassambara A (2023). _rstatix: Pipe-Friendly Framework for Basic Statistical Tests_. doi:10.32614/CRAN.package.rstatix <https://doi.org/10.32614/CRAN.package.rstatix>, R package version 0.7.2, <https://CRAN.R-project.org/package=rstatix>.
  - Kassambara A (2025). _ggpubr: 'ggplot2' Based Publication Ready Plots_. doi:10.32614/CRAN.package.ggpubr <https://doi.org/10.32614/CRAN.package.ggpubr>, R package version 0.6.1, <https://CRAN.R-project.org/package=ggpubr>.
  - Kuhn, Max (2008). "Building Predictive Models in R Using the caret Package." _Journal of Statistical Software_, *28*(5), 1–26. doi:10.18637/jss.v028.i05 <https://doi.org/10.18637/jss.v028.i05>, <https://www.jstatsoft.org/index.php/jss/article/view/v028i05>.
  - Lin G (2023). _reactable: Interactive Data Tables for R_. doi:10.32614/CRAN.package.reactable <https://doi.org/10.32614/CRAN.package.reactable>, R package version 0.4.4, <https://CRAN.R-project.org/package=reactable>.
  - Müller K, Wickham H (2025). _tibble: Simple Data Frames_. doi:10.32614/CRAN.package.tibble <https://doi.org/10.32614/CRAN.package.tibble>, R package version 3.3.0, <https://CRAN.R-project.org/package=tibble>.
  - R Core Team (2025). _R: A Language and Environment for Statistical Computing_. R Foundation for Statistical Computing, Vienna, Austria. <https://www.R-project.org/>.
  - Rinker TW, Kurkiewicz D (2018). _pacman: Package Management for R_. version 0.5.0, <http://github.com/trinker/pacman>.
  - Robinson D, Hayes A, Couch S (2025). _broom: Convert Statistical Objects into Tidy Tibbles_. doi:10.32614/CRAN.package.broom <https://doi.org/10.32614/CRAN.package.broom>, R package version 1.0.10, <https://CRAN.R-project.org/package=broom>.
  - Sarkar D (2008). _Lattice: Multivariate Data Visualization with R_. Springer, New York. ISBN 978-0-387-75968-5, <http://lmdvr.r-forge.r-project.org>.
  - Torchiano M (2020). _effsize: Efficient Effect Size Computation_. doi:10.5281/zenodo.1480624 <https://doi.org/10.5281/zenodo.1480624>, R package version 0.8.1, <https://CRAN.R-project.org/package=effsize>.
  - Wickham H (2016). _ggplot2: Elegant Graphics for Data Analysis_. Springer-Verlag New York. ISBN 978-3-319-24277-4, <https://ggplot2.tidyverse.org>.
  - Wickham H (2023). _forcats: Tools for Working with Categorical Variables (Factors)_. doi:10.32614/CRAN.package.forcats <https://doi.org/10.32614/CRAN.package.forcats>, R package version 1.0.0, <https://CRAN.R-project.org/package=forcats>.
  - Wickham H (2025). _stringr: Simple, Consistent Wrappers for Common String Operations_. doi:10.32614/CRAN.package.stringr <https://doi.org/10.32614/CRAN.package.stringr>, R package version 1.5.2, <https://CRAN.R-project.org/package=stringr>.
  - Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). "Welcome to the tidyverse." _Journal of Open Source Software_, *4*(43), 1686. doi:10.21105/joss.01686 <https://doi.org/10.21105/joss.01686>.
  - Wickham H, François R, Henry L, Müller K, Vaughan D (2023). _dplyr: A Grammar of Data Manipulation_. doi:10.32614/CRAN.package.dplyr <https://doi.org/10.32614/CRAN.package.dplyr>, R package version 1.1.4, <https://CRAN.R-project.org/package=dplyr>.
  - Wickham H, Henry L (2025). _purrr: Functional Programming Tools_. doi:10.32614/CRAN.package.purrr <https://doi.org/10.32614/CRAN.package.purrr>, R package version 1.1.0, <https://CRAN.R-project.org/package=purrr>.
  - Wickham H, Hester J, Bryan J (2024). _readr: Read Rectangular Text Data_. doi:10.32614/CRAN.package.readr <https://doi.org/10.32614/CRAN.package.readr>, R package version 2.1.5, <https://CRAN.R-project.org/package=readr>.
  - Wickham H, Vaughan D, Girlich M (2024). _tidyr: Tidy Messy Data_. doi:10.32614/CRAN.package.tidyr <https://doi.org/10.32614/CRAN.package.tidyr>, R package version 1.3.1, <https://CRAN.R-project.org/package=tidyr>.
  - Xie Y, Cheng J, Tan X, Aden-Buie G (2025). _DT: A Wrapper of the JavaScript Library 'DataTables'_. doi:10.32614/CRAN.package.DT <https://doi.org/10.32614/CRAN.package.DT>, R package version 0.34.0, <https://CRAN.R-project.org/package=DT>.
  - Zeileis A, Grothendieck G (2005). "zoo: S3 Infrastructure for Regular and Irregular Time Series." _Journal of Statistical Software_, *14*(6), 1-27. doi:10.18637/jss.v014.i06 <https://doi.org/10.18637/jss.v014.i06>.
  - Zhu H (2024). _kableExtra: Construct Complex Table with 'kable' and Pipe Syntax_. doi:10.32614/CRAN.package.kableExtra <https://doi.org/10.32614/CRAN.package.kableExtra>, R package version 1.4.0, <https://CRAN.R-project.org/package=kableExtra>.

All credit goes to the great Dani Cosme for teaching me how to make these! You can find her github here!