Introduction
Purpose
COVID-19 affected almost all areas of life, and education is likely to face some of the pandemic's most prominent long-lasting effects. It is well understood that COVID and the resulting upheaval of education practices had a strong negative effect on standardized test scores, but there is less insight into how COVID affected existing disparities in education. Focusing on economic status and race/ethnicity, we explore how existing gaps in performance on standardized tests changed as a result of COVID in the state of California.
Background and Data
Each year, the California Assessment of Student Performance and Progress (CAASPP) conducts the English Language Arts/Literacy and Mathematics standardized tests. These tests are administered to students in grades 3 - 8 and 11. California reports, by school, school district, and county, the percent of students who exceed, meet, nearly meet, and don’t meet expectations, as well as the number of students who took the test. Demographic information about students, including but not limited to race/ethnicity, socioeconomic status, and disability status, is also reported.
California did not conduct standardized testing in 2020, and did not require it in 2021, due to COVID-19. As a result, we use the 2019 and 2022 standardized test results in the analysis below.
Questions
- Did COVID increase the testing performance gap in CA counties between students who are economically disadvantaged and those who are not?
- Did COVID increase the testing performance gap in CA counties between students who are White and those who are Hispanic or Latino?
Definitions
- We chose to use the percent of students who met or exceeded expectations on the standardized test as our performance success metric.
- A performance gap is the difference in the percent of students who met or exceeded expectations between the overperforming and underperforming groups. In our specific case, this means subtracting the performance of Economically Disadvantaged students from the performance of Economically Advantaged students, and subtracting the performance of Hispanic/Latino students from the performance of White students. Another way to think of this is the “overperformance” of the advantaged group.
- A change in performance gap is the performance gap in 2019 subtracted from the performance gap in 2022. So, if the performance gap in 2019 was 25 percentage points, and the performance gap in 2022 was 30 percentage points, the change in performance gap would be 5 percentage points. This represents a widening of the performance gap from 2019 to 2022.
- We perform all analysis in this blog on aggregate data across all grade levels (3 - 8 and 11).
- We performed analysis only for counties that tested more than 500 students in both 2019 and 2022. This led to the exclusion of two counties: Alpine and Sierra. All of the other 56 CA counties reported tests for more than 500 students in these two years.
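The definitions above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the `results` data frame, its column names, and every number in it are invented for the example and are not taken from the CAASPP files.

```r
# Hypothetical county-level results; all names and values are invented.
results <- data.frame(
  county   = rep(c("County A", "County B", "County C"), each = 4),
  year     = rep(c(2019, 2019, 2022, 2022), times = 3),
  group    = rep(c("advantaged", "disadvantaged"), times = 6),
  pct_met  = c(65, 35, 60, 32,  70, 45, 66, 38,  55, 30, 50, 28),
  n_tested = c(4000, 6000, 3800, 5900,  900, 700, 850, 720,  150, 200, 140, 190)
)

# Keep only counties that tested more than 500 students in every year
totals     <- aggregate(n_tested ~ county + year, data = results, FUN = sum)
min_tested <- tapply(totals$n_tested, totals$county, min)
kept       <- results[results$county %in% names(min_tested)[min_tested > 500], ]

# Performance gap: advantaged group minus disadvantaged group, per county and year
adv  <- kept[kept$group == "advantaged",    c("county", "year", "pct_met")]
dis  <- kept[kept$group == "disadvantaged", c("county", "year", "pct_met")]
gaps <- merge(adv, dis, by = c("county", "year"), suffixes = c("_adv", "_dis"))
gaps$gap <- gaps$pct_met_adv - gaps$pct_met_dis

# Change in performance gap: 2022 gap minus 2019 gap (in percentage points)
change <- merge(
  gaps[gaps$year == 2019, c("county", "gap")],
  gaps[gaps$year == 2022, c("county", "gap")],
  by = "county", suffixes = c("_2019", "_2022")
)
change$change_in_gap <- change$gap_2022 - change$gap_2019
print(change)
```

In this made-up data, County C is dropped by the 500-student filter, a negative `change_in_gap` means the gap narrowed, and a positive value means it widened.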
Hypothesis
We predicted that the change in performance gap values would be positive and significant. That is, we predicted that the performance gaps would increase in size from pre-COVID tests (2019) to post-COVID tests (2022). We expected this to be the case both for race/ethnicity data and for socioeconomic status.
Economic Status
Overview - Performance Change
English
Math
For English, around 30-40% of economically disadvantaged students met or exceeded performance expectations, compared to 20-30% for math. By contrast, 60-70% of students who were not economically disadvantaged met or exceeded expectations in English, which is also about 10 percentage points more than what was seen for math for that demographic. Critically, performance decreased from 2019 to 2022 across English and Math tests for both demographic groups, but the decrease was larger for students who are not economically disadvantaged.
Pre-COVID (2019)
English
Math
Post-COVID (2022)
English
Math
Both 2019 and 2022 showed similar geographical patterns. For both test types, the performance gaps are largest in urban areas, such as the counties in the Bay Area and along the southern coast. Additionally, the performance gap for the English language tests appears to be more uniform throughout the state than the math performance gap. Also critical: every analyzed county shows a positive and significant performance gap on economic status, for both years and across both test types.
Change in Performance Gaps
English
Math
When comparing performance gaps from before and after COVID, we see that the size of the performance gap decreased over the course of COVID for most of CA across both subjects. The two most apparent exceptions to this generalization are Alameda and Sutter. These counties saw their performance gaps grow by more than two percentage points for both subjects.
Race/Ethnicity
Overall 2022 Performance
English
Math
The two largest race/ethnicity groups by population in this data set are White students and Hispanic or Latino students. We chose to analyze the difference between these two groups for the race/ethnicity component of our analysis.
Pre-COVID (2019)
English
Math
Post-COVID (2022)
English
Math
The geographic distribution of the size of the performance gap between White and Hispanic or Latino students is fairly consistent across both subjects and years. In all instances, the largest performance gaps are located in counties along the coast south of Sonoma, as well as in Mono and Inyo counties along CA’s eastern border.
Change in Performance Gaps
English
Math
Unlike the changes in performance gaps for economic status, the changes in the gaps between White and Hispanic or Latino students are much more varied throughout the state and do not appear to follow a geographic pattern for either English or Math. The only major discernible pattern is that, in general, the counties surrounding urban areas saw their performance gaps grow. Additionally, we noted that Trinity County and Inyo County marked the two extremes: Trinity saw its performance gap increase by at least five percentage points for both subjects, whereas Inyo saw its gap decrease by at least the same amount, again for both subjects.
Demographic Breakdown
Since the gaps are fairly widely distributed, we wanted to examine the distribution compared to the share of students in each county that are identified as Hispanic/Latino. We present a geographic representation by county, for 2019 and 2022, of this proportion.
2019 - English
2022 - English
Since the proportion of students who identified as Hispanic or Latino is nearly identical in each county for both English language tests and Math tests, we only chose to display the data corresponding to the English test. The distribution was also very similar from 2019 to 2022. We see that for both years, the proportion of Hispanic or Latino students increases from north to south within CA, ranging from as low as 13% in Trinity County all the way up to 96% in Imperial County.
Performance Gap vs Demographic Composition
Lastly, we plotted the Change in Performance Gap from 2019 to 2022 against the percent of students who were Hispanic/Latino in that county in 2022.
English
Math
The scatter plots demonstrate that the proportion of students identifying as Hispanic or Latino is largely uncorrelated with the change in performance gap over COVID for both the English language and Math tests. Thus, the share of Hispanic/Latino students in a county does not predict the change in the performance gap between White and Hispanic/Latino students.
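One way to put a number on "largely uncorrelated" is a Pearson correlation between the two plotted variables. The sketch below uses base R with made-up values; `county_summary` and its columns are hypothetical names, and the numbers are not our county results.

```r
# Hypothetical per-county summary; values are invented for illustration only.
county_summary <- data.frame(
  pct_hispanic_latino = c(15, 30, 45, 60, 75, 90),        # share of students, in %
  change_in_gap       = c(1.8, -1.0, 2.4, 0.3, 3.1, -0.8) # percentage points
)

# Pearson correlation between demographic share and change in gap
r <- cor(county_summary$pct_hispanic_latino, county_summary$change_in_gap)

# For a one-predictor linear fit, R-squared equals the squared correlation,
# so it gives the share of variance in the change that the share explains
fit       <- lm(change_in_gap ~ pct_hispanic_latino, data = county_summary)
r_squared <- summary(fit)$r.squared

print(c(correlation = r, r_squared = r_squared))
```

A correlation near zero (and therefore an R-squared near zero) would match what the scatter plots show visually.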
Limitations
Over the course of our research, we encountered a few limiting factors:
- The lack of or limits on testing data from 2020 and 2021 prevented us from using these years in our analysis, so we could not examine the immediate effects of COVID.
- Because CA is a heavily populated state, sample sizes are large and almost every result is statistically significant, so we chose not to focus on statistical significance.
- The original color scales available for the maps were not able to differentiate performance gaps near 0, so we had to adjust the ‘stops’ along the scale for better visualization.
- While there was a mountain of data available to us, we were forced to limit the scope of our investigation in order to be able to fully answer our questions of interest.
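The sample-size limitation above can be illustrated with a two-proportion test. This is a sketch with hypothetical counts, not our data, and `prop.test` from base R's stats package is used here only as an example of a significance test; it is an assumption for illustration rather than a description of the tests behind our results.

```r
# Hypothetical counts: the same half-point difference (30.5% vs 30.0%)
# at two very different sample sizes.
large_sample <- prop.test(x = c(30500, 30000), n = c(100000, 100000))
small_sample <- prop.test(x = c(305, 300),     n = c(1000, 1000))

# The estimated proportions are identical in both cases...
print(large_sample$estimate)
print(small_sample$estimate)

# ...but only the large sample yields a small p-value
print(c(large = large_sample$p.value, small = small_sample$p.value))
```

With hundreds of thousands of students, even a half-point difference clears the usual 0.05 threshold, which is why the size of a gap is more informative here than whether it is significant.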
Conclusions
Economic Status
| | English | Math |
|---|---|---|
| 2019 Gap | 30.50 | 31.40 |
| 2022 Gap | 29.43 | 30.22 |
| Net Change in Gap | -1.07 | -1.18 |
Statewide, the performance gap between students who are economically disadvantaged and those who are not decreased by about one percentage point from 2019, pre-COVID, to 2022, post-COVID, for both English language tests and Math tests. Additionally, it appears that the performance gap for the Math tests is about one percentage point larger than the gap for the English language tests across both years.
Since this performance gap decreased over the course of COVID for both subjects (as seen in the negative net change in the table above and in the county maps), our hypothesis was wrong for economic status. One potential explanation is that the additional educational resources that typically benefit students who are not economically disadvantaged were no longer accessible, or were less accessible, than before COVID.
Race/Ethnicity
| | English | Math |
|---|---|---|
| 2019 Gap | 16.73 | 21.18 |
| 2022 Gap | 19.20 | 23.79 |
| Net Change in Gap | 2.47 | 2.61 |
Looking at statewide data comparing White students and Hispanic or Latino students, we can see that the size of the performance gap on standardized tests increased by about 2.5 percentage points for both subjects from pre-COVID to post-COVID. Again, similar to the trend seen with economic status, the performance gaps for the English language tests were smaller than those for the Math tests. However, when looking at race/ethnicity, the English performance gaps were a full 4.5 percentage points smaller than the Math performance gaps.
Since the performance gaps for both subjects saw an overall increase over the course of COVID (as seen by a positive net change), our initial hypothesis was correct in predicting that the performance gap would grow. In line with the reasoning behind our hypothesis, for certain demographics, the factors that led to disparities in educational performance were only exacerbated by the limitations of COVID.
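The net changes in the two statewide tables follow directly from our definition of a change in performance gap (2022 gap minus 2019 gap). As a quick arithmetic check, the reported gaps can be typed into a small data frame; the object and column names are just illustrative.

```r
# Statewide gaps as reported in the Economic Status and Race/Ethnicity tables
statewide_gaps <- data.frame(
  demographic = c("Economic status", "Economic status",
                  "Race/ethnicity", "Race/ethnicity"),
  subject     = c("English", "Math", "English", "Math"),
  gap_2019    = c(30.50, 31.40, 16.73, 21.18),
  gap_2022    = c(29.43, 30.22, 19.20, 23.79)
)

# Net change in gap: 2022 gap minus 2019 gap, in percentage points
statewide_gaps$net_change <- round(statewide_gaps$gap_2022 - statewide_gaps$gap_2019, 2)

# Math gap minus English gap within each demographic, per year
math    <- statewide_gaps[statewide_gaps$subject == "Math", ]
english <- statewide_gaps[statewide_gaps$subject == "English", ]
subject_difference <- data.frame(
  demographic = math$demographic,
  diff_2019   = round(math$gap_2019 - english$gap_2019, 2),
  diff_2022   = round(math$gap_2022 - english$gap_2022, 2)
)

print(statewide_gaps)
print(subject_difference)
```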
So what?
While our results were divided, the confirmation of our hypothesis for race/ethnicity, that the performance gap on standardized tests would increase from 2019, pre-COVID, to 2022, post-COVID, is enough evidence to suggest that more research into other demographic markers is needed to determine which groups were hit hardest academically over the course of COVID.
Educational institutions are already working hard to mitigate the widespread negative impact of COVID on learning and testing performance. However, if the demographics that were more severely affected by COVID protocols do not receive additional attention, previous performance gaps are likely to continue to grow, leaving behind an entire generation of children.
References
Code for Choropleth Visualizations
We used the highcharter package to make our map visualizations. We needed some custom programming to get our preferred diverging and sequential color schemes. For the diverging scheme, we space the color stops non-linearly along the scale so that values near 0 are easier to tell apart.
# Load the packages used below: highcharter for the maps, and dplyr for
# the pipe, filter(), and tibble()
library(highcharter)
library(dplyr)

# Create vectors for the diverging and sequential color schemes
diverging_colors <- c("#b2182b",
                      "#d6604d",
                      "#f4a582",
                      "#fddbc7",
                      "#f7f7f7",
                      "#d1e5f0",
                      "#92c5de",
                      "#4393c3",
                      "#2166ac") %>%
  rev()
sequential_colors <- c("#67001f",
                       "#b2182b",
                       "#d6604d",
                       "#f4a582",
                       "#fddbc7",
                       "#f7f7f7") %>%
  rev()
# Create tibble objects, and then lists, for the diverging and sequential schemes.
# Note: tibble() has no `stringsAsFactors` argument (passing it would add an
# unwanted third column to each stop), so it is omitted here.
diverging_stops <- tibble(
  # A non-linear distribution of where the color stops should fall
  q = c(0, .22, .38, .46, .5, .54, .62, .78, 1),
  c = diverging_colors
)
# Parse to list
diverging_stops <- list_parse2(diverging_stops)
n <- length(sequential_colors) - 1
sequential_stops <- tibble(
  # A linear distribution of where the color stops should fall
  q = 0:n / n,
  c = sequential_colors
)
# Parse to list
sequential_stops <- list_parse2(sequential_stops)
# Create standard function to simplify map creation process
# Note: `df` is a dataframe which requires a column with the metric
# to graph, as well as a column called `test_name` that can be filtered
# based on the `subject` argument. `metric` is the column name as a string.
plot_map <- function(df, metric, subject, name, title = NULL) {
  # Only select one subject from data frame
  df_modified <- filter(df, test_name == subject)
  # Calculate the min and max values; determine sequential or diverging color stops.
  # `df[[metric]]` looks the column up by its string name.
  min_val <- min(df[[metric]])
  max_val <- max(df[[metric]])
  manual_scale_min <- NULL
  manual_scale_max <- NULL
  if (min_val < 0) {
    stops <- diverging_stops
    # Manually set the max and min for the color scale by finding the value
    # of the greatest magnitude and rounding up. This way, 0 is the center
    # of each distribution.
    highest_abs <- max(abs(c(min_val, max_val))) %>%
      plyr::round_any(2.5, ceiling)
    manual_scale_min <- highest_abs * -1
    manual_scale_max <- highest_abs
  } else {
    stops <- sequential_stops
    # Set the min and max to be standard across both subjects in a dataframe
    # since the `min_val` and `max_val` calculation are based on `df`,
    # not `df_modified`. The min is rounded down and the max rounded up so
    # that every value falls inside the color scale.
    manual_scale_min <- plyr::round_any(min_val, 2.5, floor)
    manual_scale_max <- plyr::round_any(max_val, 2.5, ceiling)
  }
  # Output the map; return this object
  hcmap(
    "countries/us/us-ca-all",
    data = df_modified,
    value = metric,
    joinBy = c("name", "county_name"),
    name = name,
    dataLabels = list(enabled = TRUE, format = "{point.name}"),
    borderColor = "#1a1a1a",
    borderWidth = 0.1,
    # Formatting the tooltip as percentages
    tooltip = list(
      valueDecimals = 2,
      valueSuffix = "%"
    )
  ) %>%
    # Include the pre-selected color scale
    hc_colorAxis(stops = stops,
                 min = manual_scale_min,
                 max = manual_scale_max) %>%
    # Add optional title
    hc_title(text = title)
}
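To see what the diverging scale does in practice, the following base-R sketch computes the symmetric limits and the data values at which each color stop falls. `symmetric_limits` is a hypothetical helper written for this illustration (it reproduces what `plyr::round_any(x, 2.5, ceiling)` does in the function above), and the input values are made up.

```r
# Hypothetical helper: symmetric color-scale limits rounded up to a multiple of `step`
symmetric_limits <- function(values, step = 2.5) {
  highest_abs <- ceiling(max(abs(values)) / step) * step
  c(min = -highest_abs, max = highest_abs)
}

# Made-up changes in performance gap (percentage points)
example_changes <- c(-3.1, 0.4, 6.2)
limits <- symmetric_limits(example_changes)

# Positions of the diverging stops along the scale (0 = min, 1 = max)
q <- c(0, .22, .38, .46, .5, .54, .62, .78, 1)

# Data value at which each stop sits: min + q * (max - min)
stop_values <- unname(limits["min"] + q * (limits["max"] - limits["min"]))

print(limits)
print(stop_values)
```

Because the stops cluster around the midpoint, several color transitions land close to 0 on the data scale, which is what lets small positive and negative changes show up as visibly different colors.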
Data
California Assessment of Student Performance and Progress (n.d.), “English Language Arts/Literacy and Mathematics: Smarter Balanced Summative Assessments,” available at https://caaspp-elpac.ets.org/caaspp.