
Each fall the College Board releases its national and state SAT Reports on College & Career Readiness and ACT releases its Condition of College and Career Readiness report. And each year, they are met with exaggeration, hand-wringing, and misinterpretation by reporters forced to cover the “Sky is Falling” beat. Consider this headline that appeared in The Connecticut Mirror this past summer: “SAT shows large numbers of juniors unready for college or jobs.” The occasion for this claim was that one-third of Connecticut juniors failed to meet the College and Career Readiness reading and writing benchmarks established by the College Board. Could that be true?

Let’s get this out of the way. Benchmarks tell no one whether they should go to college or apply for a job. Thankfully, the state of Connecticut has a more sophisticated understanding of how to use benchmarks. “A benchmark score,” the Connecticut State Department of Education explains, “is one point on [a] scale. Two students with marginally different scale scores may be placed on either side of a benchmark. The [Department] views these new SAT benchmark scores as a useful but preliminary measure.”

But what is a benchmark? And should anyone care about them?

Here’s the short, pragmatic answer: students don’t really need to care about benchmarks since college admissions officers don’t look at them at all. But you’re not here for quick and practical, right? You’re here for a deep dive into the history and meaning of benchmarks. Let’s go!

Where Did College and Career Readiness Benchmarks Come From?

Our story begins in 2001, when a group of governors and other movers and shakers in education and business (including then-president of The College Board, Gaston Caperton) met for a National Education Summit. Their focus was “helping states address two key challenges: increasing the capacity of teachers and schools to meet higher standards and expanding testing and accountability systems to provide better data and stronger incentives for high student achievement.” From this meeting, about a decade later, the Common Core State Standards would be born. The meeting also raised the need for, you guessed it, benchmarks.

In 2004, the National Education Summit led to the release of Ready or Not: Creating a High School Diploma That Counts by the American Diploma Project. It introduced the term “college and workplace readiness” and argued that readiness needs to be gauged against standards based on the “real world.” The use of workplace, rather than career, is notable here, since the College and Career Readiness benchmarks are often misunderstood. They bear no connection to a person’s readiness to get a job after college. They are meant to reflect the readiness of a student to enter college or the workplace after high school. The Ready or Not report also made a recommendation that must have struck fear into the ACT and the College Board.

Use high school assessments for college admissions and placement. Little justification exists for maintaining completely separate standards and testing systems for high school graduation on the one hand and college admissions and placement on the other (15).

This recommendation–using state assessments, like the MCAS in Massachusetts, rather than the SAT or ACT for college admissions–could have been the end of the traditional admissions tests. What happened, instead, is that first ACT and then College Board started campaigning to get states to use their tests as assessments. In doing so, they took on the language of state assessments, including benchmarks.

In 2005, ACT created its “College Readiness Benchmarks.” By 2010, ACT had shifted the branding to “College and Career Benchmarks.” In 2013, ACT published a white paper to show the alignment of the ACT to the Common Core State Standards. This is how ACT defines its benchmarks in that paper:

The ACT College Readiness Benchmarks are the minimum scores required on each subject test on the ACT (English, mathematics, reading, and science) for students to have a high probability of success in credit-bearing, entry-level college courses in that subject area. ACT has set Benchmarks for the most commonly taken entry-level college courses (English Composition, College Algebra, introductory social science courses, and Biology) and for other courses (such as Calculus and Chemistry). Students who meet a Benchmark on the ACT have approximately a 50 percent likelihood of earning a B or better, and approximately a 75 percent likelihood of earning a C or better, in the corresponding college course or course area, without remediation. The Benchmarks give students, families, and educators useful information for assessing whether a student has mastered the skills they need to succeed in postsecondary education (4).

Playing catch-up in 2011, College Board issued its first College Readiness Benchmarks (again branded as “college readiness,” with no attempt to connect them to “career” or work). Soon after the initial launch, however, College Board shifted from the label “College Readiness Benchmarks” to “College and Career Readiness Benchmarks.” The new name clearly echoed the language of the Partnership for Assessment of Readiness for College and Careers, which was created in 2010 and developed the PARCC exam, one of the two main assessments built around the Common Core State Standards.

In the early days of the benchmarks, The College Board provided good, clear guidance on how to use them, explicitly stating that teachers, families, students, and colleges shouldn’t be concerned with them.

Building on the strengths of the SAT, best known as the nation’s leading college admission test, the new SAT Benchmark is designed exclusively for secondary school educators, administrators and policymakers working to prepare students for future learning and career opportunities. (College Board, 2011).

To make the point clearer, College Board provided this chart.

[Chart: College Board, 2011]

This guidance has been abandoned. The current guide to benchmarks advises the following:

The benchmarks are intended to be used by policymakers, administrators, educators, and parents to monitor the academic progress of a student or groups of students as they prepare for college and careers. (College Board, 2017)

The College Board does not explain why it changed its guidance, and it is possible to see the change as salutary. Isn’t more information better for everyone? Note, however, that College Board leaves students off the list of who should use the benchmarks, even though it puts the benchmarks on their score reports. We will return to the question of who should use the benchmarks and how, but first let’s dig deeper into what the benchmarks actually indicate.

What Do the Benchmarks Indicate?

The problem with tying “college and career readiness” to a test is that you need to fit a somewhat abstract concept (“readiness”) into a quantitative model, while keeping real-world applicability in mind. The ACT and College Board had to figure out how a score could indicate readiness for a broad range of colleges and careers. The exams are pretty good at telling you who can solve single-variable equations or properly use a semicolon, but how do those skills play out in college or the workplace?

The answer ACT and College Board came up with was to define college and career readiness in terms they were familiar with: the correlation between test scores and first-year GPA. In the same way that a hammer sees everything as a nail, the College Board and ACT see everything as a score. According to their worldview, if you can’t graph it, it ain’t real. So ACT and College Board looked at what scores predicted “success.”

[Graph: ACT math benchmarks, 2016]

From the start, ACT has defined success as having a 75% probability of earning a C or better in the corresponding college class. The College Board initially defined success as a 65% probability of earning an overall first-year college GPA of B– or higher, but in 2015 revised that definition to match the definition used by ACT.

We should not let these changing definitions of “success” pass by without comment. They point to the somewhat arbitrary nature of the definition of success. Would a C in your freshman math class count as success for you? 75% is a nice round number, but what is the relationship between a 75% chance of earning a C in freshman math and graduating on time?

A benchmark score sits on a continuum of scores and probabilities, and the 75% probability is just one point on that continuum. Had ACT and College Board chosen to, they could have set the benchmark at a score that predicts a 13% chance of earning an A or a 32% probability of earning a C in Algebra (that score is an 11, which is about what you’d get by guessing randomly on the entire ACT Math section). Given the somewhat arbitrary way the benchmarks define success, treating them as cutoffs that declare someone is or is not ready for college or work, as The Connecticut Mirror did, is especially troubling.
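To make the one-point-on-a-continuum idea concrete, here is a minimal sketch in Python. It assumes a logistic relationship between a section score and the probability of earning a C or better, with invented midpoint and scale parameters (nothing here is College Board’s actual model); the point is only that different definitions of “success” pick different scores off the same curve.

```python
import math

# Invented, illustrative parameters: NOT College Board's model.
MIDPOINT = 430  # score at which P(C or better) = 50%
SCALE = 45      # how quickly probability rises with score

def p_success(score):
    """Hypothetical probability of earning a C or better in a
    corresponding first-year course, given a section score."""
    return 1 / (1 + math.exp(-(score - MIDPOINT) / SCALE))

def benchmark_for(target_prob):
    """Invert the curve: find the score at which it crosses the
    chosen probability. A benchmark is just one crossing point."""
    return MIDPOINT + SCALE * math.log(target_prob / (1 - target_prob))

# Equally defensible definitions of "success" yield different benchmarks.
for p in (0.50, 0.65, 0.75):
    print(f"P(C or better) = {p:.0%} -> benchmark score = {benchmark_for(p):.0f}")
```

Under these invented parameters, a 75% cutoff lands near 480 and a 50% cutoff at 430; the curve never changes, only the point chosen on it.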

The fact that the SAT and ACT found different levels of college and career readiness among students in the Class of 2017 also suggests we should be cautious about putting too much stock in the accuracy of their benchmarks.

Percentage of Students in Class of 2017 Who Met Benchmarks

Exam | Math | English | Reading | Evidence-Based Reading and Writing | Composite/Combined Score
ACT  | 41%  | 61%     | 47%     | n/a                                | 27%
SAT  | 49%  | n/a     | n/a     | 70%                                | 46%

The most striking difference is in the English Language Arts outcomes. The SAT combines reading comprehension and writing into one score, while the ACT reports them separately. If we average the two ELA percentages on the ACT ((61% + 47%) ÷ 2), we get 54%, which is significantly lower than the 70% of students hitting the ELA benchmark on the SAT. So, who is right? Are around half of American students prepared for college English courses, as ACT contends, or almost three-fourths of them, as The College Board says?

Some of this disparity may be attributed to different testing populations and to how the companies determine their benchmarks. While both tests have expanded their geographical range in recent years, the ACT remains the dominant test in the South and Midwest (although its dominance is waning in the latter). The SAT is the more popular test in the Northeast and on the West Coast. We might just be seeing different levels of readiness, depending on where students live.

Or it could be that the ACT Reading benchmark is measured against first-year social science courses, while the SAT Reading benchmark is measured against first-year composition courses, which is what ACT uses for its English benchmark. The stark difference between the percentage of students meeting the benchmark in math versus English is explained in part by grading: English Composition grades tend to be less harsh than STEM course grades, since writing teachers may use grades to motivate improvement relative to a starting point.

The point here is that the SAT and ACT don’t agree on how many students meet the benchmarks or how they should be determined. That’s just another reason to treat them with caution.

Consider, too, how generic the “readiness” designation is. The benchmark is based on a sampling of schools, but does it take into account differences between schools? Or the differences among four-year colleges, two-year colleges, and the training programs against which career readiness is presumably gauged? If someone is ready for community college, is she equally ready for a four-year college or a program in carpentry or cosmetology? We do not know. The benchmarks are one-size-fits-all.

The lack of nuance around discussions of benchmarks is not helped by The College Board’s reporting of benchmarks with a traffic light scheme on student and counselor reports.

[Image: SAT traffic-light benchmarks]

The benefit of these colors is that they are easy to interpret; the problem is that they are just as easy to misinterpret. The benchmarks might be too encouraging. Green means go (as in, off you go to college!), but the reality is that a 480 in Evidence-Based Reading and Writing means that while you hit the benchmark, your score is actually below the national average.

Worse yet, the benchmarks might be discouraging, particularly for students who already face too many roadblocks on the way to higher education. Students and families could be getting the wrong signal from admissions tests that have been touted as engines of opportunity. There’s good research indicating that requiring students to take college entrance exams can increase college enrollment, but could the tests also do the opposite? Given the real and perceived importance of the tests in the application process, it’s certainly possible that a test score in the red could convince someone debating whether to apply to college not to do so. The small print on students’ downloaded score reports does tell them, “If you score below the benchmark, you can use the feedback and tips in your report to get back on track,” but how many students are reading the small print?

But what do these colors mean?

We’ve established that meeting a benchmark means getting a score that predicts a 75% chance of earning a C or higher in a corresponding first-year, college-level class. But what does it mean to be in the yellow or the red?

The colors measure whether a student is on track for their grade level. The College Board didn’t just make benchmarks for the SAT, however. It calculated what score would be needed to hit the benchmark for eighth graders, ninth graders, tenth graders, and eleventh graders.

[Table: College Board benchmarks, by grade]

This table is essential for understanding where a score falls on the benchmark spectrum. If you match or beat the benchmark for your grade, you are in the green. If you are below the benchmark for your grade but above the benchmark for the grade below you, you are in the yellow. If you are below the benchmark for the grade below you, you are in the red.

So if you scored a 420 on Math on the PSAT in the eleventh grade, you’re in the red, because 480 is the tenth-grade benchmark. If you scored 490, you’re in the yellow because you beat the tenth-grade benchmark but not your own grade’s. (A quick sketch of this logic follows.)
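Here is a minimal sketch of that traffic-light rule in Python. The tenth-grade math benchmark of 480 comes from the example above; the eleventh-grade value of 510 is an assumed, illustrative number, not an official College Board figure.

```python
# Grade-level math benchmarks. 480 (grade 10) is cited above;
# 510 (grade 11) is an assumed value for illustration.
GRADE_MATH_BENCHMARKS = {10: 480, 11: 510}

def traffic_light(score, grade, benchmarks=GRADE_MATH_BENCHMARKS):
    """Green: at or above your own grade's benchmark.
    Yellow: below it, but at or above the prior grade's benchmark.
    Red: below the prior grade's benchmark."""
    if score >= benchmarks[grade]:
        return "green"
    if score >= benchmarks[grade - 1]:
        return "yellow"
    return "red"

print(traffic_light(420, grade=11))  # red: below the tenth-grade 480
print(traffic_light(490, grade=11))  # yellow: beats 480 but not 510
print(traffic_light(510, grade=11))  # green: meets the grade 11 benchmark
```

Notice how narrow the yellow band is under these numbers (480 to 509); that narrowness is exactly the point of the next paragraph.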

The yellow part of the benchmark spectrum is very small. That is because it reflects the amount of improvement a student is expected to make in a year, and that’s not a lot (roughly 20-30 points per year). What this means is that the difference between being told you’re ready for college and being told you’re not can be quite small. A student might need to get only 2 or 3 more questions right on a section to go from being “not ready” for college to being “ready.” For some students that difference could be a matter of leaving questions blank on the exam. Not all students know that there is no penalty for guessing on the SAT, so they sometimes leave points (and benchmarks) on the table.

The SAT also provides Test and Subscore benchmarks (e.g., Reading, Analysis in Science, Heart of Algebra) and uses the same red/yellow/green indicators, but these aren’t really benchmarks. They are based on average performance on the test by grade, not on an external measurement (e.g., a grade in a college course). A real benchmark is measured against an external criterion, not against other test takers; the sketch below shows the difference.
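A tiny sketch, with invented numbers, makes the distinction concrete: a criterion-referenced benchmark compares a score to an external cutoff (one derived from, say, college course grades), while a norm-referenced indicator compares a score to the average of other test takers in the same grade.

```python
from statistics import mean

subscores = [5, 7, 8, 9, 9, 10, 11, 13]  # invented subscores for one grade

# Norm-referenced indicator: anchored to the group itself, so the
# line moves whenever the group's performance changes.
grade_average = mean(subscores)  # 9.0

# Criterion-referenced benchmark: anchored to something external,
# e.g., the subscore that predicted a 75% chance of a C or better.
EXTERNAL_CUTOFF = 10  # invented, for illustration

student = 9
print(student >= grade_average)    # True: "above average" among peers
print(student >= EXTERNAL_CUTOFF)  # False: short of the outside criterion
```

If every student improves, the average rises and the norm-referenced line moves with it; an external cutoff stays put. That is why a comparison to grade-level averages is not a benchmark in the strict sense.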

But what about career readiness? We haven’t really addressed how the benchmarks measure “career” readiness, and that is because ACT and College Board have been very quiet about the “career” component. The basic assertion in labeling these “college and career” benchmarks seems to be that jobs require reading and math. We can all give a hearty “well, duh!” to that sentiment, but how much math is needed? What kind? What level of reading and writing skills are necessary? And for what jobs? ACT has at least done some work to define what “career” means in college and career readiness. A 2006 report (recently revised) defines a career as a job requiring less than a bachelor’s degree.

ACT has recently revised the way it defines the career readiness standard and created its own certificate of career readiness. The mechanism for determining whether a person has earned a certificate is a series of ACT-designed exams known as WorkKeys, which test students on Applied Mathematics, Reading for Information, and Locating Information. Students’ certificate levels are determined by their performance on those exams. The ACT exam’s Career Readiness benchmark, according to an ACT white paper, indicates a student’s progress toward one of these certificates. In other words, the ACT career benchmark is tied to a different ACT exam and nothing else. ACT even made a chart!

[Chart: ACT Career Readiness, 2014]

This circular logic, believe it or not, is more than the College Board provides. It never explains how the SAT benchmarks are tied to success in the workplace.

What Does All This Mean?

The upshot of all of this is that students, families, and counselors probably should not give much, if any, consideration to benchmarks. They are ill-defined, they will likely play no role in college admissions decisions, and they seem to provide no useful information about finding an occupation. Because the benchmarks add little of value for students and might play a discouraging role, we encourage The College Board and ACT to remove them from score reports and report them at the district level only.

Akil Bello, Director of Equity and Access, and James Murphy, Director of National Outreach
