Computer grading is here for STAAR essays. Should Fort Worth school leaders worry?

Having just adapted to a newly reformatted state test, school leaders across Texas are now facing another change in how their students are assessed: computer-based scoring.

The Texas Education Agency rolled out the new “automated scoring engine,” a computer-based grading system, in December, the Dallas Morning News reported. Following the change, about three-quarters of all essay questions will be scored by a computer program rather than human scorers.

School district leaders in the Fort Worth area say it’s too soon for them to tell whether the new grading system is a cause for concern. But some say they need more information about the new system.

“I think anytime a computer program is going to take on grading of something of this magnitude, I think it is concerning,” said Jennifer Price, chief academic officer for the Keller Independent School District.

Automated scoring comes amid STAAR reformat

The new scoring engine comes amid broader changes to the state test. Last year, the Texas Education Agency rolled out a newly revamped STAAR exam that includes more writing prompts and fewer multiple choice questions than previous versions. State education officials say the new test is designed to more closely mirror instruction students get in the classroom.

But open-ended responses like essays also take longer to score than multiple choice questions. TEA officials said using computer-based scoring in combination with human scorers allows the agency to score tests and get results back to districts more quickly and cheaply.

Chris Rozunik, director of the agency’s student assessment division, said the computer program scores exams based on the same rubric that human graders use. The agency is also using human-scored sample papers to train the engine on what to look for in students’ responses, she said.

Rozunik said the new engine isn’t an AI system with broad capabilities like ChatGPT, but rather a computer-based scoring system with narrow parameters. She noted the agency has used machine scoring for closed-ended questions like multiple choice prompts for years.

The agency is committed to having human scorers evaluate 25% of all essays, she said. The essays graded by humans include those the computer program can’t make sense of, as well as a certain number the agency randomly assigns for human review, she said.

The reasons the computer program might kick an essay to human graders are varied, Rozunik said. If a student enters a series of random letters instead of an answer, the computer won’t know how to evaluate it. But real answers, even good ones, can also baffle a computer program. If a student answers a question in a language other than English, the essay will be referred to a human, she said. Likewise, if a student gives an answer that is thoughtful and creative but doesn’t come in a form the computer recognizes, the answer will go to a human, who will be better able to score it appropriately, she said.

“We do not penalize kids for unique thinking,” she said.

The agency is already facing a lawsuit brought by several school districts, including the Fort Worth and Crowley independent school districts, over the state’s A-F accountability system, which is primarily based on STAAR scores. Last October, a state district judge temporarily blocked the agency from releasing that year’s A-F scores.

Fort Worth school officials want more clarity on scoring change

Price, the Keller ISD administrator, said she’s worried about what guardrails are in place for the new automated system. State education officials say the exam is no longer a high-stakes test for students, since their performance has no bearing on whether they move on to the next grade. But STAAR scores are still a high-stakes matter for school districts, since they’re the main factor in accountability ratings. Those ratings can affect how parents perceive their districts or campuses, ultimately influencing parents’ decisions about where to enroll their kids.

Given those stakes, Price doesn’t think state education officials have given districts enough information about how the new system works. The district has known the change was coming for about a year, she said, but TEA has given districts only limited details about what it would look like.

Melissa DeSimone, executive director of research, assessment and accountability for the Northwest Independent School District, said she doesn’t have enough data yet to know whether the new scoring system is a cause for concern. So far, TEA has used the automated engine only to score last December’s end-of-course exams. The district has gotten raw scores from that round of testing, she said, but hasn’t yet received students’ responses to test questions. Districts should get those responses sometime in late March, she said. At that point, the district can go through students’ answers and see whether they were scored appropriately, she said.

If the district does find discrepancies between the scores students received and the quality of their responses, officials can request that those tests be reevaluated by a human scorer, DeSimone said. The drawback is that those requests cost the district about $50 each if the scores come back the same, she said. The agency waives that fee if human scorers rate the response differently than the computer did.

District leaders have known that automated scoring was coming since the early part of last year, DeSimone said. The district didn’t adjust any of its test preparation because the automated scoring system is supposed to be based on the same rubric as human scoring, she said.

Fort Worth ISD officials weren’t available for an interview for this story. In an email, Melissa Kelly, the district’s associate superintendent of learning and leading, said there’s “a significant level of uncertainty” around how the new system will work.

So far, the district isn’t planning any major changes in response to the new scoring system, Kelly said. District leaders will stay focused on teaching Texas’ state-mandated standards and wait to see what results come out of the scoring change, she said.

Testing expert says automated scoring is growing

Kurt Geisinger, director of the Buros Center for Testing at the University of Nebraska–Lincoln, said the shift to automated grading shouldn’t be a big cause for concern for local school districts. Automated grading of essays is becoming more common across the country, he said, and for the most part, it’s been implemented without major problems.

A few years ago, Geisinger served as board chairman for the Graduate Record Examinations, an admissions test used by graduate schools across the country. At the time, the testing organization shifted to a hybrid AI-human grading model, in which each test would be scored by both a computer and a human, he said. The organization found that the AI program did about as well as the human grader, he said.

Geisinger said one of the admissions exams in use across the country — he wouldn’t say which test — is graded at least in part using AI. The grading program analyzes essays based on about 40 different criteria, he said. But the three factors that end up being most critical to the final score are the length of the essay, the number of paragraphs and the average word length, he said. That means those tests aren’t so much measuring the quality of writing as a few factors that often correlate with good writing, he said.

Using those factors as a proxy for judging the quality of writing has some drawbacks, Geisinger said. If a test-taker uses longer words, it can be a sign of a larger vocabulary, he said. But big words used awkwardly make for bad writing. If an AI system can’t tell whether a test-taker is using those words correctly, it may struggle to tell good writing from bad writing, he said.
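To make that point concrete, here is a minimal Python sketch, purely for illustration, of what the surface features Geisinger describes look like when a program computes them. It is not TEA’s scoring engine or any real vendor’s system; the function name and the toy example are hypothetical.

```python
# Illustrative sketch only -- not TEA's engine or any real vendor's system.
# It computes the three surface features Geisinger describes: essay length,
# number of paragraphs and average word length.

def surface_features(essay: str) -> dict:
    """Compute three surface-level features of an essay."""
    paragraphs = [p for p in essay.split("\n") if p.strip()]
    words = essay.split()  # raw whitespace tokens, punctuation included
    avg_word_length = (sum(len(w) for w in words) / len(words)) if words else 0.0
    return {
        "word_count": len(words),            # overall essay length
        "paragraph_count": len(paragraphs),  # number of paragraphs
        "avg_word_length": round(avg_word_length, 2),
    }

print(surface_features("Short essay.\nBig vocabulary words everywhere."))
# {'word_count': 6, 'paragraph_count': 2, 'avg_word_length': 6.67}
```

A sketch like this shows why such features are only a proxy: the numbers say nothing about whether the long words are used well, which is the gap Geisinger points to.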

Geisinger said some professors are also concerned about whether creativity in writing gets lost in the shift to AI grading, although he said he hasn’t seen any research to validate those concerns.

“I’ve heard English scholars say they wonder how someone like James Joyce would do on an AI-scored (test),” he said.