Reading scores slip in Fort Worth ISD, statewide on newly reformatted STAAR exam

Reading scores in the Fort Worth Independent School District slipped a few points on this year’s state test, according to results released Wednesday by the Texas Education Agency.

This year marked the first time students took a newly reformatted exam, and the first time every student in the state took the test online. Because of the change, education experts caution against making too much of comparisons with results from previous years.

This year, 32% of Fort Worth ISD’s third-graders scored on grade level in reading, and 60% either approached or met grade level, according to the results. That represents a six-percentage-point decline from last year, when 38% met grade level and 66% either approached or met grade level.

Statewide, the percentage of third-graders who met grade level in reading slipped two points compared to last year, from 50% in the spring of 2022 to 48% this spring.

Education researchers and policy makers often say third grade is the time students stop learning to read and start reading to learn, making it a critical point in their educational careers. Students who don’t read proficiently by third grade are more likely to struggle later on in school, and less likely to graduate from high school, researchers say.

In math, third-graders held steady compared to last year, with 27% scoring on grade level both years. This year, 57% either approached or met grade level, a slight uptick from 55% last year.

New test format could affect results, UT prof says

TEA officials have said the new STAAR format is designed to more closely mirror the instruction students get in class. The new test includes fewer multiple-choice questions and reading passages that incorporate information students should have learned in other classes, like social studies and science. It also introduces several new question types: constructed-response questions, in which students are asked to write a few sentences in response to a prompt; “hot text” questions, which ask students to cite evidence by highlighting lines in a paragraph; and multiple-choice questions in which students may select more than one correct answer.

Sarah Woulfin, a professor of education leadership and policy in the College of Education at the University of Texas at Austin, said education leaders should use caution when comparing this year’s results to last year’s. Even if the test isn’t substantially harder than it was last year, it’s still different enough that changes could affect results, especially in the first year, she said.

There’s “a robust evidence base” showing that taking a test online is a different experience from taking it on paper, she said. In part, that difference comes from the fact that some strategies kids might use on a paper test don’t work in a digital format, Woulfin said. When students take a pencil-and-paper test, they might underline or circle certain words or sections of a reading passage, or cross out incorrect answers on multiple-choice questions, she said. But they can’t do that on a computer screen. That difference shows up not only in reading and language arts, but also in math sections that involve word problems, she said.

Another key difference this year was the addition of evidence-based writing questions, in which students were asked to read passages and write essays in response to a prompt, using evidence from the text they read to support their arguments. Woulfin said she saw districts across Texas training teachers on how to teach evidence-based writing in December and January, a few months before students took the test. But that change represents a major shift in how many teachers have taught students to write in the past, Woulfin said, and it may take years for the full effects of that change to show up in student achievement data.

“It’s going to take more than a few months of instruction,” she said.

The new exam also uses cross-curricular reading passages, meaning students are asked to read and answer questions about short texts dealing with topics they should have learned in other subjects, like science and social studies. That happened to a certain extent in the old version of the test, Woulfin said, but it wasn’t a major area of emphasis.

Districts across the state are moving toward reading curricula based in the science of reading, and most are using classroom materials that weave in cross-curricular reading selections like the ones students see on the test. But that’s a relatively recent development, and Woulfin said many districts are still in the early stages of rolling out those materials. If students haven’t had much exposure to those kinds of questions, their performance on the test could suffer as a result, she said.

Woulfin said she suspects districts’ decision to adopt that style of reading materials was motivated at least in part by changes to the state assessment. While she thinks it’s worrisome when districts tailor their instruction too closely to tests, she said she thinks students will ultimately benefit from reading instruction that touches on a wide variety of topics, regardless of why their school uses it.

North Texas lawmaker proposes doing away with STAAR

Researchers and politicians have homed in on reported flaws of the STAAR test in recent years. A 2019 study from the Meadows Center for Preventing Educational Risk at UT Austin, which found inconsistencies in the readability of test items, was at the heart of a proposed bill to provide an alternative to STAAR.

Rep. Matt Shaheen, a Republican who represents West Plano and far North Dallas, filed House Bill 680 during this year’s legislative session to create a new testing model that would be “adaptive” and “growth-based.”

“What adaptive testing does is it looks to see how well you’re answering the previous questions, and it’ll ‘adapt’ the questioning based on how you’re doing on the previous questions. The benefit of that is it arms the teachers with more information with respect to each of the students and where they need some additional help,” Shaheen told the Star-Telegram on Tuesday.

Rather than a fixed set of questions given in the spring with results released after the school year ends, he said, there would be multiple, shorter tests throughout the year, allowing teachers to see results sooner and use them to track students’ progress.
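
For readers curious how that feedback loop might work, here is a minimal, hypothetical sketch in Python. The item bank, five difficulty levels and step-up/step-down rule are illustrative inventions, not part of House Bill 680 or any actual assessment; real adaptive tests typically rely on statistical models such as item response theory.

```python
# Hypothetical sketch of an adaptive test loop: the next question's
# difficulty steps up after a correct answer and down after an
# incorrect one. A toy version of the basic feedback idea only.

import random

# Hypothetical item bank: difficulty level -> list of (question, answer)
ITEM_BANK = {
    1: [("2 + 3 = ?", "5")],
    2: [("12 - 7 = ?", "5")],
    3: [("6 x 4 = ?", "24")],
    4: [("84 / 7 = ?", "12")],
    5: [("15% of 200 = ?", "30")],
}

def run_adaptive_test(get_answer, num_items=5, start_level=3):
    """Ask num_items questions, adapting difficulty to performance.

    get_answer(question) -> str is supplied by the caller (e.g. a UI).
    Returns a per-question log a teacher could review.
    """
    level, log = start_level, []
    for _ in range(num_items):
        question, correct = random.choice(ITEM_BANK[level])
        response = get_answer(question)
        right = response.strip() == correct
        log.append({"level": level, "question": question, "correct": right})
        # Core adaptive rule: move difficulty up or down within bounds.
        level = min(level + 1, 5) if right else max(level - 1, 1)
    return log

# Example: a simulated student who answers every question with "5".
for item in run_adaptive_test(lambda q: "5"):
    print(item)
```

The key line is the last one in the loop: each answer immediately changes the difficulty of the next question, which is what generates the per-student detail Shaheen describes.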

End-of-course assessments, or EOCs, are given throughout the year, and the number of tests a student takes depends on their grade, according to the Texas Education Agency. Most students have two to four testing days during the school year.

“It’s important that it doesn’t increase the amount of testing… but it does allow the teachers to look at the progress of their students,” Shaheen said of the proposed alternative.

The 2019 UT Austin study aligns with the concerns he’s heard for years from teachers and principals in his district, Shaheen said.

Although the study determined STAAR mostly met state content standards, it concluded that “analyzing item readability in a reliable manner” was unachievable.

“Unless and until additional research provides clear guidance and evidence of a reliable way to evaluate item readability, we cannot recommend conducting analyses of the grade-level readability of test items,” the study states.

The study revealed that results shifted substantially when analyzing whether test items met reading grade levels. The metrics used included word and sentence length, word difficulty, syntax and vocabulary load. But readability is only one component of measuring a test question’s difficulty, and not the main one, according to the study’s authors.

Shaheen’s proposal to do away with the exam has yet to gain traction. This year’s regular session was the second time he introduced it, but he plans to bring it back during the state’s special session slated for this fall. If not in the fall, then in 2025.

“It’s always difficult to do a significant amount of change, and that’s what this bill does,” he said. “There’s always resistance to change, and quite frankly, big ideas often take multiple sessions.”

Another 2019 study, by Texas A&M University-Commerce professors, determined that, judged by readability, STAAR tests in most cases had difficulty levels one to two years ahead of the grade level being assessed.

“Thus, it is believed that many students may be failing the STAAR test because the passages are written above their grade level,” according to the study.

“Failing high-stakes tests, such as the STAAR, affects students, teachers, and districts in many ways, including the costs of remediation and tutoring programs and materials. The label of ‘failure’ hurts the self-esteem and morale of students and teachers when they are doing their best to cover and learn the material that needs to be taught at each grade level,” the study states.

The study expanded on a previous 2012 study and looked specifically at the 2018 STAAR reading tests for grades 3-8. Researchers scored each reading passage on eight different readability scales and averaged those scores to produce a single figure per passage. Those passage averages were then combined to calculate the overall readability average for the reading test at each grade level.
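
As a rough illustration of that averaging procedure, here is a short, hypothetical sketch. It assumes the open-source textstat Python package, and the five formulas and sample passages below are stand-ins; the study’s actual eight scales are not listed in this article.

```python
# Hypothetical sketch of the study's averaging procedure using the
# open-source textstat package (pip install textstat). The formulas
# here are stand-ins for the study's eight scales.

from statistics import mean
import textstat

# Each formula returns an estimated U.S. grade level for a text.
SCALES = [
    textstat.flesch_kincaid_grade,
    textstat.gunning_fog,
    textstat.smog_index,
    textstat.coleman_liau_index,
    textstat.automated_readability_index,
]

def passage_readability(passage: str) -> float:
    """Average the grade-level estimates from each scale for one passage."""
    return mean(scale(passage) for scale in SCALES)

def test_readability(passages: list[str]) -> float:
    """Average per-passage scores into one figure for a grade's test."""
    return mean(passage_readability(p) for p in passages)

# Hypothetical example: two stand-in passages from one grade's test.
grade3_passages = [
    "The fox ran across the field. It was looking for food before dark.",
    "Rivers carry water from the mountains to the sea, shaping the land.",
]
print(f"Estimated readability: grade {test_readability(grade3_passages):.1f}")
```

Averaging several formulas smooths out the quirks of any single one, which is presumably why the study relied on eight scales rather than a single measure.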

“This study found that some characteristics of the STAAR test had changed since 2012, but many of the reading passages were still misaligned for the targeted grade level,” researchers wrote.