Don't reconsider. The relationship between content knowledge and comprehension stands.
Olivia Mullins responds to a recent meta-analysis on reading comprehension.
This is a guest post by Olivia Mullins, the Executive Director of Science Delivered, where she is developing science materials grounded in literacy practices. It’s more academic than our usual fare, but it is a response to a journal article, and Dr. Mullins was a neuroscientist, after all. We invite questions in the comments.
Recently, Nate Hansford and colleagues published Reading comprehension: a meta-analysis comparing standardized and non-standardized assessment results. This study took a deep dive into comprehension research, reexamining and reanalyzing studies cited in three previous meta-analyses, including Hwang et al., 2022, which examined content and literacy integration.
Hansford et al. brought fresh lenses to this earlier comprehension research. Perhaps most crucially, they dissected how various types of comprehension instruction affected researcher-created vs. standardized measures. (Spoiler: nearly all effect sizes decreased substantially when looking at standardized measures). They also broke the research down by age—another important distinction.
As a knowledge proponent, I was curious to see how the paper handled the data around the effects of content instruction on comprehension. Given that knowledge was not the sole focus of the paper, I thought that the authors were adequately cautious with the conclusions they drew from the included data. For example, they wrote: “Conclusions drawn here should not be interpreted as direct evidence for or against knowledge-building” (pg 38) and “This current meta-analysis would suggest that content-based instruction should be included in the literacy block” (pg 65).
However, authors’ cautions are not always heeded, and judging by early chatter on social media, the work is already being misinterpreted and used to draw faulty conclusions.
Additionally, in his blog post announcing the paper, Hansford included mitigating statements but still sent negative signals about the value of background knowledge and content instruction. The nuances in the paper felt lost in translation.
Meta-analyses, while great organizing tools, can paint with too broad a brush and average out real effects. This is especially true if the included papers don’t all use methodology appropriate for examining the question at hand.
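To make the “averaging out” concern concrete, here is a minimal sketch with hypothetical effect sizes (invented for illustration, not drawn from the paper): if short interventions cluster near zero while year-long programs show a real effect, a single pooled mean can make the whole category look weak.

```python
# Hypothetical effect sizes, invented for illustration only.
short_studies = [0.02, -0.05, 0.10, 0.01]  # e.g., interventions lasting a few weeks
long_studies = [0.45, 0.60, 0.38]          # e.g., year-long programs

all_studies = short_studies + long_studies

print(f"Pooled mean:         {sum(all_studies) / len(all_studies):.2f}")     # ~0.22
print(f"Short-duration mean: {sum(short_studies) / len(short_studies):.2f}") # 0.02
print(f"Long-duration mean:  {sum(long_studies) / len(long_studies):.2f}")   # ~0.48
```

The pooled figure is arithmetically correct, but it hides exactly the pattern that matters for a slow, cumulative process like knowledge building.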
In this post, I argue that the collection of papers on content knowledge and comprehension included in Hansford et al. is not an appropriate set from which to draw conclusions about the effect of knowledge-building on comprehension. Critical papers are not included, and some included papers have dubious methodology.
Additionally, the data presented in Hansford et al. needs to be taken in context. The broader research base clearly shows that long-term, robust and effective content instruction will have meaningful effects on reading comprehension and literacy skills.
Ten reasons to look beyond this meta-analysis
Here are reasons that the Hansford et al. meta-analysis should not cause us to reconsider the impact of content on comprehension.
Issues 1-2 are the strongest critiques of the paper, questioning the exclusion of certain important papers.
Issues 3-5 address paper limitations. While these could be considered critiques, I recognize that this paper was a major effort that examined eight instructional techniques, and content was not the only focus. Hansford et al. did not have space to address every nuance in the content instruction papers, but these critical points can be examined here.
Issues 6-10 address important evidence and considerations that are outside the scope of Hansford’s paper, but fully in-scope for anyone invested in quality curriculum.
The issues with the Hansford et al. meta-analysis:
1. Important, seemingly eligible studies were inadvertently excluded. Most glaringly, the body of work by Romance & Vitale was not included. These studies are perhaps some of the cleanest we have, methods-wise, and they also have strong results.
2. There are other issues with paper selections and classifications. None of James Kim’s work is included. At least one paper examined for content is more appropriately classified as a content-teaching-strategy paper.
3. For most analyses, study duration is not considered. Content/comprehension studies in Hwang et al. (where most of the content studies were pulled from) range from three days to one school year. Duration, a critical variable, is neglected.
4. Effectiveness of the content instruction is not considered. Few studies use explicit instruction, and some do not measure whether the treatment group truly learned more content than the control group.
5. Text-type was not examined. As discussed in the limitations section of the meta-analysis, academic knowledge will have the strongest effects on domain-specific text, but text-type was not considered.
6. Correlational studies are excluded. Reasonably enough, only experimental and quasi-experimental studies are included. But this means that the enormous research base of controlled, correlational studies is not considered. Given what we know about learning content, correlational data cannot be hand-waved away.
7. Other important findings are not considered. Several studies have found positive effects of content instruction on writing, and at least two studies found effects that lasted over a year after the studies’ conclusion.
8. A content-approach to text is best practice. Studies show that focusing on content and making meaning of a text is more effective than a “strategy of the week” approach to comprehension.
9. Strategy instruction did not have a robust showing. Hansford et al. found generally modest effects across the board for comprehension instruction methods.
10. Knowledge-building is the only reasonable choice. There is simply no competing, coherent argument for scattered, content-poor approaches to literacy instruction. We should do what’s best for student learning, not gatekeep the ELA block.
These are my concerns to date. I just gained access to the full data set, and I’ll keep exploring the findings.
Here is a deeper dive into the issues above:
1. Important, eligible studies were inadvertently excluded
Perhaps the biggest critique of the Hansford et al. study is the seemingly accidental exclusion of a series of papers written by Romance & Vitale (Romance & Vitale, 1992; Vitale & Romance, 2011, 2012).1 These papers were cited in the Hwang et al. meta-analysis, and they fit Hansford’s criteria, so their exclusion is surprising.
These omitted papers report mostly moderate to strong effect sizes (0.22 - 0.91) and two of the studies are a year long. (Long-term instruction is considered necessary to see an effect of content instruction on comprehension).
The exclusion of Romance & Vitale (1992) is particularly disappointing, as this is probably the cleanest (quasi) experimental study that exists. Here, the three treatment classrooms received two hours of science and literacy instruction per day for an entire school year. Four control classrooms received 30 minutes of science instruction and 1.5 hours of typical ELA instruction from a basal curriculum. Teachers in the treatment and control groups had matched ELA goals; the difference was that one group delivered these goals within cohesive content.
Critically, students in the treatment group outperformed the control group on the Iowa Test of Basic Skills (ITBS) which was implemented at the district level. The effect size was 0.56.2
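For readers less familiar with effect sizes: a standardized effect size of this kind is typically the difference between group means divided by a pooled standard deviation (Cohen’s d). A minimal sketch with invented numbers (not the actual Romance & Vitale or ITBS data) shows what a value around 0.56 represents, roughly half a standard deviation of advantage:

```python
from math import sqrt

def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference (Cohen's d) using a pooled standard deviation."""
    pooled_sd = sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2))
    return (mean_t - mean_c) / pooled_sd

# Hypothetical test scores, invented for illustration only.
d = cohens_d(mean_t=205, sd_t=18, n_t=75, mean_c=195, sd_c=18, n_c=100)
print(f"d = {d:.2f}")  # a 10-point gap against an 18-point SD gives d of about 0.56
```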
Given that some of the analyses done in Hansford et al. had only a small number of papers, the inclusion of the Romance & Vitale papers may have changed some of the study conclusions.
2. Other issues with paper selections and classifications
None of James Kim’s work apparently met the inclusion criteria. These papers have lower effect sizes, so their inclusion may not have changed the results much, but it is still a curious omission. Another paper with a strong set of control groups and mostly high effect sizes, Guthrie et al. (2004), also did not make the cut. Other papers from Hwang et al. were not included either.
In terms of papers that were included, the classification of Simmons et al. (2010) in the content category is questionable. Simmons et al. is also one of the four studies examined for the effect of core content instruction on standardized assessments. Hansford et al. writes that in Simmons et al., “the only experimental variable was increased background knowledge.” But this is not accurate. This study had three groups, all of which worked from the same social studies curriculum, while the method of instruction was varied. Both treatment groups learned more content and had increased comprehension, but these were study outcomes (see #8 below). Simmons et al. had positive, but insignificant, effect sizes on comprehension.
Looking at the core content analysis category, which is described in more detail than the other content analyses, two of the papers probably shouldn’t be in this category (Simmons et al. and Aarnoutse & Schellings; see #4 below), while other critical papers (Romance & Vitale, 1992; Vitale & Romance, 2012) are not included. A different basis for paper inclusion would perhaps have yielded quite different results.
It is also worth noting that this study removed outlier effects, which generally lowered the effect sizes overall. The lead author shared that a different approach to outliers, Winsorization, may have been preferable.
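For context, here is a rough sketch of the difference between the two approaches to outliers, using invented effect sizes and an arbitrary cutoff (the paper’s actual procedure may differ): dropping outliers removes extreme studies entirely, while Winsorizing caps them at a boundary so they still contribute, just less dramatically.

```python
import numpy as np

# Invented effect sizes with one extreme value, for illustration only.
effects = np.array([0.10, 0.15, 0.22, 0.30, 0.35, 1.80])
cutoff = np.percentile(effects, 90)  # arbitrary boundary for this sketch

dropped = effects[effects <= cutoff]         # outlier removal: extreme study is discarded
winsorized = np.clip(effects, None, cutoff)  # Winsorizing: extreme study is capped

print(f"Mean after dropping the outlier: {dropped.mean():.2f}")
print(f"Mean after Winsorizing:          {winsorized.mean():.2f}")
```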
I welcome feedback and additional context from the lead author, Nate Hansford, on this section.3
3. For most analyses, study duration is not considered
Knowledge building is not a short intervention. In fact, it’s not an intervention at all. It’s a gradual, cumulative, lifelong process. We already understand that short bursts of content instruction will not affect general reading comprehension.
I understand that Hansford et al. chose to use the citation list from Hwang et al., and I don’t expect them to be more nuanced about knowledge than a literacy- and content-focused study (i.e., Hwang et al.). But given that knowledge building is a years-long process, it’s not especially useful to average together results from studies that last eight weeks with those lasting a school year.
It’s also relevant to note that Hwang et al. considered the effects of integrated content and literacy instruction, and did not attempt to isolate the effect of content instruction.
Importantly, even with some key studies missing, as shared above, Hansford et al. found a significant positive correlation (r = 0.43) between instruction duration and effect size.4 Content instruction was the only treatment where this effect was found (all other significant correlations were negative).
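To illustrate what that correlation means (with invented numbers, not the paper’s data): each study contributes a (duration, effect size) pair, and a positive Pearson r indicates that longer interventions tend to show larger effects.

```python
from scipy.stats import pearsonr

# Hypothetical (weeks of instruction, effect size) pairs, invented for illustration.
weeks = [3, 8, 12, 20, 30, 36]
effect_sizes = [0.02, 0.10, 0.08, 0.25, 0.30, 0.42]

r, p = pearsonr(weeks, effect_sizes)
print(f"r = {r:.2f}, p = {p:.3f}")  # positive r: duration and effect size rise together
```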
4. Effectiveness of the content instruction is not considered
Unfortunately, a large number of studies examining content instruction and comprehension use a student-led approach, which is less effective than direct instruction for novice learners. This issue is tempered in some, but not all, studies by the inclusion of aspects that do improve science learning, like hands-on activities and additional literacy elements.
One content analysis in Hansford et al. looks at how “core” instruction affected standardized measures. Only four of their studies fit this criterion. They found a “small to moderate” but insignificant effect. The lack of significance is attributed to the inclusion of Aarnoutse & Schellings (2003), which had a negative effect size.
But a look at the methods in the Aarnoutse & Schellings paper makes it clear that this study should not be included in the analysis at all. Hansford et al. writes that this paper had a confounding factor, strategy instruction. But the real issue is that the “content instruction” in the treatment group was so poor that I wouldn’t be surprised if the control group learned more content. The “content instruction” was that students chose a research question, then picked out and read their own books (from a library that apparently was “not well-equipped”). Students then engaged in some sort of self-directed research project. This is not good content instruction.
Again, it’s difficult to be too critical of Hansford et al. for this inclusion, given that this study was cited in Hwang et al. Yet Aarnoutse & Schellings should be removed from any future comprehension/content discussion.
The student-led instruction issue is pervasive within the content/comprehension instruction research, and is probably impossible to avoid in a meta-analysis, but it’s important to recognize that very few of these studies use an explicit instruction approach. While many such studies do measure and show an increase in content-knowledge (good), I hypothesize that effect sizes would be bigger if best practices were used in all aspects of content instruction.
5. Text-type was not examined
The study looked at the effects on comprehension generally, and did not examine if there were differential effects on narrative vs. expository text. This is mentioned in the limitations section, but is worth mentioning here, too.
Research, and logic, tells us that academic knowledge is more critical for domain-specific or content-rich text. “General” reading assessments full of content-light passages may mask the importance of content knowledge for more technical texts.
There’s nuance beyond expository vs. fictional text. While a science unit on birds should make students more capable of reading a related bird passage, even if the passage does not include directly-taught content (as seen in Cervetti et al., 2016), it is unclear whether it would help much with comprehending a passage on, say, bridges.5
Note that standardized measures are nearly always going to be general reading tests, but the proximal effects reported in Hansford et al. are a mix of domain-specific and domain-general text. Many seem to treat “general” reading comprehension measures as the only ones that matter, but domain-specific results are highly relevant. After all, much of the text read in school is academic and domain-specific.
6. Correlational studies were excluded
Yes, the correlational data matters.
I can hear people objecting to this. But the correlational data is not simply “students who know more can comprehend better.” There is an enormous amount of correlational data showing that knowledge correlates with domain-specific text above and beyond the correlation with comprehension of content-light text.6
The most famous study of this type is the Baseball Study (Leslie & Recht, 1988). I know some people are sick of the Baseball Study! But there are so many studies like this that I got bored of reading them.
It has been shown at multiple ages and with texts on multiple topics that domain-specific knowledge correlates with comprehension of domain-specific text. These results, and the conclusion that knowledge is important for comprehension, are uncontroversial. (A note that correlational studies overwhelmingly show both knowledge and reading skill contributing to comprehension; the Baseball Study is an outlier in that regard.)
To argue that content instruction will not boost reading comprehension assumes that school is incapable of teaching students a critical mass of knowledge. Students are in elementary school for six years. It is ludicrous to argue that students cannot learn a critical mass of knowledge with six years of daily content instruction. There’s more to say on this topic, which will be addressed in a future post.
7. Other important findings are not considered
Knowledge is not the only focus in the Hansford et al. study, and they did not have room to include every detail for the eight different strategy treatments they examined. Yet, people often treat meta-analyses as the final word on a subject, so it’s notable to add a few other content-related results.
Importantly, several studies have found positive effects of content instruction on writing assessments (e.g., Kim et al., 2021; Cervetti et al., 2012). While this is not reading comprehension, it is germane to literacy, and we know reading and writing are reciprocal skills. Increased domain-specific vocabulary and content knowledge, when measured, are of course outcomes of most of these studies as well.
While I’d have to dig to see if there’s data on this, I’ve written about the exciting possibilities in merging content instruction and oral language instruction in early elementary.
Another intriguing note is that Kim et al. (2024) found that a multi-year curriculum showed modest gains in reading comprehension that were retained fourteen months after the study-based instruction concluded. Romance & Vitale (2017) similarly found that elementary integrated science and literacy instruction resulted in positive effects on standardized reading scores in middle school. The potential longevity of knowledge-building impact is an important consideration.
8. A “content-approach” to text is best practice
Numerous studies consider content and literacy integration from angles beyond those discussed here. In these studies, the amount of content taught isn’t increased; instead, the instructional strategy for teaching the content is varied. Simmons et al. (2010), mentioned in #2 above, is one such study.
Studies like Vaughn et al. (2013) and McKeown et al. (2009) found that a traditional “strategy of the week” approach was inferior to an approach that focused on content and making meaning of the text for reading comprehension. (This is described in more detail in Part 4 of a previous post.) This is similar to how Timothy Shanahan, and many others, recommend engaging with text.
A focus on content enhances literacy from every angle. The cohesion promoted by knowledge-building advocates can only enhance reading comprehension.
9. Strategies did not have a robust showing
The potential of strategy instruction relies on the theory that strategies are generalizable. Therefore, the standardized assessment measure is most relevant for strategy instruction.
Yet Hansford et al. found low effect sizes for comprehension instruction in standardized assessments, almost across the board. Meta-cognition strategies were not significant, while cognitive strategies instruction had a weighted effect size of only 0.09. (Content instruction showed no effect for this measure, for reasons explained above).
The only strong result for standardized measures was found for reciprocal reading, which cannot be used with early elementary students.
From this, we can conclude that Hansford et al. did not produce evidence for any obvious alternative to knowledge building curriculum in elementary school.
Which leads us to the final point...
10. Knowledge-building is the only reasonable choice
Let’s pretend that content instruction truly has no effect on reading comprehension. Even in this imaginary scenario, we should still use knowledge-building curricula.
It is clear that integrated content instruction doesn’t hurt comprehension. Therefore, schools have the following two choices:
A. Do literacy instruction in a way where students will learn content.
B. Do the same literacy instruction in a way where students will not learn content.
Why would anyone choose B?
Conclusions
The Hansford et al. paper was clearly a massive effort and provides us with a wealth of information. That said, this paper should not be used as an argument against knowledge-building curriculum or the importance of content for comprehension, given the inclusion of some papers that use dubious methodologies or that don’t fit the category, as well as the inadvertent exclusion of certain foundational papers.
A vast research base supports the importance of content knowledge for comprehension. It includes correlational studies with statistical controls, quasi-experimental & experimental studies on general and domain-specific text, and emerging data on specific knowledge-building programs. We also see enhanced comprehension effects when content is a focus of reading. This evidence base deserves a more detailed exploration in future articles.
I am grateful to the efforts of Hansford et al. and will continue to dig into the many effects reported there. I hope that the paper will be used responsibly and in context when it comes to the importance of knowledge-building for reading comprehension and literacy.
Moving forward, I would love to leave behind the debate on whether we should build knowledge in schools (yes we should), and instead discuss how best to do it. Concerns about implementation, messaging, and programs are all valid and should be the focus of the discourse.
Additionally, the ineffective student-led learning methods promoted by NGSS create a true barrier to the goals of this advocacy. We should be teaching content using best practices.
A final note: this article focuses on content learning, but of course content learning should never come at the expense of high-quality fiction instruction. Sometimes you can even do both at once.
Additional Reading:
Natalie Wexler also noted “the strong undercurrent of skepticism” about content instruction in the paper and cited similar issues with its methodologies.
In addition, Chris Such has questioned the omission of a study on related reading, and suggested that this omission inflates the impact of Reciprocal Reading.
Footnotes:
1. Romance & Vitale (2017) was not included in Hwang et al. (2020) as that meta-analysis was limited to elementary school and the 2017 paper looked at results in middle school students. Romance & Vitale (2017) was likely eligible for Hansford et al., as well.
2. I looked at Romance & Vitale (1992) a while back, before understanding how rare a study it was. I had been intending to reexamine it, among others, but kept putting it off. What a regret to have gone through all these months of knowledge debates without citing this paper.
3. More information on how studies were coded, and certain exclusion information, is available at this link, but I was unable to access the data for technical reasons in time to publish this post.
4. Hwang et al. (2020) reports quite a few shorter studies (e.g., 8-12 weeks) with positive effects. Results like this almost certainly have to be due to one of two things (or both). First, the assessment is domain-specific: students should be capable of learning enough content in a couple of months to impact comprehension of a relevant text. Second, literacy or strategy instruction done on cohesive content may elevate general reading ability beyond what you would see from ELA instruction and content instruction in isolation. It seems highly unlikely that students could learn enough content in 2-3 months that the increased content knowledge itself would affect results on a general reading test.
5. It is possible that a “content-strategy approach” to reading text will have transfer (e.g., the PACT Vaughn studies), but this is slightly different than knowledge itself supporting comprehension. Knowing a lot about birds will lead to better understanding of bird-related text, but not a text about bridges. But a unit spent using a “content-strategy approach” to reading texts about birds may indeed lead to better comprehension of non-bird-related text.
6. See Part 3 for references on correlational data.