Readability formulas
Readability measures are primarily based on factors such as the number of
words in the sentences and the number of letters or syllables per word (i.e.,
as a reflection of word frequency). Two of the most commonly used measures
are the Flesch Reading Ease formula and the Flesch-Kincaid Grade Level.
Flesch Reading Ease
The output of the Flesch Reading Ease formula is a number from 0 to 100, with
a higher score indicating easier reading. The average document has a Flesch
Reading Ease score between 60 and 70. The formula reads as follows:
206.835 – (1.015 x ASL) – (84.6 x ASW)
where:
ASL = average sentence length (the number of words divided by the number of
sentences)
ASW = average number of syllables per word (the number of syllables divided
by the number of words)
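As a worked illustration, the Reading Ease formula can be sketched in Python. The function name and the choice to take hand-counted totals as inputs are assumptions made for this example, as is clamping the result to the 0 to 100 range stated above; counting syllables automatically is a separate problem and is left out here.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease from raw counts (hypothetical helper)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    asl = words / sentences   # ASL: average sentence length
    asw = syllables / words   # ASW: average syllables per word
    raw = 206.835 - (1.015 * asl) - (84.6 * asw)
    # Assumption: clamp to the 0-100 range the formula is described as producing.
    return max(0.0, min(100.0, raw))


# "The streets were wet because it had rained."
# Hand-counted: 8 words, 1 sentence, 9 syllables.
score = flesch_reading_ease(words=8, sentences=1, syllables=9)
print(round(score, 1))  # 100.0
```

For this very short sentence the unclamped value is about 103.5, so the reported score is the ceiling of 100.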
Flesch-Kincaid Grade Level
The more common Flesch-Kincaid Grade Level formula expresses readability as a
U.S. grade-school level, using the same two variables as the Reading Ease
formula. The formula reads as follows:
(0.39 x ASL) + (11.8 x ASW) – 15.59
where:
ASL = average sentence length (the number of words divided by the number of
sentences)
ASW = average number of syllables per word (the number of syllables divided
by the number of words)
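The grade-level formula can be sketched the same way. This is a minimal example, assuming hand-counted totals as inputs; the function name is hypothetical, and flooring the result at 0 is an assumption made so that very short texts do not produce negative grades.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level from raw counts (hypothetical helper)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    asl = words / sentences   # ASL: average sentence length
    asw = syllables / words   # ASW: average syllables per word
    grade = (0.39 * asl) + (11.8 * asw) - 15.59
    # Assumption: floor at grade 0 rather than report a negative grade.
    return max(0.0, grade)


# "The streets were wet because it had rained."  (8 words, 1 sentence, 9 syllables)
print(round(flesch_kincaid_grade(8, 1, 9), 1))  # 0.8

# "The streets were wet. It had rained."  (7 words, 2 sentences, 7 syllables)
print(round(flesch_kincaid_grade(7, 2, 7), 1))  # 0.0
```

Splitting the sentence in two lowers ASL and removes the two-syllable word "because", which is why the second version receives the lower grade.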
In addition, more than 40 readability formulas have been developed over the
years (Klare, 1974-1975). Readability measures guide the construction of textbooks
such that the readability conforms to the intended grade level. However, there
are at least three major problems with readability formulas that prevent valid
predictions of text comprehension.
1. Surface characteristics. Readability scores are based on the surface characteristics
of the text. Comprehension and learning, however, depend to a greater extent
on processing at the textbase and situation levels (Kintsch, Welsch, Schmalhofer
& Zimny, 1990; McNamara et al., 1996). Measuring text elements that matter
primarily for surface processing does not adequately capture comprehension
and learning, which are the main concerns of educators. Recent advances in discourse
processing and computational linguistics afford more advanced measures of
readability, because they allow more precise predictions of which text characteristics
improve comprehension and learning.
2. Reader's cognitive aptitudes. Predicting reading, understanding, and learning
requires consideration of the reader’s knowledge, language skills, and
other cognitive aptitudes. Although text characteristics can certainly predict
aspects of readability, readability should be viewed as an interaction between
a text and a reader’s cognitive aptitudes (Kintsch, 1994; Miller &
Kintsch, 1980; McNamara et al., 1996).
3. Cohesion and coherence. Readability formulas cannot capture the cohesion
or coherence of a text. Research has clearly shown that readers have less
difficulty reading cohesive texts (Beck, McKeown, Sinatra, & Loxterman,
1991; Britton & Gulgoz, 1991; Gernsbacher, 1997; Graesser, Gernsbacher
& Goldman, 2002; McNamara, 2001; McNamara & Kintsch, 1996; McNamara
et al., 1996). We would therefore expect high-cohesion texts to score as more
readable than low-cohesion texts; however, this is not generally the case. In
the following examples the low-cohesion sentences have lower or equal Flesch-Kincaid
grades, but are intuitively more difficult to read than the high-cohesion
sentences. Similarly, the Flesch Reading Ease scores do not necessarily differentiate
between low- and high-cohesion sentences. Hence, traditional readability measures
can run orthogonal to cohesion measures. Average sentence length and average
number of syllables per word alone cannot sufficiently predict coherence and
therefore understanding of a text.
FRE = Flesch Reading Ease; FKG = Flesch-Kincaid Grade Level.

Low cohesion

| Text | FRE | FKG |
|---|---|---|
| The streets were wet. It had rained. | 100 | 0.0 |
| One part of the cloud develops a downdraft. Rain begins to fall. | 80.8 | 3.4 |
| Among Glaswegians the precipitation caused havoc and vexation. | 8.3 | 12 |
| Since John always jogs a mile and a half seems a short distance to him. | 95.7 | 3.6 |

High cohesion

| Text | FRE | FKG |
|---|---|---|
| The streets were wet because it had rained. | 100 | 0.8 |
| One part of the cloud develops a downdraft, which causes rain to fall. | 83 | 4.9 |
| The rainfall caused devastation and irritation among citizens of Glasgow. | 9.7 | 12 |
| Since John always jogs, a mile and a half seems a short distance to him. | 95.7 | 3.6 |

Although a text should generally have more than 200 words before the Flesch
Reading Ease and Flesch-Kincaid Grade Level scores can be applied reliably,
the conclusion is the same. To measure the readability, coherence and
comprehensibility of a text, more than surface features alone need to be
taken into consideration. Quantitative and qualitative factors such as the
number of anaphora, the number of overlapping text segments, vocabulary
difficulty, sentence and text structure, and concreteness and abstractness
are equally needed. It is the sum of these and other factors
that constitutes cohesion.
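To make one of these factors concrete, the sketch below counts word overlap between adjacent sentences, a rough stand-in for "overlapping text segments". It is only an illustration and not the measure used by any particular tool: the function names, the tiny stop-word list, and the "at least one shared word" criterion are all assumptions chosen for brevity.

```python
import re

# Small illustrative stop-word list (an assumption, not a standard resource).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "it", "is", "was", "were", "had", "in"}


def content_words(sentence):
    """Lower-cased word tokens of a sentence, minus stop words."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return {t for t in tokens if t not in STOP_WORDS}


def adjacent_overlap(sentences):
    """Fraction of adjacent sentence pairs that share at least one content word."""
    pairs = list(zip(sentences, sentences[1:]))
    if not pairs:
        return 0.0
    shared = sum(1 for a, b in pairs if content_words(a) & content_words(b))
    return shared / len(pairs)


# Low-cohesion example from the text: the two sentences share no content word.
low = ["One part of the cloud develops a downdraft.", "Rain begins to fall."]
print(adjacent_overlap(low))   # 0.0

# A made-up variant in which the second sentence repeats "downdraft".
high = ["The cloud develops a downdraft.", "The downdraft causes rain to fall."]
print(adjacent_overlap(high))  # 1.0
```

A measure like this responds to a property that sentence length and syllable counts cannot see, although exact word repetition is only one of several ways a text can signal how its sentences connect.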