Surprisal theory has provided a unifying framework for understanding many
phenomena in sentence processing (Hale, 2001; Levy, 2008), positing that a
word’s probability in context determines processing difficulty.
This claim is challenged by evidence that one low-level statistic, word
frequency, affects processing independently of surprisal. We present
the first clear evidence that a more complex low-level statistic, word
bigram probability, also affects processing independently of surprisal.
These findings suggest a broad, independent role of low-level statistics in
processing and motivate research into new generalizations of surprisal that
can also explain why local statistical information should have an outsized
effect.