## Introduction

In good-enough and noisy-channel processing, the listener's interpretation can deviate from the input signal. Research on good-enough processing shows that comprehension relies on shallow processing heuristics that typically work well but can yield interpretation errors. Noisy-channel processing posits that, because communication occurs over a noisy channel, listeners infer what the signal was from uncertain input. Both assume that the linguistic signal can lose fidelity during communication, and that communicators are not always perfect.

What's an eggcorn? An eggcorn is a misperception and misacquisition of a word or phrase. The word "eggcorn" is itself an eggcorn, derived from the source "acorn". Others include "firstable" for "first of all", and "for all intensive purposes" for "for all intents and purposes". Eggcorns approximate the sound of the input and are often good enough in speech, but show up in writing. (Note: I use the term "source" throughout to refer to the original word or phrase an eggcorn derives from.) One eggcorn that made the rounds on Twitter last March, when COVID-19 lockdown started, was "cornteen" for the source word "quarantine".

To date, there has been little to no psycholinguistic research on eggcorns. I argue that eggcorns are naturalistic evidence for good-enough and noisy-channel processing. Understanding how eggcorns form can help us understand how language processing works more broadly, in the same way that studying speech errors has informed our understanding of production. We should take eggcorns seriously as linguistic data. In this study, we analyzed the properties of entries from The Eggcorn Database, for example "tenure" to "ten year" and "Alzheimer's" to "Old Timer's", as a preliminary step toward better understanding eggcorns.
https://eggcorns.lascribe.net

## Methods

We scraped hundreds of unique entries from the database and measured sound similarity between each source and eggcorn as the number of changes required to make the source and eggcorn IPA transcriptions match (Levenshtein distance). We also measured the change in the number of syllables between source and eggcorn, the change in frequency from source to eggcorn, and semantic similarity. We excluded misspellings and entries that weren't present in all the databases.

We predicted that eggcorns would sound very similar to source phrases, which means Levenshtein distance should be low (few changes are needed to make them sound the same) and the number of syllables should not change. We also predicted that eggcorns would be more frequent than source phrases, and that semantic similarity would be high because eggcorns should fit the context.

## Results

As predicted, Levenshtein distance was low, and the number of syllables did not change for the majority of entries. Counter to our predictions, eggcorns were less frequent than source phrases; I'll return to why in the discussion. Semantic similarity was low on the whole, though some eggcorn-source pairs were interchangeable.

We analyzed Levenshtein distance as a count variable in an ordered probit regression with semantic similarity and frequency change as our independent variables. The regression showed that higher semantic similarity predicted less sound similarity (higher Levenshtein distance), that lower eggcorn frequency (negative frequency change) predicted less sound similarity, and that the interaction between the two was marginal.

As predicted, eggcorns tend to strongly match the source phrase in sound. Semantic similarity was higher when sound differed more, which may suggest a trade-off between matching the sound and fitting the context of the sentence.
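
The sound-similarity measure described in the Methods (the number of changes needed to make two IPA transcriptions match) can be sketched in Python as follows. This is a minimal illustration, not the analysis code used in the study. The function name `levenshtein`, the hand-segmented transcriptions of "acorn" and "eggcorn", and the choice to weight every insertion, deletion, and substitution equally are all assumptions made for the example.

```python
from typing import Sequence


def levenshtein(source: Sequence[str], target: Sequence[str]) -> int:
    """Return the minimum number of insertions, deletions, and
    substitutions needed to turn `source` into `target`.

    Each item is treated as one sound segment; all edits cost 1
    (an assumption, not necessarily the weighting used in the study).
    """
    # previous[j] = distance between the source prefix processed so far
    # and the first j segments of target.
    previous = list(range(len(target) + 1))
    for i, s in enumerate(source, start=1):
        current = [i]  # turning i source segments into an empty target
        for j, t in enumerate(target, start=1):
            cost = 0 if s == t else 1
            current.append(min(
                previous[j] + 1,         # delete s
                current[j - 1] + 1,      # insert t
                previous[j - 1] + cost,  # substitute s with t (or match)
            ))
        previous = current
    return previous[-1]


# Hypothetical, simplified segmentations (stress marks omitted);
# these are illustrative and not taken from the study's data.
acorn = ["eɪ", "k", "ɔ", "r", "n"]
eggcorn = ["ɛ", "ɡ", "k", "ɔ", "r", "n"]

print(levenshtein(acorn, eggcorn))  # one substitution + one insertion = 2
```

Passing lists of segments rather than raw strings keeps diphthongs such as "eɪ" from being counted as two separate symbols. How segments are defined is itself a modeling choice that affects the distance.
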
## Discussion

The results suggest that the processes in question optimize for similarity to the source signal and fit with the surrounding context, occasionally trading between the two, which supports the idea that eggcorns arise from good-enough processing. Our interpretation is faithful to the input signal most of the time, but can deviate occasionally in principled ways. Understanding these deviations can inform our models of language processing.

One important limitation is that word/phrase frequency may not be well-suited to the question. We used it as a starting point because it required the fewest assumptions and was the most conservative, but many eggcorns are not attested in the databases we used. It's also possible that morpheme frequency or syllable transition probability would be a more appropriate measure of frequency. We were also limited to the eggcorns in the database; there are likely many more that have not been added. And of course it's hard to draw conclusions about cognition from a corpus.

To conclude, eggcorns likely reflect rational language processing, and they may even be a mechanism of language change. For example, "spitting image", as in "John is the spitting image of Dale", traces back to "spit and image" and "spitten image". No one today would say "spitting image" is the incorrect usage, even though it used to be something else that was similar in sound, and the processes that give rise to eggcorns may have facilitated this change. By studying eggcorns we may learn more about how language processing works.

## Q&A

Some great questions I got during my presentation (and my answers), simplified:

**Q1:** What happens when we read eggcorns? Do we detect them as errors?
- **A1:** We should detect them as errors unless we (the reader) also have that eggcorn.

**Q2:** Is the eggcorn database updated frequently, and how representative is it of eggcorns in use?
- **A2:** I don't think it's been updated in a while, and I don't know what the criteria are for adding an eggcorn. There are likely many more!

**Q3:** Presumably kids acquire eggcorns. Do they get corrected in a learning environment?
- **A3:** Many probably do get corrected if teachers or peers detect them, or you might detect them on your own as literacy and vocabulary improve with schooling.