Mechanical Turk, an online crowdsourcing platform, has received increased
attention among psychologists as a potentially reliable source of
experimental data. Given the ease with which participants can be quickly
and inexpensively recruited, it is worth examining whether Mechanical Turk
can provide accurate data for analyses that require large samples. One
such analysis is Item Response Theory, a psychometric paradigm that defines
test items by a mathematical relationship between a respondent’s ability
and the probability of item endorsement. To test whether Mechanical Turk
can serve as a reliable source of data for Item Response Theory modeling,
researchers administered a verbal reasoning scale to Mechanical Turk
workers and compared the resulting Item Response Theory model to that of an
existing normative sample. Although the Item Characteristic Curves differed
significantly between samples, the two models agreed closely on the fit of
participants’ response patterns and on estimates of participant ability. Such
findings lend support to the use of Mechanical Turk for research purposes
and suggest it as a viable tool for quick, inexpensive Item Response Theory
modeling.
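To make the mathematical relationship at the heart of Item Response Theory concrete: the abstract does not state which IRT model was fit, but a common instance is the two-parameter logistic (2PL) model, in which an item's endorsement probability is a logistic function of the gap between respondent ability and item difficulty, scaled by item discrimination. The sketch below is illustrative only; the parameter values are arbitrary and not drawn from the study.

```python
import math

def item_response_probability(theta, a, b):
    """Two-parameter logistic (2PL) IRT model.

    Returns the probability that a respondent with ability `theta`
    endorses (answers correctly) an item with discrimination `a`
    and difficulty `b`. Hypothetical example, not the study's model.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# When ability equals item difficulty, endorsement probability is 0.5;
# higher ability relative to difficulty raises the probability.
p_at_difficulty = item_response_probability(theta=0.0, a=1.5, b=0.0)
p_high_ability = item_response_probability(theta=2.0, a=1.5, b=0.0)
```

An Item Characteristic Curve is simply this function plotted over a range of `theta` for a fixed item, which is why curves estimated from two samples can be compared directly.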