Careless responding, that is, answering survey items without regard to their actual content, is a response bias that threatens the factor structure, reliability, and validity of psychological measurements. Different approaches have been proposed to detect aberrant responses, such as probing questions that directly assess test-taking behavior (e.g., bogus items), auxiliary data or paradata (e.g., response times), or data-driven statistical techniques (e.g., Mahalanobis distance). In the present study, gradient boosted trees, a state-of-the-art machine learning technique, are introduced to identify careless responders. The performance of the approach was compared to established techniques previously described in the literature (e.g., statistical outlier methods, consistency analyses, and response pattern functions) using simulated data and empirical data from a web-based study in which diligent vs. careless response behavior was induced. The comparison between the results of the simulation and the online study showed that simulations relying on prototypical patterns of careless responses tend to overestimate classification accuracy. Gradient boosted trees outperformed traditional detection mechanisms in flagging aberrant responses, especially when response times were included as paradata, but they should not be mistaken for a panacea for data cleaning. We critically discuss the results with regard to their generalizability and provide recommendations for the detection of aberrant response patterns in survey research.