Method

Participants

Samples 1-3. We recruited participants for all three samples from a medium-sized Western Canadian university in exchange for course credit in introductory psychology classes. Sample sizes were 130, 109, and 119, respectively. The mean ages for these samples were 20.48 (SD = 4.39), 20.91 (SD = 5.26), and 19.85 (SD = 3.70), respectively. Participants were mostly women: 70.00% in Sample 1 (2 responders undisclosed), 68.81% in Sample 2 (1 responder undisclosed), and 64.71% in Sample 3 (2 responders undisclosed).

Measures and Validity Scale Indexes

NEO-Five-Factor Inventory-3 (NEO-FFI-3; McCrae & Costa, 2010). The NEO-FFI-3 is a 60-item measure of the five-factor personality model. Its items are divided equally among five subscales: Neuroticism, Extraversion, Openness to Experience, Agreeableness, and Conscientiousness. Across the subscales, the inventory consists of 35 positively worded items and 25 negatively worded items. Items are answered on a 5-point Likert scale ranging from 1 = Strongly Disagree to 5 = Strongly Agree, with higher scores reflecting higher trait levels on each dimension. The inventory has excellent psychometric properties and takes less than 10 minutes to complete. A sample item from the Extraversion subscale is, "I laugh easily."

Conscientious Responders Scale (CRS; Marjanovic et al., 2014; described in detail in the Introduction). After transcribing all 60 NEO-FFI-3 items into a word-processor document, we randomly embedded the CRS's items in the questionnaire from beginning to end. With a sum-score range of 0 to 5, we labelled all high CRS scorers (3-5) as "Conscientious Responders" and all low scorers (0-2) as "Indiscriminate Responders." An example item is, "Please answer this item by choosing option two, Disagree."

Inter-Item Standard Deviation (ISD; Marjanovic et al., 2015; described in detail in the Introduction). The ISD is a within-subjects measure of inter-item response consistency calculated across all the items of a single-construct measure. Because we assume CRs respond consistently to all the items of a reliable measure of a construct (one with a moderate-to-strong mean inter-item correlation or Cronbach's alpha ≥ .70), we expect CRs to produce small ISDs. Conversely, because we expect IRs to respond with no care or attention to the semantic content of the items, we expect their response dispersion to be much greater, producing larger ISDs than CRs (a computational sketch follows the EOI description below).

Even-Odd Index (EOI; Jackson, 1977; Johnson, 2005). Akin to the theoretical underpinnings of Spearman-Brown split-half reliability, Jackson posited that one half of an individual's questionnaire responses should be highly correlated with their responses on the other half of the questionnaire (Warrens, 2016). The EOI, which he originally called Individual Reliability, is a within-subjects correlation across the two halves of an inventory: odd-numbered items make up one half and even-numbered items make up the other. Highly positive EOI coefficients indicate conscientious responding, whereas low coefficients indicate the opposite. Because the number of item pairs is only half the total number of items in the inventory, the EOI is often corrected for its shortened length using the Spearman-Brown prophecy formula. Negative EOI scores are often truncated to -1.00 to minimize the influence of extreme negative values (Maniaci & Rogge, 2014). We followed these standards here.
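For concreteness, here is a minimal sketch of the ISD computation, assuming responses sit in a NumPy array with one row per responder and one column per item of a single scale, reverse-scored items already recoded; the array names and simulated values are illustrative, not the study's data.

```python
import numpy as np

def isd(scale_responses: np.ndarray) -> np.ndarray:
    """Inter-Item Standard Deviation: the within-person SD computed
    across the items of one single-construct scale.
    scale_responses: shape (n_respondents, n_items)."""
    # ddof=1 gives the sample SD of each respondent's row of item responses
    return scale_responses.std(axis=1, ddof=1)

# Illustrative 5-point responses to a hypothetical 12-item scale
rng = np.random.default_rng(0)
consistent = np.clip(rng.normal(4, 0.5, (1, 12)).round(), 1, 5)  # CR-like
random_resp = rng.integers(1, 6, (2, 12))                        # IR-like
print(isd(consistent), isd(random_resp))  # CRs should yield smaller ISDs
```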
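And a sketch of the EOI as just described, under the assumption that per respondent we correlate the five odd-half subscale means with the five even-half subscale means, apply the Spearman-Brown correction, and truncate values below -1.00; the column layout for the subscales is an assumption for illustration.

```python
import numpy as np

def eoi(responses: np.ndarray, scale_cols: list[list[int]]) -> np.ndarray:
    """Even-Odd Index per respondent: within-person correlation between
    odd-half and even-half subscale scores, Spearman-Brown corrected,
    extreme negative values truncated to -1.00."""
    cols = [np.asarray(c) for c in scale_cols]
    out = np.empty(responses.shape[0])
    for i, row in enumerate(responses):
        odd = [row[c[0::2]].mean() for c in cols]   # odd-position items
        even = [row[c[1::2]].mean() for c in cols]  # even-position items
        r = np.corrcoef(odd, even)[0, 1]
        # Spearman-Brown correction for the halved test length
        sb = (2 * r) / (1 + r) if r > -1 else -1.0
        out[i] = max(sb, -1.0)  # truncate extreme negatives
    return out

# Assumed layout: NEO-FFI-3 items alternate dimensions, so subscale k
# would hold items k, k+5, k+10, ... in a 60-column response matrix
scale_cols = [list(range(k, 60, 5)) for k in range(5)]
```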
(Psychometric) Inconsistency Scale (INC). INCs make up some of the oldest and most effective tools developed for identifying responders (Curran, 2016; Huang et al., 2012). The idea is that a CR should answer similarly across item pairs that are highly positively correlated (synonyms) and dissimilarly across item pairs that are highly negatively correlated (antonyms). Because IRs answer unsystematically across item pairs, their responses are neither similar nor dissimilar, respectively, and they therefore produce much different INC scores than CRs. Using the INC standards put forth by Schinka, Kinder, and Kremer (1997) and Scandell (2000), who previously developed INC scales for the 240-item NEO-Personality Inventory-Revised and the 60-item NEO-FFI, respectively, we developed INCs separately in all three of our samples (Table 1). We started by choosing 10 pairs of the highest-correlating, non-overlapping items in the inventory (i.e., no single item appears in more than one item pair), selecting the two highest-correlating item pairs per dimension. Item-pair correlations ranged from .44 to .77 in Sample 1, .47 to .66 in Sample 2, and .41 to .68 in Sample 3. All of our item-pair correlations exceeded our selection criterion of coefficients ≥ .40 (Scandell, 2000; Schinka et al., 1997). To score each INC item pair, we calculated the absolute difference between the responses to the pair's two items. Items answered consistently (e.g., item 1 = 3 and item 2 = 3) yielded absolute difference scores of 0; items answered maximally inconsistently (e.g., item 1 = 1 and item 2 = 5) yielded scores of 4. Item-pair absolute differences were then summed to create an INC total score, ranging from 0 to 40, with higher scores indicating greater response inconsistency or IR (a scoring sketch appears below, after the LSI description).

Mahalanobis Distance (MD). MD is a statistical approach to identifying multivariate outliers (Curran, 2016; Huang et al., 2012). Calculated across two or more variables, MD quantifies the difference between each responder's set of scale scores on a series of measures, called the centroid, and the centroid of the same variables for the entire sample. The closer a responder's centroid is to the sample centroid, the more representative of the sample the responder is. It is important to note that MD's effectiveness at differentiating responders is best in groups of measures with large mean-midpoint differences (MMDs) and worst in measures with small MMDs (Curran, 2018; Meade & Craig, 2012). Because IRs consistently produce mean scores at or near the midpoint of a response scale, their centroids (and MMDs) are small, and they are more easily differentiated from CRs with large MMDs (and centroids) than from CRs with small MMDs and centroids. In these data, we expect CRs to produce larger MDs than IRs (a computational sketch likewise follows below).

Long-String Index (LSI). The long-string index is a count of how many times a responder selects the same response option in a row (Curran, 2016; Huang et al., 2012). Intuitively, the longer the string of the same response, especially across negatively worded (reverse-scored) items or items measuring different constructs, the more likely it is that the responder is answering indiscriminately. It makes sense to calculate the LSI on the NEO-FFI-3 because the inventory consists of many negatively worded items, and its items alternate among dimensions from one item to the next. As the Big Five dimensions are theoretically orthogonal, there is no reason to expect a systematic response pattern to emerge across the alternating items.

Typically, we interpret LSI scores such that longer strings of the same response are indicative of IRs. Because we generated our IR data with a random-number generator, however, we expect very few of their responses to be identical from one item to the next. In fact, we expect many more identical sequential responses in the CR data, because research shows CRs produce long-string means greater than 2.00 (DeSimone, Harms, & DeSimone, 2015; Ward & Meade, 2018). Contrary to its normal usage, then, we expect our CRs to produce larger LSI scores than IRs.
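A minimal sketch of the INC scoring described above, assuming responses are in a NumPy array and that the ten item pairs are supplied as (column, column) index tuples; the pair indices shown are placeholders, not the pairs reported in Table 1.

```python
import numpy as np

def inc_score(responses: np.ndarray,
              pairs: list[tuple[int, int]]) -> np.ndarray:
    """INC total: sum of absolute differences across the item pairs.
    responses: (n_respondents, n_items); pairs: 10 (item1, item2)
    column-index tuples. Range is 0-40 on a 5-point scale."""
    diffs = [np.abs(responses[:, a] - responses[:, b]) for a, b in pairs]
    return np.sum(diffs, axis=0)

# Placeholder item pairs for illustration only (two per dimension)
pairs = [(0, 5), (10, 15), (1, 6), (11, 16), (2, 7),
         (12, 17), (3, 8), (13, 18), (4, 9), (14, 19)]
```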
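Next, a sketch of the MD computation, assuming each responder is represented by their five subscale means; scipy.spatial.distance.mahalanobis exists for single vectors, but the pure-NumPy version below keeps the linear algebra visible for the whole sample at once.

```python
import numpy as np

def mahalanobis_d(scale_scores: np.ndarray) -> np.ndarray:
    """Mahalanobis distance of each responder's scale-score vector
    from the sample centroid. scale_scores: (n_respondents, n_scales)."""
    centroid = scale_scores.mean(axis=0)
    # Inverse covariance of the scale scores (columns are variables);
    # pinv guards against a near-singular covariance matrix
    cov_inv = np.linalg.pinv(np.cov(scale_scores, rowvar=False))
    delta = scale_scores - centroid
    # Quadratic form sqrt(d' C^-1 d), computed row-wise
    return np.sqrt(np.einsum('ij,jk,ik->i', delta, cov_inv, delta))
```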
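Finally, a sketch of the LSI computed as the longest run of identical consecutive responses in a respondent's full response vector; some authors use the mean run length instead, so treat the choice of statistic here as an assumption rather than the study's exact operationalization.

```python
import numpy as np

def lsi(row: np.ndarray) -> int:
    """Long-String Index: length of the longest run of the same
    response option across a respondent's answers, in item order."""
    longest = current = 1
    for prev, nxt in zip(row[:-1], row[1:]):
        current = current + 1 if nxt == prev else 1
        longest = max(longest, current)
    return longest

print(lsi(np.array([3, 3, 3, 1, 5, 5])))  # -> 3
```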
Procedure and Calculation of Cut-Off Scores

We administered paper-and-pencil questionnaires as part of three distinct personality studies between 2014 and 2016. Each questionnaire took less than 30 minutes to complete. The questionnaires were administered in a large-classroom setting, each proctored by a research assistant. Upon completion of the questionnaires, we thanked participants for their contributions and gave them debriefing statements before they left.

We generated cut-off scores for all validity indexes except the CRS using the logistic regression approach (see also Marjanovic et al., 2015). With this approach, we let the regression analysis empirically generate the best possible cut-off scores, maximizing rates of sensitivity and specificity (i.e., classification accuracy). There are other statistical methods of achieving this end (e.g., ROC analysis), but we prefer the logistic approach because it generates predictive models that can be used to accurately identify responder types in future data sets (Marjanovic, 2010). We also prefer it to outlier approaches (e.g., flagging responders with scores ≥ or ≤ 2 SDs from some mean score) because those too often produce intolerably low rates of sensitivity or specificity (Curran, 2016; Huang et al., 2012; Meade & Craig, 2012; Niessen et al., 2016). We delineate the process we used to generate our results in the Appendix.

Appendix

Five Steps to Differentiating Responders in Existing Questionnaire Data

For the sake of example, let us assume we have a 100-item questionnaire consisting of five 20-item measures answered on a 7-point scale, completed by 100 people. (A sketch of the full pipeline follows the steps below.)

(1) Generate uniform random data for all of the items in the questionnaire, for as many individuals as completed it. Here we would generate random data for all 100 items and all 100 cases, finishing with a total sample size of 200. It is best to select a number generator that produces whole numbers (e.g., Haahr, 2018).

(2) Create a variable at the beginning of your data set called Responder Group. Label the original sample as CRs (0s) and all of the generated random data as IRs (1s). Be sure to reverse score any reverse-scored items; the ISD works best when all items correlate strongly in the same direction.

(3) Calculate the ISD for each responder in the sample, one for each of the five measures of the questionnaire. Using most statistical software, the ISD can be calculated as if it were an intra-individual standard deviation for a within-groups study design.

(4) Aggregate the ISDs by computing the mean of the five single-scale ISDs. We call this variable M-ISD.

(5) Perform a binary logistic regression analysis using M-ISD as the predictor variable and Responder Group as the criterion. Responders with classification probabilities < .50 get labelled CRs, and responders with probabilities ≥ .50 get labelled IRs.
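The five steps translate directly into code. The sketch below uses NumPy and scikit-learn (an assumption; the original analyses were presumably run in standard statistical software), simulates the CR data purely for demonstration, and reports the classification accuracy that Step 5 feeds into the ≥ 80% criterion discussed next.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n, n_scales, n_items = 100, 5, 20  # 100 people, five 20-item measures

# Stand-in for the real questionnaire data (CRs); in practice, load your
# reverse-score-corrected responses here instead of simulating them
cr = np.clip(rng.normal(5, 1, (n, n_scales * n_items)).round(), 1, 7)

# Step 1: uniform random whole-number data, one row per real case (IRs)
ir = rng.integers(1, 8, (n, n_scales * n_items))

# Step 2: stack the samples and label Responder Group (CR = 0, IR = 1)
data = np.vstack([cr, ir])
group = np.r_[np.zeros(n), np.ones(n)]

# Steps 3-4: per-scale within-person SDs, averaged into M-ISD
# (reshape assumes each measure's 20 items occupy contiguous columns)
scales = data.reshape(-1, n_scales, n_items)
m_isd = scales.std(axis=2, ddof=1).mean(axis=1)

# Step 5: binary logistic regression of Responder Group on M-ISD;
# probabilities >= .50 are labelled IR, < .50 are labelled CR
model = LogisticRegression().fit(m_isd.reshape(-1, 1), group)
predicted = model.predict_proba(m_isd.reshape(-1, 1))[:, 1] >= .50

accuracy = (predicted == group).mean()
print(f"classification accuracy: {accuracy:.2%}")  # compare to >= 80%
```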
Validity scale researchers have often used classification accuracies ≥ 80% as a kind of "gold standard" for differentiating good from bad validity-index performance (Clark et al., 2003). For our purposes, we adopted the same ≥ 80% criterion.