----------
Scientific error: It happens, but how do we detect it?
----------
Lots of things can go wrong in a psychology experiment--participants can fail to pay attention, experimental manipulations can fail to take hold, measurements can be unreliable or invalid, and so on. Problems like these can lead to distorted data and incorrect conclusions.
So, once you have obtained your data, how can you be confident that it is informative, and not contaminated by some problem or error that occurred while collecting the data? One common answer is to examine the data to see if the hypothesis was confirmed: if so, there must not have been any major problems; if not, there may have been problems. This sounds appealing, but it is a very, very problematic approach to science--it means we will look skeptically at our results only if we don't like the outcome, but will turn off critical thinking when results are as expected. Clearly, this is a form of confirmation bias! Moreover, problems can distort our data in **any direction**--inflating small effects or diminishing large effects.
----------
The glorious positive control
----------
What we need is a way of detecting researcher error that is **independent** of the data collected to test the research hypothesis. Enter the **positive control**--an additional study or condition in which the correct result is already very well known. If we see the expected result for the positive control, it provides us with some assurance that the experiment was conducted properly, without very serious error. If, however, we do not see the expected result, it could indicate a problem in the experiment.
Positive controls are very common in the biological sciences. For example, when sequencing an unknown gene, a biologist will typically **also** sequence a known gene. The known gene, in this case, is the positive control. If the data for the known gene are as expected (the sequence obtained matches the sequence already determined by others), the scientist can have greater confidence that the sequence for the unknown gene is valid.
Note that positive controls, like anything else in science, are not a perfect guarantee--it is possible to make an error that has no effect on the positive control but that seriously distorts the data for the main research question. Hopefully, such selective errors are rare. This reveals an important point, though: **the positive control should be as similar as possible to what is being studied for the research hypothesis**. When the positive control conditions are similar to those for the main research hypothesis, it is more likely that any research error will affect **both** the data for the research hypothesis and the data for the positive control.
----------
Positive controls in psychology?
----------
Biologists and chemists often use positive controls because they have lots of "knowns"--lots of effects which are very well established and validated.
Does psychology have any knowns? Yes! There are many effects in psychology which are very well established and validated. For example, the [Stroop effect][1] is very, very robust in people who read. It has been replicated hundreds of times (MacLeod, 1991) and is so reliable that when a participant does not show a Stroop effect it can be a sign of a neurological problem (Golden, 1978). That's an ideal positive control: if you try finding a Stroop effect but fail, chances are that something has gone seriously wrong with your research or your participants.
----------
Choosing a positive control is difficult
----------
Even though psychology does have a lot of well-known effects (like the Stroop effect), choosing a good positive control can be difficult. Remember that you'd like your positive control to be similar to what you are studying. Similar in what ways?
* Effect size -- your positive control should be similar in effect size to your main topic of study. In particular, the positive control should not be very big relative to the effect of interest--finding a very big effect provides no reassurance that your study was properly conducted to detect a small effect (see the power sketch after this list).
* Domain -- ideally, your positive control should be in the same domain as your research hypothesis (e.g. a cognitive task as a positive control for a research hypothesis in cognitive psychology, a social psych task as a positive control for a research hypothesis in social psychology, etc.)
* Materials/Presentation -- ideally your positive control should be similar in format and presentation to your main study. For example, if your main study is done via a computerized interface, it would be best to have a positive control that also is completed via a computerized interface.
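To make the effect-size point concrete, here is a minimal sketch in Python (using statsmodels). The numbers are made-up assumptions--n = 30 per group, an effect of interest of d = 0.3, and a candidate positive control of d = 1.5--not values from any study discussed here.

```python
# Power for a two-group t-test under hypothetical conditions:
# n = 30 per group, alpha = .05. All effect sizes are invented.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

for label, d in [("effect of interest", 0.3),
                 ("oversized positive control", 1.5)]:
    p = analysis.power(effect_size=d, nobs1=30, alpha=0.05, ratio=1.0)
    print(f"{label}: d = {d}, power = {p:.2f}")
```

Under these assumptions the study has near-perfect power for the oversized control (power ≈ 1.00) but very little power for the effect of interest (power ≈ 0.20), so "passing" this positive control says almost nothing about whether the study could detect the effect it was designed to find.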
Ok - so a positive control should be a good match for your study. Is that all? Unfortunately, no: there are some additional important considerations.
* Novel - a good positive control should be new to the participants. You can't re-use the same positive control over and over again with the same participants or they will become insensitive to it.
* Brief - you will typically run the positive control immediately after the main study, so it helps if it is short and does not add much to the overall session length.
* Cheap/Easy - always lovely to have a task that is simple and cheap to administer!
In addition, here are some other features which can be nice to have in a positive control (though not strictly necessary):
* Quantitative - regardless of effect size, it is nice to have a positive control based on a quantitative (interval or ratio) measure. Qualitative measures can work, but they are lower powered and provide little indication of what went wrong if the positive control does not come out as planned.
* Screenable - if you can identify participants who are outliers and/or not serious about the positive control, you can use this as an exclusion criterion for your main study. This can be a big help in reducing within-group variance, similar to an Instructional Manipulation Check (don't know what that is? Read this amazing paper by [Oppenheimer et al. (2009)][2]). A minimal screening sketch follows this list.
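Here is a minimal sketch of such screening in Python (using pandas). The data frame, column names, and thresholds are all hypothetical--a plausibility cutoff and a 3-SD rule are just one common choice, not a prescription.

```python
import pandas as pd

# Hypothetical positive-control data: estimated rolls in the
# Retrospective Gambler's Fallacy task.
pc = pd.DataFrame({
    "participant": range(1, 9),
    "condition": ["all_sixes", "some_sixes"] * 4,
    "est_rolls": [20, 4, 15, 3, 9999, 5, 12, 2],
})

# Rule 1: implausible responses (cutoffs are illustrative only).
implausible = (pc["est_rolls"] < 1) | (pc["est_rolls"] > 500)

# Rule 2: statistical outliers within each condition (|z| > 3).
z = pc.groupby("condition")["est_rolls"].transform(
    lambda x: (x - x.mean()) / x.std())
outlier = z.abs() > 3

pc["exclude"] = implausible | outlier
print(pc[pc["exclude"]])  # drop these participants from the main analysis too
```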
----------
Finding positive controls
----------
How are you going to find a perfect positive control? Well, you probably won't be able to find one that is perfect in every way, but with some digging you can often identify a positive control that is useful. Here are some good sources:
* [The Many Labs projects][3] -- these are large-scale collaborative replication efforts to test some classic psychology effects in labs all over the world. They are a treasure trove of possible positive controls--each with an effect size very precisely estimated from thousands of participants. Even better, the Many Labs projects are all Open Science, so you can find the materials to adopt. You can also see the variability in effect size across labs to determine if the positive control is likely to be stable and reliable for your studies. Gold! Check the [Ready Materials][4] page for links to each Many Labs project.
* Registered Replication Reports - similar to the Many Labs projects but focusing on one effect at a time. The main problem here is that so far many of the tested effect sizes end up being near 0, which is not very useful as a positive control.
----------
Some examples
----------
[Moery & Calin-Jageman (2015)][5] tried to replicate a study [(Eskine, 2013)][6] showing that organic food exposure has large effects on moral behavior.
As a positive control they used the [Retrospective Gambler's Fallacy][7] (Oppenheimer & Monin, 2009). This is a very reliable effect in which participants are asked to imagine entering a casino where they see a gambler rolling dice. The gambler rolls either a) 3 sixes (the all-sixes condition) or b) 2 sixes and a three (the some-sixes condition). Participants are then asked to estimate how long the gambler has been rolling dice. Oppenheimer & Monin (2009) found that participants who hear the all-sixes scenario estimate substantially more rolls. The Many Labs 1 project replicated this effect and found it to be a robust and moderately large difference (Cohen's d = 0.59 across all replication sites).
In Moery & Calin-Jageman (2016) the Retrospective Gambler's Fallacy was added to the end of the replication protocol, with participants randomly assigned to the some-sixes or all-sixes condition independent of their group assignment in the main study. In each replication, Moery & Calin-Jageman (2016) found more rolls estimated in the all-sixes condition than in the some-sixes condition. They did, however, find slightly smaller effect sizes than the Many Labs project.
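As a rough illustration of how such a positive-control check can be analyzed, here is a minimal sketch in Python (using numpy and scipy). The roll estimates below are invented for illustration; they are not the Moery & Calin-Jageman (2016) data.

```python
import numpy as np
from scipy import stats

# Hypothetical estimated rolls, by positive-control condition.
all_sixes  = np.array([18, 25, 12, 30, 22, 15, 28, 20])
some_sixes = np.array([ 5,  8,  3, 10,  6,  4,  9,  7])

def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_sd = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1))
                        / (na + nb - 2))
    return (a.mean() - b.mean()) / pooled_sd

d = cohens_d(all_sixes, some_sixes)
t, p = stats.ttest_ind(all_sixes, some_sixes)
print(f"Positive control: d = {d:.2f} (Many Labs 1 benchmark: d = 0.59)")
print(f"t = {t:.2f}, p = {p:.4f}")
```

A d in the right direction and roughly in line with the benchmark is reassuring; a d near zero (or reversed) would suggest something went seriously wrong with the study.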
----------
References
----------
Eskine, K. J. (2013). Wholesome Foods and Wholesome Morals?: Organic Foods Reduce Prosocial Behavior and Harshen Moral Judgments. Social Psychological and Personality Science, 4(2), 251–254. doi:10.1177/1948550612447114
Golden, C. J. (1978). Stroop Color and Word Test: A Manual for Clinical and Experimental Uses. Chicago, IL: Stoelting.
MacLeod, C. M. (1991). Half a century of research on the Stroop effect: An integrative review. Psychological Bulletin, 109(2), 163–203. doi:10.1037/0033-2909.109.2.163
Moery, E., & Calin-Jageman, R. J. (2016). Direct and Conceptual Replications of Eskine (2013): Organic Food Exposure Has Little to No Effect on Moral Judgments and Prosocial Behavior. Social Psychological and Personality Science, 7(4), 312–319. doi:10.1177/1948550616639649
Oppenheimer, D. M., Meyvis, T., & Davidenko, N. (2009). Instructional manipulation checks: Detecting satisficing to increase statistical power. Journal of Experimental Social Psychology, 45(4), 867–872. doi:10.1016/j.jesp.2009.03.009
Oppenheimer, D. M., & Monin, B. (2009). The retrospective gambler’s fallacy: Unlikely events, constructing the past, and multiple universes. Judgment and Decision Making, 4(5), 326–334.
[1]: https://en.wikipedia.org/wiki/Stroop_effect
[2]: http://linkinghub.elsevier.com/retrieve/pii/S0022103109000766
[3]: https://osf.io/wx7ck/
[4]: https://osf.io/jx2td/wiki/Ready%20Materials/
[5]: http://spp.sagepub.com/cgi/doi/10.1177/1948550616639649
[6]: http://spp.sagepub.com/cgi/doi/10.1177/1948550612447114
[7]: http://www.decisionsciencenews.com/sjdm/journal.sjdm.org/9609/jdm9609.pdf