To suggest additional articles for this list, please go to the project main page and insert the citation as a comment (see the blue speech-bubble tab in the upper right corner). We will periodically add suggested papers to the main list, below.
I. RESEARCH METHODS
Experimental Design:
Collins, L.M., Dziak, J.J., & Li, R. (2009). Design of experiments with multiple independent variables: a resource management perspective on complete and reduced factorial designs. Psychological methods, 14(3), 202. Fulltext
Collins, L. M., Baker, T. B., Mermelstein, R. J., Piper, M. E., Jorenby, D. E., Smith, S. S., ... & Fiore, M. C. (2011). The multiphase optimization strategy for engineering effective tobacco use interventions. Annals of behavioral medicine, 41(2), 208-226. Fulltext
Kover, S.T., & Atwood, A.K. (2013). Establishing equivalence: Methodological progress in group-matching design and analysis. American Journal on Intellectual and Developmental Disabilities, 118, 3-15. Link
Mervis, C.B., & Klein-Tasman, B.P. (2004). Methodological issues in group-matching designs: α levels for control variable comparisons and measurement characteristics of control and target variables. Journal of Autism and Developmental Disorders, 34, 7-17. Fulltext
General research methods:
Martin, J. (1980). A garbage can model of the research process. In J. E. McGrath, J. Martin, & R. A. Kulka, Judgment calls in research (pp. 17–39). Beverly Hills, CA. Link
Multilevel and/or Longitudinal Design:
Duncan, S. C., Duncan, T. E., & Hops, H. (1996). Analysis of longitudinal data within accelerated longitudinal designs. Psychological Methods, 1(3), 236. Link
Nye, B., Konstantopoulos, S., & Hedges, L. V. (2004). How large are teacher effects? Educational evaluation and policy analysis, 26(3), 237-257. Fulltext
Philosophy of Science:
Mayo, D.G., & Spanos, A. (2006). Severe testing as a basic concept in a Neyman-Pearson philosophy of induction. The British Journal for the Philosophy of Science, 57, 323-357. Fulltext
Power and Sample Size:
Bakker, M., van Dijk, A., & Wicherts, J.M. (2012). The rules of the game called psychological science. Perspectives on Psychological Science, 7, 543-554. Fulltext
Button, K.S., Ioannidis, J.P.A., Mokrysz, C., Nosek, B.A., Flint, J., Robinson, E.S.J., & Munafò, M.R. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14, 365-376. Fulltext
Killip, S., Mahfoud, Z., & Pearce, K. (2004). What is an intracluster correlation coefficient? Crucial concepts for primary care researchers. The Annals of Family Medicine, 2(3), 204-208. Fulltext
Maas, C. J., & Hox, J. J. (2005). Sufficient sample sizes for multilevel modeling. Methodology: European Journal of Research Methods for the Behavioral and Social Sciences, 1(3), 86. Fulltext
Muthén, L. K., & Muthén, B. O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 9(4), 599-620. Fulltext
Spybrook, J., Raudenbush, S. W., Liu, X. F., Congdon, R., & Martínez, A. (2006). Optimal design for longitudinal and multilevel research: Documentation for the “Optimal Design” software. Survey Research Center of the Institute of Social Research at University of Michigan. Fulltext
Qualitative Methods:
Todd, Z., Nerlich, B., McKeown, S., & Clarke, D. D. (2004). Mixing Methods in Psychology: The Integration of Qualitative and Quantitative Methods in Theory and Practice. Psychology Press. Link
Replicable Science and Questionable Research Practices:
Brown, S. D., Furrow, D., Hill, D. F., Gable, J. C., Porter, L. P., & Jacobs, W. J. (2014). A Duty to Describe Better the Devil You Know Than the Devil You Don’t. Perspectives on Psychological Science, 9, 626-640. Link
Ellemers, N. (2013). Connecting the dots: Mobilizing theory to reveal the big picture in social psychology (and why we should do this). European Journal of Social Psychology, 43, 1-8. Link
Fuchs, H.M., Jenny, M., & Fiedler, S. (2012). Psychologists are open to change, yet wary of rules. Perspectives on Psychological Science, 7, 639-642. Fulltext
John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23, 524-532. Fulltext
Self- and Informant Report Methods:
Bartoshuk, L.M., Fast, K., & Snyder, D.J. (2005). Differences in our sensory worlds: Invalid comparisons with labeled scales. Current Directions in Psychological Science, 14, 122-125. Link
Vazire, S. (2006). Informant reports: A cheap, fast, and easy method for personality assessment. Journal of Research in Personality, 40, 472-481. Fulltext
Validity:
Brewer, M. B. (2000). Research design and issues of validity. Handbook of research methods in social and personality psychology, 3-16. Fulltext
Loevinger, J. (1957). Objective tests as instruments of psychological theory: Monograph Supplement 9. Psychological Reports, 3, 635-694. Link
II. DATA ANALYSIS
Uses and Misuses of Statistics:
Scarr, S. (1997). Rules of evidence: A larger context for the statistical debate. Psychological Science, 8, 16-17. Fulltext
Savalei, V., & Dunn, E. (2015). Is the call to abandon p-values the red herring of the replicability crisis? Frontiers in Psychology, 6:245. Fulltext
Applied Problems:
Cramer, A. O., van der Sluis, S., Noordhof, A., Wichers, M., Geschwind, N., Aggen, S. H., ... & Borsboom, D. (2012). Dimensions of normal personality as networks in search of equilibrium: You can't like parties if you don't like people. European Journal of Personality, 26(4), 414-431. Fulltext
Hyde, J. S. (1994). Can meta-analysis make feminist transformations in psychology? Psychology of Women Quarterly, 18, 451-462. Link
van de Leemput, I. A., Wichers, M., Cramer, A. O., Borsboom, D., Tuerlinckx, F., Kuppens, P., ... & Scheffer, M. (2014). Critical slowing down as early warning for the onset and termination of depression. Proceedings of the National Academy of Sciences, 111(1), 87-92. Fulltext
Vazire, S., & Gosling, S. D. (2004). e-Perceptions: personality impressions based on personal websites. Journal of personality and social psychology, 87(1), 123. Fulltext
Biological Psychology (neuro, geno):
Aarts, E., Verhage, M., Veenvliet, J. V., Dolan, C. V., & van der Sluis, S. (2014). A solution to dependency: using multilevel analysis to accommodate nested data. Nature neuroscience, 17(4), 491-496. Link
Allen, E. A., Erhardt, E. B., & Calhoun, V. D. (2012). Data visualization in the neurosciences: overcoming the curse of dimensionality. Neuron, 74(4), 603-608. Fulltext
Bassett, D. S., & Bullmore, E. D. (2006). Small-world brain networks. The neuroscientist, 12(6), 512-523. Fulltext
Erez, Y., Tischler, H., Moran, A., & Bar-Gad, I. (2010). Generalized framework for stimulus artifact removal. Journal of neuroscience methods, 191(1), 45-59. Fulltext
Franić, S., Dolan, C. V., Borsboom, D., Hudziak, J. J., van Beijsterveldt, C. E., & Boomsma, D. I. (2013). Can genetics help psychometrics? Improving dimensionality assessment through genetic factor modeling. Psychological methods, 18(3), 406. Fulltext
Logan, J. A., Petrill, S. A., Hart, S. A., Schatschneider, C., Thompson, L. A., Deater-Deckard, K., ... & Bartlett, C. (2012). Heritability across the distribution: An application of quantile regression. Behavior genetics, 42(2), 256-267. Fulltext
Medland, S. E., Neale, M. C., Eaves, L. J., & Neale, B. M. (2009). A note on the parameterization of Purcell’s G × E model for ordinal and binary data. Behavior genetics, 39(2), 220-229. Fulltext
Mills, K. L., & Tamnes, C. K. (2014). Methods and considerations for longitudinal structural brain imaging analysis across development. Developmental cognitive neuroscience, 9, 172-190. Link
Mumford, J. A. (2012). A power calculation guide for fMRI studies. Social cognitive and affective neuroscience, 7(6), 738-742. Fulltext
Mumford, J. A., & Poldrack, R. A. (2007). Modeling group fMRI data. Social cognitive and affective neuroscience, 2(3), 251-257. Fulltext
Yendiki, A., Koldewyn, K., Kakunoori, S., Kanwisher, N., & Fischl, B. (2013). Spurious group differences due to head motion in a diffusion MRI study. NeuroImage, 88, 79–90. Fulltext
Confidence Intervals:
Belia, S., Fidler, F., Williams, J., & Cumming, G. (2005). Researchers misunderstand confidence intervals and standard error bars. Psychological Methods, 10, 389-396. Fulltext
Fidler, F., & Loftus, G.R. (2009). Why figures with error bars should replace p values. Journal of Psychology, 217, 27-37. Fulltext
Dyadic data analysis:
Kashy, D. A., & Kenny, D. A. (2000). The analysis of data from dyads and groups. In H.T. Reis & C.M. Judd (Eds.), Handbook of research methods in social psychology (pp. 451-477). New York: Cambridge University Press. Link
Effect Size:
Chinn, S. (2000). A simple method for converting an odds ratio to effect size for use in meta-analysis. Statistics in medicine, 19(22), 3127-3131. Fulltext
Hill, C.J., Bloom, H.S., Black, A.R., & Lipsey, M.W. (2008). Empirical benchmarks for interpreting effect sizes in research. Child Development Perspectives, 2(3), 172-177. Fulltext
Latent Change Score Modeling:
Quinn, J. M., Wagner, R. K., Petscher, Y., & Lopez, D. (2014). Developmental Relations Between Vocabulary Knowledge and Reading Comprehension: A Latent Change Score Modeling Study. Child development.
Latent Class Analysis:
Nylund, K. L., Asparouhov, T., & Muthén, B. O. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural equation modeling, 14(4), 535-569. Fulltext
Logistic Models:
Azen, R., & Traxel, N. (2009). Using dominance analysis to determine predictor importance in logistic regression. Journal of Educational and Behavioral Statistics, 34(3), 319-347. Fulltext
Chinn, S. (2000). A simple method for converting an odds ratio to effect size for use in meta-analysis. Statistics in medicine, 19(22), 3127-3131. Fulltext
O'Connell, A. A. (2006). Logistic regression models for ordinal response variables (Vol. 146). Thousand Oaks, California: Sage Publications. Link
O'Connell, A. A., & McCoach, D. B. (Eds.). (2008). Multilevel modeling of educational data. IAP. Link
Peng, C. Y. J., Lee, K. L., & Ingersoll, G. M. (2002). An introduction to logistic regression analysis and reporting. The Journal of Educational Research, 96(1), 3-14. Fulltext
Yelland, L. N., Salter, A. B., Ryan, P., & Laurence, C. O. (2011). Adjusted intraclass correlation coefficients for binary data: methods and estimates from a cluster-randomized trial in primary care. Clinical Trials, 8(1), 48-58. Link
Longitudinal Analysis:
Collins, L. M., & Sayer, A. G. (2001). New methods for the analysis of change. American Psychological Association. Link
Hamaker, E. L., Nesselroade, J. R., & Molenaar, P. C. (2007). The integrated trait–state model. Journal of Research in Personality, 41(2), 295-315. Fulltext
Rabe-Hesketh, S., & Skrondal, A. (2008). Multilevel and longitudinal modeling using Stata. STATA press. Link
Meta-Analysis:
Chan, M.E., & Arvey, R.D. (2012). Meta-analysis and the development of knowledge. Perspectives on Psychological Science, 7, 79-92. Fulltext
Davis‐Kean, P. E., & Sandler, H. M. (2001). A meta‐analysis of measures of self‐esteem for young children: A framework for future measures. Child development, 72(3), 887-906. Fulltext
Eagly, A. H., & Wood, W. (1994). Using research syntheses to plan future research. In H. M. Cooper & L. V. Hedges (Eds.), The handbook of research synthesis (pp. 485-500). New York: Russell Sage Foundation. Link
Smith, M. L., & Glass, G. V. (1977). Meta-analysis of psychotherapy outcome studies. American psychologist, 32, 752. Fulltext
Tsuji, S., Bergmann, C., & Cristia, A. (2014). Community-Augmented Meta-Analyses Toward Cumulative Data Assessment. Perspectives on Psychological Science, 9, 661-665. Link
Wood, W., & Eagly, A. H. (2009). Advantages of certainty and uncertainty. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (pp. 455-472). New York: Russell Sage. Link
Moderation and Mediation:
Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. London: Sage.
Frazier, P.A., Tix, A.P., & Barron, K.E. (2004). Testing moderator and mediator effects in counseling psychology research. Journal of Counseling Psychology, 51, 115-134. Fulltext
Kraemer, H. C., Kiernan, M., Essex, M., & Kupfer, D. J. (2008). How and why criteria defining moderators and mediators differ between the Baron & Kenny and MacArthur approaches. Health Psychology, 27, S101-108. Fulltext
Iacobucci, D., Saldanha, N., & Deng, X. (2007). A meditation on mediation: Evidence that structural equation models perform better than regressions. Journal of Consumer Psychology, 17, 140-154. Fulltext
Ledgerwood, A., & Shrout, P. E. (2011). The tradeoff between accuracy and precision in latent variable models of mediation processes. Journal of Personality and Social Psychology, 101, 1174-1188. Link
Valeri, L., & VanderWeele, T. J. (2013). Mediation analysis allowing for exposure–mediator interactions and causal interpretation: Theoretical assumptions and implementation with SAS and SPSS macros. Psychological methods, 18(2), 137. Fulltext
Multilevel Modeling:
Krull, J. L., & MacKinnon, D. P. (1999). Multilevel mediation modeling in group-based intervention studies. Evaluation Review, 23(4), 418-444. Fulltext
Krull, J. L., & MacKinnon, D. P. (2001). Multilevel modeling of individual and group level mediated effects. Multivariate behavioral research, 36(2), 249-277. Fulltext
McCoach, D. B., & Kaniskan, B. (2010). Using time-varying covariates in multilevel growth models. Frontiers in psychology, 1, 17. Fulltext
Singer, J. D. (1998). Using SAS PROC MIXED to fit multilevel models, hierarchical models, and individual growth models. Journal of educational and behavioral statistics, 23(4), 323-355. Fulltext
Yelland, L. N., Salter, A. B., Ryan, P., & Laurence, C. O. (2011). Adjusted intraclass correlation coefficients for binary data: methods and estimates from a cluster-randomized trial in primary care. Clinical Trials, 8(1), 48-58. Link
Multiple Regression:
Azen, R., & Budescu, D. V. (2003). The dominance analysis approach for comparing predictors in multiple regression. Psychological methods, 8(2), 129. Fulltext
Multivariate Statistics:
Tabachnick, B. G., & Fidell, L. S. (2012). Using multivariate statistics (6th ed.). Boston: Pearson.
Scale construction:
Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological assessment, 7, 309-319. Fulltext
Structural Equation Modeling:
Ding, L., Velicer, W. F., & Harlow, L. L. (1995). Effects of estimation methods, number of indicators per factor, and improper solutions on structural equation modeling fit indices. Structural Equation Modeling: A Multidisciplinary Journal, 2(2), 119-143. Fulltext
Rhemtulla, M., Brosseau-Liard, P. É., & Savalei, V. (2012). When can categorical variables be treated as continuous? A comparison of robust continuous and categorical SEM estimation methods under suboptimal conditions. Psychological methods, 17(3), 354. Fulltext
Schermelleh-Engel, K., Moosbrugger, H., & Müller, H. (2003). Evaluating the fit of structural equation models: Tests of significance and descriptive goodness-of-fit measures. Methods of psychological research online, 8(2), 23-74. Fulltext
Survival Analysis:
Singer, J. D., & Willett, J. B. (1993). It’s about time: Using discrete-time survival analysis to study duration and the timing of events. Journal of Educational and Behavioral Statistics, 18(2), 155-195. Fulltext
Test theory:
Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. New York: Wadsworth.
Embretson, S. E., & Reise, S. P. (2013). Item response theory for psychologists. Psychology Press. Link
Publication culture:
Ledgerwood, A., & Sherman, J.W. (2012). Short, sweet, and problematic? The rise of the short report in psychological science. Perspectives on Psychological Science, 7, 60-66. Link
Reporting Practices:
Franco, A., Malhotra, N., & Simonovits, G. (2014). Publication bias in the social sciences: Unlocking the file drawer. Science, 345, 1502-1505. Link
Franco, A., Simonovits, G. & Malhotra, N. (2015). Underreporting in political science survey experiments: Comparing questionnaires to published results. Political Analysis. Link
Kashy, D. A., Donnellan, M. B., Ackerman, R. A., & Russell, D. W. (2009). Reporting and interpreting research in PSPB: Practices, principles, and pragmatics. Personality and Social Psychology Bulletin, 35, 1131-1142. Fulltext
III. BLOGS ABOUT METHODS AND STATISTICS
Dorothy Bishop. BishopBlog.
Suzi Gage, Kate Button and others. Sifting the Evidence.
Åse Kvist Innes-Ker. Åse Fixes Science.
Deborah Mayo. Error Statistics Philosophy.
Sophie Scott. Speaking Out.
Bobbie Spellman. My Perspectives (on PsychScience)
Simine Vazire. sometimes i'm wrong.