**Abstract:** Across the effects in Many Labs 1 and 3, I find that the p-curve, the
replication index (R-index), the test of insufficient variance (TIVA), and average
sample size do not predict replication outcomes. These results suggest caution in
using paper-level metrics to infer the evidential value of individual effects.
**Note:** [Please read this important blog post for additional detail on methods, data, and analyses that qualifies the conclusions of the poster.](http://www.ibm.com/ibm/responsibility/initiatives/IBMSocialGoodFellowship.html)
**Poster Session E:** Friday, 12:00-1:30 p.m., Poster 258.
[Data, analysis code, and a pre-registration are available from GitHub.](https://github.com/ecsalomon/TSR---Test-Stats-Replication)