esci for R
==========
esci is now available as a package for R.
esci is Effect Sizes and Confidence Intervals, a package for the estimation approach to inferential statistics. esci emphasizes effect sizes, interval estimates, and meta-analysis. It was designed to make it simple to obtain estimates for common research designs and to produce beautiful, modern visualizations that focus on effect sizes and uncertainty.
Currently, esci can only do frequentist confidence intervals for estimates (no bootstrap intervals or Bayesian credible intervals, at least not yet).
This page explains how to install the current alpha version of the esci package for R and also gives some uninspired examples using the mtcars dataset built into base R.
If you use esci in R, your feedback and bug reports are more than welcome. Please post these to the GitHub page for the project: [https://github.com/rcalinjageman/esci][1]
**Be warned that the esci package is still in alpha**, meaning that some functions and results objects will still change, potentially in ways that break code you develop for the current version. Documentation is also minimal to non-existent. Although the state of the package is rough, basic analyses are working, and there should be steady improvement towards a stable and well-documented package throughout 2020.
The plots generated by esci are ggplot2 objects, so once you have one you can modify it with any ggplot2 function or extension package.
Installing esci for R
---------------------
esci for R is not on CRAN (yet).
You can install the package directly from GitHub, but to do so you will need to have the devtools package installed first:

    install.packages("devtools")
    devtools::install_github("rcalinjageman/esci")

Estimate a mean
---------------------
esci can estimate a single mean and CI. This is the estimation replacement for a one-sample t or z-test. In this example we estimate the average miles per gallon (mpg) for the cars in the mtcars population.

    library(esci)

    # Estimate the mean of mpg with a 95% CI
    estimate <- estimateMean(mtcars,
      mpg,
      conf.level = .95
    )
    estimate$summary_data
    plotEstimatedMean(estimate)

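
If you want to sanity-check the interval, here is a minimal base R sketch. It assumes esci reports the usual t-based confidence interval for a single mean; this page does not say how esci computes it, so treat the comparison as a check rather than a description of esci's internals.

```r
# Base R only; esci is not needed for this check.
x <- mtcars$mpg
n <- length(x)
m <- mean(x)
se <- sd(x) / sqrt(n)                          # standard error of the mean
t_crit <- qt(1 - (1 - 0.95) / 2, df = n - 1)   # two-sided 95% critical value

manual_ci <- c(lower = m - t_crit * se, upper = m + t_crit * se)

# t.test() reports the same t-based interval
builtin_ci <- t.test(x, conf.level = 0.95)$conf.int

print(manual_ci)
print(builtin_ci)
```

If the limits in estimate$summary_data differ from these, esci is using a different interval method than assumed here.
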
Like other plot functions in esci, plotEstimatedMean returns a ggplot2 object, so you can modify it with any ggplot2 function or extension package. For example (assuming you have ggplot2 installed):

    myplot <- plotEstimatedMean(estimate)
    myplot <- myplot + ggplot2::labs(title = "my title")

![esci - estimate mean](https://files.osf.io/v1/resources/d89xg/providers/osfstorage/5eeebe5f659828013ecf2fd4?mode=render =30%x)
Estimate an independent mean difference
---------------------------------------
esci can estimate the difference between two independent means. This is the estimation replacement for an independent-samples t-test. In this example we estimate the difference in mpg for automatic vs. manual cars in the mtcars population.

    library(esci)

    # Work on a copy and turn am into a labelled factor
    data <- mtcars
    data$am <- as.factor(data$am)
    levels(data$am) <- c("automatic", "manual")

    estimate <- estimateMeanDifference.default(data,
      am,
      mpg,
      paired = FALSE,
      var.equal = TRUE,
      conf.level = .95,
      reference.group = 1
    )
    estimate$summary_data
    plotEstimatedDifference(estimate)

The estimateMeanDifference function expects a grouping variable that is a factor. For this example, we use the am variable in mtcars, which encodes whether the car has an automatic or manual transmission. In the mtcars data, this is initially coded as a numeric column of 0s and 1s. To change this to a factor, we first make a copy of mtcars (data <- mtcars). Then we redefine am as a factor. It is not required, but it also helps to recode the levels of am to real labels (automatic and manual) rather than 0s and 1s.
The reference.group parameter defines how the contrast between the means is set:
* reference.group = 1 sets the first level to be the reference group, so the contrast is: Mdiff = level2 - level1
* reference.group = 2 sets the second level to be the reference group, so the contrast is: Mdiff = level1 - level2
* the default is reference.group = 2
![esci - estimate independent mean difference](https://files.osf.io/v1/resources/d89xg/providers/osfstorage/5eeebf20659828013bcf2daa?mode=render =30%x)
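
To make the direction of the contrast concrete, here is a small base R sketch of the two options described above. It does not call esci; it only mirrors the arithmetic, and the equal-variance interval from t.test() is an assumed stand-in for what esci reports with var.equal = TRUE.

```r
data <- mtcars
data$am <- factor(data$am, levels = c(0, 1), labels = c("automatic", "manual"))

# Mean mpg per factor level (level 1 = automatic, level 2 = manual)
group_means <- tapply(data$mpg, data$am, mean)

# reference.group = 1: Mdiff = level2 - level1 (manual - automatic)
mdiff_ref1 <- unname(group_means[2] - group_means[1])
# reference.group = 2: Mdiff = level1 - level2 (automatic - manual)
mdiff_ref2 <- unname(group_means[1] - group_means[2])

# t.test() contrasts the first level minus the second, so put manual first
# to get the manual - automatic interval.
data$am_rev <- relevel(data$am, ref = "manual")
ci_ref1 <- t.test(mpg ~ am_rev, data = data, var.equal = TRUE,
                  conf.level = 0.95)$conf.int

print(c(mdiff_ref1 = mdiff_ref1, mdiff_ref2 = mdiff_ref2))
print(ci_ref1)
```

The two contrasts have the same size and opposite signs, so the choice of reference group only changes how the difference is read.
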
This function can also be called using summary data rather than raw data:

    # m1/s1/n1 describe the comparison group, m2/s2/n2 the reference group
    estimate <- estimateMeanDifference.numeric(m1 = 12,
      m2 = 10,
      s1 = 2,
      s2 = 2.1,
      n1 = 10,
      n2 = 20,
      labels = c("Treated", "Control"),
      paired = FALSE,
      var.equal = TRUE,
      conf.level = 0.95
    )
    estimate$summary_data
    plotEstimatedDifference(estimate)

    # Same plot with some of the available options
    plotEstimatedDifference(estimate,
      ylims = c(0, 20),
      ylab = "My dependent variable",
      xlab = "Treated vs. Control",
      rope = c(-1, 1),
      grouplabels = c("MyTreated", "MyControl")
    )

You define the means of the two groups (m1 and m2; m1 is the comparison group, m2 is the reference group), the standard deviations of the two groups (s1 and s2), and the sample sizes of the two groups (n1 and n2).
Optional parameters include group labels (a vector of two labels), the paired argument (which defaults to FALSE), var.equal to say whether equal variances should be assumed (defaults to TRUE), and a confidence level.
This returns an esci results object which can be plotted with plotEstimatedDifference. Some of the options for the plot are shown in the example above.
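
For reference, here is how a two-group interval can be computed by hand from the same kind of summary data. This is the textbook pooled-variance calculation, written as a hypothetical helper (ci_two_groups is not part of esci); it assumes esci uses the same approach when var.equal = TRUE, which this page does not confirm.

```r
# Hypothetical helper, not an esci function: pooled-variance CI for m1 - m2.
ci_two_groups <- function(m1, m2, s1, s2, n1, n2, conf.level = 0.95) {
  df <- n1 + n2 - 2
  sp <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df)  # pooled SD
  se <- sp * sqrt(1 / n1 + 1 / n2)                      # SE of the difference
  t_crit <- qt(1 - (1 - conf.level) / 2, df)
  mdiff <- m1 - m2
  c(mdiff = mdiff, lower = mdiff - t_crit * se, upper = mdiff + t_crit * se)
}

# Same summary values as the example above
res <- ci_two_groups(m1 = 12, m2 = 10, s1 = 2, s2 = 2.1, n1 = 10, n2 = 20)
print(res)
```
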
Estimate a paired mean difference
---------------------------------------
esci can estimate the difference between two paired means. This is the estimation replacement for a paired t-test. In this example we estimate the difference between the number of gears and the number of carburetors for the cars in the mtcars population (this comparison makes no substantive sense, but it lets us stick with a single built-in dataset).

    library(esci)
    data <- mtcars

    # gear and carb are treated as the two paired measures
    estimate <- estimateMeanDifference.default(data, gear, carb,
      paired = TRUE,
      conf.level = .95,
      reference.group = 2
    )
    estimate$summary_data
    plotEstimatedDifference(estimate)

As above, the reference.group parameter defines the way the contrast between means is constructed. You can omit this and a default of 2 will be used (the second level in the factor will be used as the reference group, so Mdiff = level1 - level2).
This function can also be called with summary data:

    # For paired summary data, pass one sample size (n1) and the correlation r
    estimate <- estimateMeanDifference.numeric(m1 = 12,
      m2 = 10,
      s1 = 2,
      s2 = 2.1,
      n1 = 15,
      r = 0.7,
      paired = TRUE,
      labels = c("Post", "Pre"),
      conf.level = 0.95
    )
    estimate$summary_data
    plotEstimatedDifference(estimate)

![esci - estimate paired mean difference](https://files.osf.io/v1/resources/d89xg/providers/osfstorage/5eeebf76145b1a01575338bc?mode=render =30%x)
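
The paired summary-data call needs the correlation r because the standard deviation of the difference scores depends on it. Here is a minimal base R sketch of that standard calculation; ci_paired is a hypothetical helper, not an esci function, and esci is only assumed to follow the same formula.

```r
# Hypothetical helper, not an esci function: CI for a paired mean difference
# from summary statistics and the correlation between the two measures.
ci_paired <- function(m1, m2, s1, s2, n, r, conf.level = 0.95) {
  s_diff <- sqrt(s1^2 + s2^2 - 2 * r * s1 * s2)  # SD of the difference scores
  se <- s_diff / sqrt(n)
  t_crit <- qt(1 - (1 - conf.level) / 2, df = n - 1)
  mdiff <- m1 - m2
  c(mdiff = mdiff, lower = mdiff - t_crit * se, upper = mdiff + t_crit * se)
}

# Same summary values as the example above
res <- ci_paired(m1 = 12, m2 = 10, s1 = 2, s2 = 2.1, n = 15, r = 0.7)
print(res)

# Cross-check with raw data: the summary route should match a paired t.test()
g <- mtcars$gear
k <- mtcars$carb
from_summary <- ci_paired(mean(g), mean(k), sd(g), sd(k), length(g), cor(g, k))
from_raw <- t.test(g, k, paired = TRUE, conf.level = 0.95)$conf.int
print(from_summary)
print(from_raw)
```

A higher r shrinks s_diff, which is why paired designs with strongly correlated measures give narrower intervals.
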
Estimate a proportion
---------------------------------------
esci can estimate a proportion and its confidence interval. Here we will estimate the proportion of cars in the mtcars population with an automatic transmission:

    library(esci)

    # Work on a copy and turn am into a labelled factor
    data <- mtcars
    data$am <- as.factor(data$am)
    levels(data$am) <- c("automatic", "manual")

    estimate <- estimateProportion.default(data, am,
      case.level = "automatic"
    )
    estimate$summary_data
    plotEstimatedProportion(estimate)

esci expects the variable for a proportion to be stored as a factor. In mtcars, the type of transmission is stored as a numeric column, so we convert it to a factor. We also label the levels to make things easier.
The case.level parameter determines which level of the factor is treated as the case, that is, the level whose proportion is estimated and plotted. The default is 1. You can specify the case level by the ordinal number of the level or by name (enclosed in quotes). So, in the example above we specify that we want to know the proportion automatic. But we could obtain the proportion manual by passing case.level = 2 (because manual is the second level of the factor) or by passing case.level = "manual" (because that's the name of the level).
![esci - estimate proportion](https://files.osf.io/v1/resources/d89xg/providers/osfstorage/5eeebfb7659828013ecf3151?mode=render =30%x)
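
If the ordinal-versus-name distinction is unclear, this base R sketch shows how the two ways of naming a level line up. resolve_level is a hypothetical helper written for illustration; it is not how esci is known to resolve case.level.

```r
am <- factor(mtcars$am, levels = c(0, 1), labels = c("automatic", "manual"))

levels(am)   # level 1 is "automatic", level 2 is "manual"
table(am)    # counts per level

# Hypothetical helper, not an esci function: accept a level by position or name
resolve_level <- function(f, case.level = 1) {
  if (is.numeric(case.level)) levels(f)[case.level] else case.level
}

# Proportion of cars at the chosen case level
p_auto   <- mean(am == resolve_level(am, "automatic"))
p_manual <- mean(am == resolve_level(am, 2))
print(c(automatic = p_auto, manual = p_manual))
```
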
This function can also be called with summary data:

    # 20 cases out of a total sample of 100
    estimate <- estimateProportion.numeric(cases = 20,
      n = 100,
      caselabels = c("Depressed", "Not Depressed"),
      conf.level = 0.95
    )
    estimate$summary_data
    plotEstimatedProportion(estimate)

For summary data, you define the number of cases (outcomes of interest) and the total sample size (n). It can be very helpful (but is not required) to pass a vector of 2 labels for cases and for non-cases. The conf.level argument is optional and defaults to 0.95.
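
This page does not say which interval method esci uses for a single proportion, so the sketch below just shows two common frequentist choices in base R for the same summary data: the Wilson score interval (prop.test without continuity correction) and the exact Clopper-Pearson interval (binom.test). Expect esci's limits to be close to these, not necessarily identical.

```r
cases <- 20
n <- 100
p_hat <- cases / n

# Wilson score interval
wilson <- prop.test(cases, n, conf.level = 0.95, correct = FALSE)$conf.int

# Exact (Clopper-Pearson) interval
exact <- binom.test(cases, n, conf.level = 0.95)$conf.int

print(p_hat)
print(wilson)
print(exact)
```
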
Estimate a proportion difference
---------------------------------------
esci can estimate the difference between two independent dichotomies (nominal variables with only 2 mutually exclusive levels). This is the estimation replacement for a 2x2 chi-square test. In this example, we compare the proportion of automatic cars with vs = 1 to the proportion of manual cars with vs = 1. (In mtcars, vs encodes engine shape: 0 = V-shaped, 1 = straight.)

    library(esci)

    # Work on a copy and turn both variables into labelled factors
    data <- mtcars
    data$am <- as.factor(data$am)
    levels(data$am) <- c("automatic", "manual")
    data$vs <- as.factor(data$vs)
    levels(data$vs) <- c("zero", "one")

    estimate <- estimateProportionDifference.default(data, am, vs,
      case.level = "one",
      group.level = "manual"
    )
    estimate$summary_data
    plotEstimatedProportionDifference(estimate)

esci estimates the difference in proportions, which is often called the "Risk Difference". It does not yet output odds ratio or log odds ratio, but that's in the works.
The case.level and group.level parameters are optional. The case.level parameter specifies which level of the outcome variable is used to calculate the proportion. In the example, we specify "one" as the level, so we're calculating the proportion of cars that have "one" as their vs status. We can specify the case level by the ordinal number of the level or by name. So, we could instead obtain the proportion for "zero" with case.level = 1 or case.level = "zero" (because "zero" is the first level of the vs variable; admittedly, this example is a bit confusing). If not passed, case.level defaults to level 1.
The group.level variable works the same way--it specifies the group level of interest by name or ordinal value of the level. If not passed, group.level defaults to level 1.
What happens if your outcome variable has more than 2 levels? esci will calculate proportion of case.level against all other levels (it will collapse to a factor with just 2 levels, the case.level vs. all others).
What happens if your grouping variable has more than 2 levels? esci will group by group.level vs. all others (it will collapse to just 2 levels: the group.level and not.group.level).
![esci - estimate proportion difference](https://osf.io/txfdp/download =30%x)
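
The "one level vs. all others" collapsing described above can be pictured with a few lines of base R. collapse_to_two is a hypothetical helper for illustration only; the "not." prefix simply mirrors the not.group.level wording used above and is not a confirmed esci label.

```r
# Hypothetical helper, not an esci function: keep one level, pool the rest.
collapse_to_two <- function(f, keep) {
  f <- as.factor(f)
  other <- paste0("not.", keep)
  factor(ifelse(f == keep, keep, other), levels = c(keep, other))
}

cyl <- factor(mtcars$cyl)          # three levels: "4", "6", "8"
cyl2 <- collapse_to_two(cyl, keep = "4")

table(cyl)                         # original counts per level
table(cyl2)                        # counts after collapsing to 2 levels
mean(cyl2 == "4")                  # proportion of 4-cylinder cars vs. all others
```
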
This function can also be used with summary data:

    # cases1/n1 describe the comparison group, cases2/n2 the reference group
    estimate <- estimateProportionDifference.numeric(cases1 = 20,
      n1 = 100,
      cases2 = 10,
      n2 = 101,
      caselabels = c("Depressed", "Not Depressed"),
      grouplabels = c("Females", "Males"),
      conf.level = 0.95
    )
    estimate$summary_data
    plotEstimatedProportionDifference(estimate)

You must define cases1 with n1 (cases and total observations in the comparison group) and cases2 with n2 (cases and total observations in the reference group). It is optional to provide a vector of 2 case labels. It is also optional to provide a vector of 2 group labels. The conf.level argument is optional and defaults to 0.95.
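
As a rough check on the risk difference, here is the simple normal-approximation (Wald) interval computed by hand and with prop.test. Whether esci uses this method or a different one is not stated here, so the limits may not match esci's output exactly.

```r
cases1 <- 20; n1 <- 100   # comparison group
cases2 <- 10; n2 <- 101   # reference group

p1 <- cases1 / n1
p2 <- cases2 / n2
rd <- p1 - p2                                        # risk difference

se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # Wald standard error
z <- qnorm(1 - (1 - 0.95) / 2)
wald_ci <- c(rd - z * se, rd + z * se)

# prop.test() without continuity correction gives the same interval
builtin_ci <- prop.test(c(cases1, cases2), c(n1, n2),
                        conf.level = 0.95, correct = FALSE)$conf.int

print(rd)
print(wald_ci)
print(builtin_ci)
```
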
Estimate a correlation
---------------------------------------
esci can estimate a linear correlation (Pearson's r) and its CI. The underlying analysis is the same as a standard correlation test, but the emphasis is not on rejecting a null hypothesis; it is on the CI and on which r values remain compatible with the data. In this example we'll correlate horsepower (hp) with miles per gallon (mpg).
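
    library(esci)
    estimate <- estimateCorrelation(mtcars, hp, mpg)
    estimate

    # r and its CI on the r scale
    plotEstimatedCorrelation(estimate)

    # Scatter plot with regression line, its CI, prediction interval, and a prediction at x = 200
    plotScatterPlot(estimate,
      show.line = TRUE,
      show.meanCI = TRUE,
      show.PI = TRUE,
      predictx = 200
    )
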
plotEstimatedCorrelation shows the r value and its CI on an r scale (from -1 to 1).
plotScatterPlot makes a scatter plot, with the option to show the regression line, the CI on the regression line, prediction intervals (really useful), and the option to pass an x value to obtain and graph a predicted y value (pass predictx for the graph to show how the prediction is made).
![esci - estimate correlation](https://osf.io/by8z9/download =30%x)
![esci - scatterplot](https://osf.io/mcyns/download =30%x)
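
The pieces shown in these plots have direct base R counterparts, sketched below. This assumes hp is the x variable and mpg the y variable (matching the order in the estimateCorrelation call) and that the prediction at predictx = 200 comes from an ordinary linear regression; neither is confirmed by this page.

```r
# r and its CI (cor.test uses the Fisher z transformation for the interval)
ct <- cor.test(mtcars$hp, mtcars$mpg, conf.level = 0.95)
r <- unname(ct$estimate)
r_ci <- ct$conf.int

# Regression of mpg on hp, then a prediction at hp = 200
fit <- lm(mpg ~ hp, data = mtcars)
new_x <- data.frame(hp = 200)

# CI on the regression line at x = 200 (uncertainty about the mean mpg)
mean_ci <- predict(fit, new_x, interval = "confidence", level = 0.95)
# Prediction interval at x = 200 (uncertainty about a single new car)
pred_pi <- predict(fit, new_x, interval = "prediction", level = 0.95)

print(c(r = r, lower = r_ci[1], upper = r_ci[2]))
print(mean_ci)
print(pred_pi)
```

The prediction interval is always wider than the CI on the line, because it includes the scatter of individual cars around the line as well as the uncertainty in the line itself.
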
Estimate a 2x2 interaction
---------------------------------------
esci can produce estimates for a 2x2 between-subjects factorial design. I can't make this work with the mtcars data, so here are:
* A sample data file: [https://osf.io/6hv5s/][3]
* R project: [https://osf.io/4wjr2/][4]
* R script: [https://osf.io/3zxks/][5]
Download these into the same directory, open the R project file in RStudio, and you'll be able to examine and run the example.
Meta-analysis
---------------------------------------
esci makes it easy to conduct a meta-analysis. It can currently handle meta-analyses for:
* Two group designs using raw data (means, standard deviations, sample sizes)
* Two group designs using the standardized mean difference (Cohen's d, sample sizes)
* Correlations (r and sample size)
* Proportion differences (2-group contingency table)
[The files section][6] has examples for the two-groups raw and the two-proportions meta-analyses.
[1]: https://github.com/rcalinjageman/esci
[2]: https://files.osf.io/v1/resources/d89xg/providers/osfstorage/5eeebfb7659828013ecf3151?mode=render
[3]: https://osf.io/6hv5s/
[4]: https://osf.io/4wjr2/
[5]: https://osf.io/3zxks/
[6]: https://osf.io/d89xg/files/