Over- and Underweighting of Extreme Values

doi:10.17605/OSF.IO/X83PK

# Experiment Code, Data, Analysis Code: Over- and Underweighting of Extreme Values in Decisions From Sequential Samples

In this repository we supply the experiment code, raw data, and analysis code for the manuscript:

> "Over- and Underweighting of Extreme Values in Decisions From Sequential Samples".

Authors:

- Verena Clarmann von Clarenau
- Stefan Appelhoff
- Thorsten Pachur
- Bernhard Spitzer

Links:

- Preprint: https://doi.org/10.31234/osf.io/6yj4r
- Journal article: *forthcoming*
- Experiment code, data, analysis code: (this repository) https://doi.org/10.17605/osf.io/x83pk

**Please read the contents of this file carefully.**

The analysis code is placed at the root of this repository, and is intended to be run using MATLAB R2020b and the following toolboxes:

- Statistics and Machine Learning Toolbox
- Econometrics Toolbox
- Optimization Toolbox
- Parallel Computing Toolbox
- The bayesFactor Toolbox (https://github.com/klabhub/bayesFactor, at commit `7fb8471ac95597d3439e28d999cf15be512e1b79`)
    - follow the installation as detailed in their documentation (this toolbox needs to be explicitly added to the MATLAB path)
    - this toolbox is also supplied as part of this repository, see the `external/` directory

To reproduce the results of the study, the following scripts have to be run in order:

1. `produce_derivatives.m` (*optional*, run only if you want to reproduce the saved files in the `derivatives/` directory; takes a long time!)
2. `Figure1.m`
3. `Figure2.m`
4. `Figure3.m`
5. `Figure4_Figure5.m`
6. `produce_statistics.m`

The remaining `*.m` files are functions that are being called from the above scripts.

The `*.mat` files in the `derivatives/` directory contain saved data that can be reproduced by running the `produce_derivatives.m` script. For an overview of all `*.mat` files, see the section below.

The `stats.txt` file in the `derivatives/` directory contains the statistics as reported in the manuscript.
It can be reproduced by running the `produce_statistics.m` script.

Figures of the manuscript are placed in the `figures/` directory when running the scripts. To reproduce Figure 1 of the manuscript, open the existing file `figures/Paradigm_Manuscript.pptx` in PowerPoint and export the first slide as `figures/Paradigm.tiff`, which is then used in `Figure1.m`.

The following figures are to be considered supplemental (they are not included in the manuscript):

- `Figure2_dual.pdf` -- Same as Figure 2 in the manuscript, but with bias and leakage parameters set to their mean estimates (averaged over range conditions: small/large)
- `Figure2_single.pdf` -- see above but for the "single" task
- `parameter_recovery_corr_matrix_no_mean.pdf` -- parameter recovery correlation matrices for each of the 8 experimental conditions
- `parameter_recovery_corr_matrix_mean.pdf` -- see above, but averaged over distribution (uniform/Gaussian) and range (small/large) conditions, so 2 correlation matrices (mean single and mean dual)

The experiment was run using Qualtrics (https://www.qualtrics.com), and the corresponding code is placed in the `experiment_code/` directory. There are 8 `.qsf` files (Qualtrics Survey Format), corresponding to the 8 conditions of the experiment.

See the `LICENSE` file for information on how the experiment and analysis code are licensed. Note that subcomponents of the analysis code may have a different license; see for example the license for the bayesFactor package in the `external/` directory, or the `cividis.m` file and its documentation.

## `derivatives/*.mat` files

These files can be found in the `derivatives/` directory.
- `gaussian_4param_nognorm.mat` - fit of `k`, `s`, `b`, and `l` parameters without gain normalization (Gaussian)
- `gaussian_4param_nognorm_k1.mat` - fit of `s`, `b`, and `l` parameters with `k` fixed to 1 (Gaussian)
- `gaussian_4param_rangegnorm.mat` - fit of `k`, `s`, `b`, and `l` parameters with gain normalization over the whole range of stimuli (Gaussian)
- `gaussian_4param_trialgnorm.mat` - fit of `k`, `s`, `b`, and `l` parameters with trial level gain normalization (Gaussian)
- `uniform_4param_nognorm.mat` - same as `gaussian_4param_nognorm` above but for uniform
- `uniform_4param_nognorm_k1.mat` - same as `gaussian_4param_nognorm_k1` above but for uniform
- `uniform_4param_rangegnorm.mat` - same as `gaussian_4param_rangegnorm` above but for uniform
- `uniform_4param_trialgnorm.mat` - same as `gaussian_4param_trialgnorm` above but for uniform
- `uniform_2param.mat` - fit of `k` and `s` with `b` and `l` each fixed to 0 (uniform only, this analysis was not done for Gaussian)
- `uniform_3paramb.mat` - fit of `k`, `s`, and `b` with `l` fixed to 0 (uniform only, this analysis was not done for Gaussian)
- `uniform_3paraml.mat` - fit of `k`, `s`, and `l` with `b` fixed to 0 (uniform only, this analysis was not done for Gaussian)
- `uniform_4param_nognorm_biasoutside.mat` - same as `uniform_4param_nognorm` above, but the formula changed from `dv = sign(X+b) .* abs(X+b) .^k` to `dv = sign(X) .* abs(X) .^k + b` (uniform only, this analysis was not done for Gaussian)
- `parameters.mat` - a combination of `uniform_4param_nognorm.mat` and `gaussian_4param_nognorm.mat` for convenience
- `optimsim.mat`, `optimsim_b-dual_l-dual.mat`, `optimsim_b-single_l-single.mat` - saved values for Figure 2 ...
  the files with `_b-<stream>_l-<stream>` postfix, where `<stream>` may be `single` or `dual`, are alternative values when bias and leak parameters were set to the empirical estimates from the single and dual stream tasks respectively
- `optimsim_largerange.mat`, `optimsim_largerange_b-dual_l-dual.mat`, `optimsim_largerange_b-single_l-single.mat` - same as `optimsim.mat` but for larger ranges of the `k` and `s` parameters
- `allparams_recovery.mat` - parameter recovery data

See `produce_derivatives.m` for more information.

# Data

The data was collected via Prolific (https://www.prolific.co/) and Qualtrics (https://www.qualtrics.com/). Of the initially planned 100 participants for each of the 8 experimental conditions, a subset of participants was included (slightly fewer than 800 overall) after excluding participants for technical reasons (such as not signing the consent, or participating on a mobile device).

The demographics of the overall sample of included participants are reported in the `data/demographics.csv` file. It contains the following columns:

- `distribution` --> "uniform" or "gaussian", experiment factor
- `stream` --> "single" or "dual", experiment factor
- `range` --> "small" or "large", experiment factor
- `xlsx_sheet` --> integer value indicating which sheet in the corresponding `.xlsx` file (see below) this participant corresponds to. If empty (not available), this participant is not present in the `.xlsx` files (most often because attention checks were not passed)
- `age` --> integer value with the age of the participant, may be empty (not available)
- `sex` --> either 1 for "male", or 0 for "female", or empty if data is not available or the participant selected "prefer not to say"

In addition to the demographics of the overall sample of included participants, the raw experiment data is provided as a set of 8 Excel files (`.xlsx`).
Note that raw experiment data is only provided for a subset of the overall sample from `data/demographics.csv` -- we do not provide the data for participants who (i) did not complete all trials (e.g., because attention checks were failed and the participant was excluded midway through the experiment), or (ii) accidentally participated in several experiment conditions (n=13, see manuscript for details).

**All analyses reported in the manuscript exclusively use the `.xlsx` files provided here!**

The 8 `.xlsx` files are structured as follows:

- each file corresponds to one condition in the 2x2x2 design (=8 files in total) and is named `data/<distribution>_<task>_<range>.xlsx`:
    - distribution (gaussian vs. uniform)
    - task ("stream": single vs. dual)
    - range (small vs. large)
- each "sheet" in the Excel file is the data for one participant
    - see the `xlsx_sheet` variable in `data/demographics.csv` to match a given participant with their demographics data
- rows: trials
    - 250 in total per participant and condition
    - NOTE: in very few cases, the data for the last trial(s) may be missing due to server outages
- column 1-8: numerical values of samples
    - in small range: 1, 2, ..., 9; in large range: 100, 200, ..., 900
- column 9-16: corresponding color of samples
    - 0=red, 1=blue
- column 17: choice
    - in single-stream: 0=lower, 1=higher than 5; in dual-stream: 0=red, 1=blue had a higher average

See the `data/LICENSE` file for information on how the data is licensed.
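The column layout of the `.xlsx` files can be read in MATLAB roughly as follows. This is a minimal sketch and not part of the repository's analysis code. The file name `uniform_single_small.xlsx` is inferred from the naming pattern, the sheet index is arbitrary, and the sketch assumes that each sheet holds only the 17 numeric columns with no header row. A small made-up matrix is used when the file is not found, so the snippet also runs outside the repository.

```matlab
% Sketch: load one participant (one sheet) from one condition file.
% ASSUMPTIONS: file name inferred from data/<distribution>_<task>_<range>.xlsx;
% sheet 1 chosen arbitrarily; sheets hold only the 17 numeric columns.
fname = fullfile('data', 'uniform_single_small.xlsx');
sheet = 1;  % corresponds to xlsx_sheet in data/demographics.csv

if isfile(fname)
    T = readmatrix(fname, 'Sheet', sheet);
else
    % Made-up fallback (two trials) so the sketch runs without the data.
    T = [1:8,    0 1 0 1 0 1 0 1, 1; ...
         9:-1:2, ones(1, 8),      0];
end

samples = T(:, 1:8);    % sample values: 1..9 (small) or 100..900 (large)
colors  = T(:, 9:16);   % 0 = red, 1 = blue
choice  = T(:, 17);     % single: 0 = lower, 1 = higher; dual: 0 = red, 1 = blue

trialMean = mean(samples, 2);  % average of the 8 samples in each trial
fprintf('%d trials, P(choice = 1) = %.2f\n', size(T, 1), mean(choice, 'omitnan'));
```

The `'omitnan'` flag is there because the last trial(s) may be missing for a few participants. Whether missing trials appear as `NaN` rows or as shorter sheets is not stated here, so check `size(T, 1)` before assuming 250 rows.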
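The entry for `uniform_4param_nognorm_biasoutside.mat` in the `derivatives/*.mat` list quotes two forms of the decision-value formula. The sketch below only evaluates those two expressions side by side to show where the bias `b` enters. The input values, and the assumption that `X` is already centred on zero, are hypothetical and chosen for illustration; see `produce_derivatives.m` for how the analysis code actually prepares `X`.

```matlab
% Two forms of the decision value quoted in the derivatives file list.
% X: sample values, k: exponent, b: bias.
% ASSUMPTION: X is already centred on zero (not stated in this README).
dvInside  = @(X, k, b) sign(X + b) .* abs(X + b) .^ k;  % bias added before the power
dvOutside = @(X, k, b) sign(X) .* abs(X) .^ k + b;      % "biasoutside": bias added after

X = [-1 -0.5 0 0.5 1];  % hypothetical centred sample values
k = 2;                  % hypothetical exponent
b = 0.1;                % hypothetical bias

disp([X; dvInside(X, k, b); dvOutside(X, k, b)]);  % rows: X, inside, outside
```

With `k = 1` both expressions reduce to `X + b`, so the two variants only differ when the exponent departs from 1.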
