# Chapter 3: Data Organization
@[toc]
# 3.1 Lab File-System
## 3.1.1 Experiment/Study File System
The lab has two storage spaces, which all lab computers should have access to:
- MudrikHub (should be on drive Z), containing all the data of the active experiments, and the common resources directory.
- MudrikLabArchives (should be on drive Y), containing inactive projects (divided into published projects, unpublished projects, and “Liad Projects” which are some remaining files from Liad’s postdoc), and some of the lab’s stimulus banks.
The spaces are only available from the university’s computers, and you need to request access to them. Lab managers will make sure you’re authorized, and then you can log in using your TAU credentials.
Your project’s folder should contain all your data and materials. Upload your data to the lab filestream daily. **You do not want to lose precious data and then have to run the experiment again. Of course, we encourage you to also back up your data locally on a hard drive: you can never have too much backup…**
There is a strict protocol on how to organize a project folder, explained in the next section.
While collecting data, since all experiments should be run locally (check the “running an experiment” chapter for further details), you need to create a small copy of your project's folder on the computer in the experimental room. Then, you should synchronize the collected data into your actual folder, which is in the MudrikHub.
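
The synchronization step could be scripted along the following lines. This is only a minimal sketch, not a lab-approved tool: the function name `sync_raw_data` and the example paths are hypothetical, and it assumes that files already present on the MudrikHub should never be overwritten.

```python
import shutil
from pathlib import Path


def sync_raw_data(local_dir: Path, hub_dir: Path) -> list[Path]:
    """Copy files that exist locally but not yet in the hub folder.

    Files that already exist in hub_dir are skipped, never overwritten.
    Returns the list of newly created destination paths.
    """
    copied = []
    for src in sorted(local_dir.rglob("*")):
        if not src.is_file():
            continue  # directories are created on demand below
        dst = hub_dir / src.relative_to(local_dir)
        if dst.exists():
            continue  # assumption: data already on the hub stays untouched
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves file timestamps
        copied.append(dst)
    return copied
```

For example, `sync_raw_data(Path(r"C:\Awesome\Raw Data"), Path(r"Z:\Experiments\Awesome\Raw Data"))` would copy only the new session files; both paths here are placeholders for your own folders.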
## 3.1.2 File System Organization
Author: Rony Hirschhorn (hirschhorn@mail.tau.ac.il)
### 3.1.2.1 Experiment Folder Organization in the Lab
All of the lab’s running experiments and projects are saved in the “Experiments” folder in the MudrikHub.
Before starting to run a new experiment, open an experiment folder in “Experiments”. Let’s say that your experiment is named “Awesome”; its folder would then be “/Experiments/Awesome”.
Your **first** step after opening a folder for your Awesome experiment is to log it in the lab’s experiments metadata file, located in the main experiments folder (“/Experiments/experiments metadata”).
This Excel sheet logs every experiment that was run or is running in the lab (past or present). It looks something like this (example):
![enter image description here][1]
- Year: the year in which the project began
- Super project: the name of your project
- In charge: your name
- Email: your email
The structure of your project’s folder should adhere to the template that can be found at Common Resources/Experiment Templates.
After opening your experiment folder and updating the experiment metadata spreadsheet, go to your project’s folder and create a document named “About”. This is a crucial step that will help us understand in the future which project this is. This document describes the entire project’s goal, achievements, sub-experiments, general structures and processes, all the researchers involved, and when and where it was published (if it was). Think of it as the back cover of the story of your work. This is what someone from the future needs in order to understand what you did and where everything is located, so they won’t need to bother you using the email you specified. If you write this well enough, everything should already be there. Note that though this seems like redundant work, it is absolutely valuable for the lab to function properly (and for you as well; there may come a time when you won’t remember what you did and where everything is, so write write write!).
Copy the “About” file from the template, and fill it to fit your project.
### 3.1.2.2 Experiment Sub Folders Structure
Now, you need to organize all the materials in your experiment. To do that, you must copy the experiment folder template.
This is how it looks, when you open the template folder:
![enter image description here][2]
Both template types (behavioral and EEG) contain the exact same general subfolders:
- *Analysis*: all the files related to any analysis or manipulation that was performed on the experiment data
- *Development*: all the files created while developing the experiment, for example, previous versions of the code, old stimuli.
- *Experiment*: everything one needs in order to run your experiment from A to Z
- *Paper*: all of the materials related to the experiment’s publishing efforts
- *Presentations*: all the materials from talks/posters of your experiment
- *Raw Data*: all of the experiment’s original raw subject data
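
A quick way to confirm that an experiment folder has all six subfolders listed above is sketched here. The helper name `missing_subfolders` is hypothetical; the folder names are taken from the list, and the check assumes they are spelled exactly as written there.

```python
from pathlib import Path

# Top-level subfolders every experiment folder is expected to contain.
REQUIRED_SUBFOLDERS = [
    "Analysis",
    "Development",
    "Experiment",
    "Paper",
    "Presentations",
    "Raw Data",
]


def missing_subfolders(experiment_dir: Path) -> list[str]:
    """Return the required subfolder names that are absent from experiment_dir."""
    return [
        name
        for name in REQUIRED_SUBFOLDERS
        if not (experiment_dir / name).is_dir()
    ]
```

An empty list means the top level matches the template; any returned names are the folders still to be created.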
We will now go over the content of the subfolders:
*Paper*
![enter image description here][3]
If you are working on submitting your work to journals, this folder should contain all things related to your publishing efforts. This folder’s sub-directories are:
- OSF: for details about OSF, see 3.3.
- Folders for each journal to which you submitted your paper (hopefully, it would only be one journal and the paper is accepted. But realistically, there will be more). Within each folder, have a folder for every round (first submission, revision 1, revision 2, etc.). Within each such subfolder, you should have the cover letter, the final draft (in Word), the final submission in that round (PDF generated by the submission system of the journal), a folder with the figures for that round (saved in the jpg/bmp/png format, but also in the editable format in which they were created (e.g., eps, ai, psd, ppt...)) and a folder with all drafts for that submission.
For the subfolders you have per journal to which you submitted, please add the suffix _ACCEPTED for the journal to which the paper was finally accepted. That way, we know that this is the most updated version of the final paper that was published. In that folder, you should also have another subfolder named “acceptance”, where you save the proofs and all forms you are asked to fill out upon acceptance. Also save the final PDF of the paper once it’s out.
*Presentations*
If you present your experiment in any forum, conference or talk, you should add it here:
Add your posters to: Presentations/Posters
(/Experiments/AG studies/Groovy/Presentations/Posters)
Add your slides to: Presentations/Talks
(/Experiments/AG studies/Groovy/Presentations/Talks)
*Experiment*
Whether you’re conducting a behavioral or EEG experiment, this subdirectory must contain everything one needs in order to run your experiment from scratch.
This directory contains the following document:
Experiment setup (file): an Excel table describing the exact technical parameters and conditions of the experiment (distances, computers, software, etc.)
![enter image description here][4]
This directory contains the following sub-folders:
- Experiment instructions: contains documentation of all the instructions given to subjects who participated in your experiment. The instructions should be written in a very clear and illustrative manner (you can use the lab notebook as your inspiration).
- RUN_ME: this folder is meant to be downloaded as-is, so that once it’s downloaded, the experiment can be executed from A to Z. This is also the folder that we will share with the scientific community once the paper is published, so that anyone who wishes to can reproduce our experiment.
For this to work, this folder must include:
- Code: a directory containing all the code files that are needed to run the experiment (note that your code should be written while already thinking about its future sharing, so make everything clean and nicely commented)
- Stimuli: a directory containing all the stimuli that are needed to run the experiment
- README: a text file (Word or any other format) in which you explain explicitly and step-by-step what the person who downloaded the RUN_ME folder should do to run your experiment.
Note that for that to work, you need to work by the organization protocol from the start - so that your code will run smoothly on this directory structure.
- *Running protocols*: this folder contains documentation of all the instructions that the experimenters running the experiment had to follow while running the experiment, as an SOP and a checklist (if there was more than one version - again all of them should be here, but it should be clear where and when each version was used)
- *Subject recruitment*: contains all the ads/messages you sent in order to get subjects to sign up for your experiment, including SONA explanations, Facebook group posts (images of cats), etc. This is an optional directory, for when you want to preserve these items for your next experiments (which can save you some time later on).
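
Before sharing the RUN_ME folder, its required contents (Code, Stimuli, and a README) can be checked with a few lines of Python. The following is a sketch under stated assumptions: the function name `run_me_problems` is hypothetical, and the README is accepted with any file extension, since the text above allows any format.

```python
from pathlib import Path


def run_me_problems(run_me: Path) -> list[str]:
    """Return a list of problems found in a RUN_ME folder (empty if none)."""
    if not run_me.is_dir():
        return ["RUN_ME folder not found"]
    problems = []
    for folder in ("Code", "Stimuli"):
        if not (run_me / folder).is_dir():
            problems.append(f"missing folder: {folder}")
    # Accept README, README.txt, README.docx, ... (assumption: any extension).
    has_readme = any(
        p.is_file() and p.stem.upper() == "README" for p in run_me.iterdir()
    )
    if not has_readme:
        problems.append("missing README file")
    return problems
```

This only verifies that the pieces exist. Whether the experiment actually runs from a fresh download still has to be tested by hand on a clean computer.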
*Raw Data:*
- Behavioral: a folder containing all the raw behavioral subject data. Even if your experiment is not behavioral (e.g., if this is an EEG experiment), we still care about behavior, so the behavioral data should be stored here.
- Eye Tracker: a folder containing all the raw eye-tracking data. Using the lab’s EyeLink eye-tracker, these are the original “.edf” files.
- EEG: a folder containing all the raw recording data - only the bdf files.
- Personality questionnaire: **If your experiment is in the field of unconscious processing, you should also conduct a personality questionnaire** (as described in the “[personality questionnaire][5]” section of the handbook). In this case, the responses you collect for this questionnaire should be stored here.
Of course, if there is an additional medium that produces a different format of raw data (imaging, skin-conductance sensors, etc), you should open a specific folder for this type of data and insert all you have in there.
*Analysis:*
- Analysis code: a folder containing all the analysis code and commands from all the relevant software used during the experiment (MATLAB, Python, R, SPSS, etc.). Make sure your code has standard documentation (comments accompanying functions, important variables, etc.), so that other people can go over it and understand what you did.
- Figures: a folder containing all the experiment’s figures as you plot them while analyzing the data (note that they will be further modified when you start writing your paper, and at this point they should be saved there, as explained above)
- Processed data: a folder containing the entire experiment data after the manipulations, cleanups, and pre-processing (in case such manipulation was made)
- Statistical analysis: a folder containing every analysis made on the data, and its results
- Analyzer: Relevant for EEG experiments only. This is the Analyzer folder you work with. It contains your workspace and relevant directories (History, Export, Templates; the Raw folder is actually the folder where you saved the raw data (see above); no need to save it twice), your personalized Analyzer protocols and your Analyzer log. You are expected to run your preprocessing locally (crucial for performance of the Analyzer program), and once you’ve finished (or preferably, periodically), sync this folder into your folder on the MudrikHub. Importantly, due to storage considerations, your History directory on the MudrikHub should not include cache files, which are extremely heavy and contain no data.
This folder should also contain an analysis log, in which you document every analysis you perform so you can monitor your progress, what you’ve done, etc. It should describe all the experiment’s analysis steps and all the manipulations performed on the data, interim conclusions, plans of action, etc.
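
Syncing the local Analyzer folder to the MudrikHub while leaving out the cache files could look roughly like this. It is a sketch only: `sync_without` is a hypothetical helper, and the exclusion patterns must be supplied by you, because the handbook does not name the cache files. A pattern such as `"*.cache"` is a placeholder, not a confirmed Analyzer file type.

```python
import shutil
from pathlib import Path


def sync_without(src: Path, dst: Path, exclude_patterns: list[str]) -> None:
    """Copy src into dst, skipping files and folders that match the patterns.

    Existing files in dst are overwritten with the local version.
    Requires Python 3.8+ for dirs_exist_ok.
    """
    shutil.copytree(
        src,
        dst,
        ignore=shutil.ignore_patterns(*exclude_patterns),  # glob-style patterns
        dirs_exist_ok=True,  # allow repeated syncs into the same hub folder
    )
```

Check in your own History directory which files are the heavy cache files before choosing the patterns, and run the sync on a small test copy first.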
### 3.1.2.3 Project Hierarchy
A project can be (and for the most part is) composed of many sub-projects and experiments. In this case, the top level of your project’s folder should contain the “About” document as well as the “Paper” folder (as the paper will refer to all the experiments that this project contains), and then a sub folder for each of the sub experiments. Importantly, in this case the “About” document should also describe the different sub projects, their relations, and who was responsible for what and when.
Each of these folders should stick to the structure described above, apart from two minor changes:
- They shouldn’t include an “About” document, **as all the relevant information for this sub experiment should already be contained in the general “About” document**
- They should not include a “Paper” folder, as all the relevant information is going to be in the general “Paper” folder.
You can find an example for the required structure in the Common Resources/Experiment Templates/Multiple experiment project.
![enter image description here][6]
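
The two rules for multi-experiment projects can be checked automatically. The sketch below uses hypothetical names (`has_about`, `hierarchy_problems`) and assumes that every top-level folder other than “Paper” is a sub-experiment, and that the “About” document may have any file extension.

```python
from pathlib import Path


def has_about(folder: Path) -> bool:
    """True if the folder directly contains a file named About (any extension)."""
    return any(p.is_file() and p.stem.lower() == "about" for p in folder.iterdir())


def hierarchy_problems(project_dir: Path) -> list[str]:
    """Return violations of the multi-experiment project layout (empty if none)."""
    problems = []
    if not has_about(project_dir):
        problems.append("top level: missing About document")
    if not (project_dir / "Paper").is_dir():
        problems.append("top level: missing Paper folder")
    # Assumption: every other top-level directory is a sub-experiment.
    sub_experiments = sorted(
        p for p in project_dir.iterdir() if p.is_dir() and p.name != "Paper"
    )
    for sub in sub_experiments:
        if has_about(sub):
            problems.append(f"{sub.name}: should not have its own About document")
        if (sub / "Paper").is_dir():
            problems.append(f"{sub.name}: should not have its own Paper folder")
    return problems
```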
## 3.1.3 Opening your Experiment Folder: General Checklist
1. Go to the MudrikHub/Experiments folder
2. Open a folder for your experiment/project and give it a short, informative name
3. Open the “experiment metadata” Excel file, and add a line for your experiment that fills in all the columns. Open an “About” document and record everything you’ve already done
4. Use the template in Common Resources / Experiment templates to create your folder.
5. Does your experiment involve EEG? If so, copy the EEG experiment sub folder template to your folder. If not, copy the behavioral sub folder template.
# 3.2 Pre-Prints
We sometimes decide to upload pre-prints of our manuscripts. The preferred preprint servers are [bioRxiv][14] and [PsyArXiv][15].
# 3.3 OSF
Once we finalize the experiment protocol, and after running a successful pilot, we preregister all our studies on the Open Science Framework (OSF) platform.
OSF is a web-based repository that scientists use to organize their research. OSF is free to use. To start working with OSF ([https://osf.io][16]), you should have the lab’s OSF username and password (you can ask for it from the lab manager).
## OSF experiment registration
Making a registration creates a frozen, time-stamped version of your project. Registrations cannot be changed, while projects can. Someone could remove data from a project, but not from a registration. Our registration protocol is as follows:
1. Complete the pre-registration template (that can be found in the OSF folder in the lab filestream; MudrikHub(Z:)\Common Resources\OSF)
2. Decide what will be the study’s name and write a brief description (two lines max).
3. Send these to Liad for her approval. Remember, once you register the file, no changes can be made.
4. Once you have Liad’s approval, go to https://osf.io and log in with the lab credentials. Then create a new project, upload the pre-registration form, and register it, as described in the following steps.
5. On the ‘Dashboard’ menu, press “create new project” and add the previously approved title and description of the study. Hit the ‘Create’ button. You should add yourself as a contributor (and others, if needed), but please make sure the only admin is Liad, so there won't be delays in approving the pre-registration.
6. Go to your project overview page, click on the ‘Files’ tab, and choose ‘Upload’ to upload the pre-registration form. Then, from your project overview page, choose the tab called ‘Registrations’ and select ‘New registration’.
7. You will receive a warning message saying this is an irreversible act. As we said, this creates a time-stamped, uneditable version of this project. Underneath you will find a drop-down that contains different options. Choose ‘OSF-Standard Pre-Data Collection Registration’.
8. On the “Registration Metadata” menu you will be asked to choose a license. Choose “CC-By Attribution 4.0 International”.
9. Usually we will answer with “no” to the following two questions: ‘Has data collection begun for this project?’ and ‘Have you looked at the data?’.
10. Under ‘Registration choice’ select the embargo option, and set the embargo to end one year from now.
11. If we would like to remove the embargo earlier than expected (usually in cases the experiment is submitted for publication and is about to be published), then we can do so by entering our project > Registrations > click on the specific project > Make Public > Confirm removing the embargo earlier.
You can find examples of pre-registration forms in the Common Resources folder (Z:\Common Resources\OSF).
# 3.4 Archiving Inactive Datasets
Before you leave, or upon completion of a project, you must archive old datasets and back them up. Note that this should be done only after Liad’s approval. Contact the lab manager about the exact file-system folders that can be archived. This way, we keep enough space free on our lab’s filestream account for current experiments and projects.
# 3.5 Leaving the lab protocol
*Before you leave the lab, there are a few things you should take care of. To make sure everything was done properly, please set a meeting with the lab manager, in which you will go over the protocol and get a confirmation that everything has indeed been done as expected.*
The research and the experiments conducted in the lab, the methods and the collected data, belong to the lab and should be well documented, clear and available for follow-up experiments and replications.
Before leaving the lab (whether when graduating, finishing a project or a position) you should:
a. Keep in the MudrikHub drive an updated and organized version of every project you had. If the project is finished, the lab manager will transfer it from the MudrikHub drive to the Archive drive. The folders should be arranged and sorted according to the lab data organization protocol, and should include (among other things):
- All the data you collected during the research
- All the code you used to collect and analyze the data, documented in a clear and organized way. A new person should be able to run the code files without problems or errors (e.g., mention the MATLAB version used to run the code, and the folders’ structure).
- A **detailed** ‘About’ file that describes the project, including who were the students leading it, changes that were made along the way etc.
b. An SOP file describing **exactly** how to run your experiment.
c. All the relevant information and the data you collected during your research should be saved solely in the project folder (as described above). You need to go over all the computers in the lab, including experimental computers, and all the **hard disks** in the lab, to make sure there are no folders of yours.
- If you found any information that is relevant to your research, transfer it to your project folder.
- If you wrote/created something that could be used as a common resource, please share it with the lab manager, so it will be added to the Lab Handbook or the ‘Common Resources’ folder. (Notice that you might be asked to edit the content to make sure it is clear.)
- Any other material **should be deleted**.
d. If you acquired a skill that would be relevant to others, e.g., worked with a specific software, make sure you write a “How To” file to share this knowledge. Inform the lab manager so it can be incorporated into the lab common resources.
e. If you ran an experiment with monetary reward - set an accounting meeting with the lab manager.
f. Make sure all of your data documentation is in line with the [subjects privacy protocol][17], including documenting the participants’ IDs in the lab’s shared file.
g. Organize your workstation, including:
- Make sure your personal lab computer does not have any irrelevant material on it (according to section c of this protocol)
- Sort and organize physical materials/tools you used in your research. Relevant and useful materials should be sorted and stored in a designated space after consulting with the lab manager. Irrelevant materials and equipment should be disposed of.
h. When you meet with the lab manager:
- Make sure sections a-g are completed
- If you borrowed some equipment from the lab, return it
- Return your key to the lab
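
For step c, going over a computer or hard disk by hand is error-prone, so a small search script may help locate leftover folders. This is a sketch under assumptions: `find_folders` is a hypothetical helper, and it only matches folders by exact name (case-insensitive), so folders you named differently will not be found.

```python
import os
from pathlib import Path


def find_folders(root: Path, names: set[str]) -> list[Path]:
    """Return all directories under root whose name matches one of the names.

    Matching is case-insensitive. Directories that cannot be read are
    silently skipped, which is the default behavior of os.walk.
    """
    wanted = {name.lower() for name in names}
    hits = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for dirname in dirnames:
            if dirname.lower() in wanted:
                hits.append(Path(dirpath) / dirname)
    return sorted(hits)
```

Run it once per computer and hard disk with the names of your projects, then move or delete what it finds as described in step c. A manual look is still worthwhile, since the script cannot recognize renamed folders.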
Attached is a checklist, for your convenience.
Thank you and good luck :)
## Lab leaving protocol - Checklist
☐ All my research projects are organized in the shared drive according to the data organization protocol:
☐ My folders include all the data that was collected
☐ All my code files are clear, comprehensive and work without problems/errors
☐ My folder includes a detailed “About” file
☐ If the project is finished, my project folder was moved from the MudrikHub drive to the Archive drive
☐ My experiments appear in the experiments meta-data file
☐ My participants’ IDs were updated in the shared lab file
☐ My consent forms were scanned and kept in the experiment folder
☐ I returned my experiment nylon folder to the lab manager
☐ I’ve created an SOP file for my experiments, and it is in _______________
☐ I went over all the lab computers and hard disks:
**Computers**
☐ Ron
☐ Dobby
☐ Lupin
☐ Neville
☐ Voldemort
☐ Mcgonagall
☐ 230 - Left
☐ 230 - Right
☐ 118A
☐ 118B
☐ Dumbledore
☐ Harry
☐ Snape
☐ 213A
☐ Trelawney
☐ Hagrid
☐ Hedwig
**Hard disks**
☐ Anna
☐ Elsa
☐ Kristoff
☐ Olaf
☐ Sven
☐ Hans
☐ Static
☐ Ben-El
☐ Saranga
☐ I’ve created a “How To” file for unique tools I used, and it was approved by Liad
☐ I had an accounting meeting with the lab manager for my experiment’s piggybank
☐ My workstation is clean and organized
☐ Any physical materials/tools I used were sorted and stored in a designated space
☐ I’ve set a meeting with the lab manager:
☐ coordinated the time
☐ If I borrowed equipment from the lab, I’ve returned it
☐ I’ve returned my key to the lab
☐ We went over the checklist
[1]: https://mfr.osf.io/export?url=https://osf.io/xb2ct/?direct%26mode=render%26action=download%26public_file=True&initialWidth=848&childId=mfrIframe&parentTitle=OSF+%7C+3.1.2.1.1.jpg&parentUrl=https://osf.io/xb2ct/&format=2400x2400.jpeg
[2]: https://mfr.osf.io/export?url=https://osf.io/7xa9t/?direct%26mode=render%26action=download%26public_file=True&initialWidth=848&childId=mfrIframe&parentTitle=OSF+%7C+3.1.2.1.2.jpg&parentUrl=https://osf.io/7xa9t/&format=2400x2400.jpeg
[3]: https://mfr.osf.io/export?url=https://osf.io/r85ad/?direct%26mode=render%26action=download%26public_file=True&initialWidth=848&childId=mfrIframe&parentTitle=OSF+%7C+3.1.2.1.3.jpg&parentUrl=https://osf.io/r85ad/&format=2400x2400.jpeg
[4]: https://mfr.osf.io/export?url=https://osf.io/t75j9/?direct%26mode=render%26action=download%26public_file=True&initialWidth=848&childId=mfrIframe&parentTitle=OSF+%7C+3.1.2.1.4.jpg&parentUrl=https://osf.io/t75j9/&format=2400x2400.jpeg
[5]: https://osf.io/5kfrc/wiki/Chapter%204:%20Running%20an%20Experiment/#45_Labs_personality_questionnaire_650
[6]: https://mfr.osf.io/export?url=https://osf.io/5jsdm/?direct%26mode=render%26action=download%26public_file=True&initialWidth=848&childId=mfrIframe&parentTitle=OSF+%7C+3.1.2.1.5.jpg&parentUrl=https://osf.io/5jsdm/&format=2400x2400.jpeg
[7]: https://files.osf.io/v1/resources/5kfrc/providers/osfstorage/602108ef1d985e010e743118?mode=render
[8]: https://files.osf.io/v1/resources/5kfrc/providers/osfstorage/60210a991d985e011474193e?mode=render
[9]: https://files.osf.io/v1/resources/5kfrc/providers/osfstorage/60210b66339cb60127ced54b?mode=render
[10]: https://files.osf.io/v1/resources/5kfrc/providers/osfstorage/602131201d985e011c744ddb?mode=render
[11]: https://files.osf.io/v1/resources/5kfrc/providers/osfstorage/602132711d985e0123742cc5?mode=render
[12]: https://files.osf.io/v1/resources/5kfrc/providers/osfstorage/602135201d985e01237435e7?mode=render
[13]: https://files.osf.io/v1/resources/5kfrc/providers/osfstorage/6021386f4497cf01249c232f?mode=render
[14]: http://biorxiv.org/
[15]: http://psyarxiv.com/
[16]: http://osf.io/
[17]: https://osf.io/5kfrc/wiki/Chapter%204:%20Running%20an%20Experiment/#431_Subject_Privacy_Protocol_420