# Image set info
The training and test images came from the [THINGS database][things].
## Training images partition
### Data directory structure
The training images partition has 1,654 object concepts, with 10 images per concept (for a total of 16,540 image conditions), and is organized in the following directory structure:
```
/training_images
│
└───00001_aardvark
│    └───aardvark_01b.jpg
│    └───aardvark_02s.jpg
│    └───aardvark_03s.jpg
│    └───aardvark_04s.jpg
│    └───aardvark_05s.jpg
│    └───aardvark_06s.jpg
│    └───aardvark_07s.jpg
│    └───aardvark_08s.jpg
│    └───aardvark_09s.jpg
│    └───aardvark_10s.jpg
│
└───00002_abacus
│    └───abacus_01b.jpg
│    └───abacus_02s.jpg
│    ...
│    └───abacus_10s.jpg
│
...
│
└───01654_zucchini
     └───zucchini_01b.jpg
     └───zucchini_02s.jpg
     ...
     └───zucchini_10s.jpg
```
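Given the layout above, the sorted enumeration of training images (the same order used below to match EEG trials) can be sketched in Python with `pathlib`; the function name is hypothetical:

```python
from pathlib import Path

def list_training_images(root):
    """Return all training image paths, sorted first by concept folder
    (00001_aardvark ... 01654_zucchini) and then by image filename."""
    root = Path(root)
    return [img
            for concept in sorted(root.iterdir()) if concept.is_dir()
            for img in sorted(concept.glob("*.jpg"))]
```

For the full partition this yields 1,654 concepts × 10 images = 16,540 paths.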
### How to match the training images with the corresponding training EEG trials
#### Raw EEG training data
The stimulus channel of the [raw EEG training data][raw_eeg] contains 16,540 different image condition event types, ranging from 1 to 16,540. Each event is linked to its training image condition through a nested loop, with the outer loop iterating over the 1,654 sorted object concepts and the inner loop iterating over the 10 sorted images per concept. For example, the EEG trials with `event=1` correspond to the image `/00001_aardvark/aardvark_01b.jpg`, the trials with `event=2` to the image `/00001_aardvark/aardvark_02s.jpg`, the trials with `event=11` to the image `/00002_abacus/abacus_01b.jpg`, and so on.
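This nested-loop mapping reduces to integer arithmetic. A minimal sketch (the function name is hypothetical):

```python
def event_to_indices(event):
    """Map a 1-based event code (1..16540) to 1-based
    (concept number, image number within concept)."""
    concept = (event - 1) // 10 + 1
    image = (event - 1) % 10 + 1
    return concept, image

# event=1  -> (1, 1):  /00001_aardvark/aardvark_01b.jpg
# event=11 -> (2, 1):  /00002_abacus/abacus_01b.jpg
```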
#### Preprocessed EEG training data
The [preprocessed EEG training data][prepr_eeg] is organized into a 4-dimensional `.npy` array. The first dimension of the array is the image condition dimension, and therefore has a length of 16,540. Each element of the first dimension is linked to its training image through a nested loop, with the outer loop iterating over the 1,654 sorted object concepts and the inner loop iterating over the 10 sorted images per concept. For example, the EEG trials indexed in Python with `preprocessed_eeg_training[0,:,:,:]` correspond to the image `/00001_aardvark/aardvark_01b.jpg`, the trials indexed with `preprocessed_eeg_training[1,:,:,:]` to the image `/00001_aardvark/aardvark_02s.jpg`, the trials indexed with `preprocessed_eeg_training[10,:,:,:]` to the image `/00002_abacus/abacus_01b.jpg`, and so on.
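Because these indices are 0-based, the same mapping is a single `divmod`; a sketch with a hypothetical function name:

```python
def index_to_indices(i):
    """Map a 0-based first-dimension index (0..16539) of the preprocessed
    training array to 0-based (concept index, image-within-concept index)."""
    return divmod(i, 10)

# i=0  -> (0, 0): first image of the first concept (aardvark_01b.jpg)
# i=10 -> (1, 0): first image of the second concept (abacus_01b.jpg)
```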
## Test images partition
### Data directory structure
The test images partition has 200 object concepts, with 1 image per concept (for a total of 200 image conditions), and is organized in the following directory structure:
```
/test_images
│
└───00001_aircraft_carrier
│    └───aircraft_carrier_06s.jpg
│
└───00002_antelope
│    └───antelope_01b.jpg
│
...
│
└───00200_wok
     └───wok_12s.jpg
```
### How to match the test images with the corresponding test EEG trials
#### Raw EEG test data
The stimulus channel of the [raw EEG test data][raw_eeg] contains 200 different image condition event types, ranging from 1 to 200. Each event is linked to its test image condition following the numerical indices of the test object concept subfolders. For example, the EEG trials with `event=1` correspond to the image `/00001_aircraft_carrier/aircraft_carrier_06s.jpg`, the trials with `event=2` to the image `/00002_antelope/antelope_01b.jpg`, and so on.
#### Preprocessed EEG test data
The [preprocessed EEG test data][prepr_eeg] is organized into a 4-dimensional `.npy` array. The first dimension of the array is the image condition dimension, and therefore has a length of 200. Each element of the first dimension is linked to its test image condition following the numerical indices of the test object concept subfolders. For example, the EEG trials indexed in Python with `preprocessed_eeg_test[0,:,:,:]` correspond to the image `/00001_aircraft_carrier/aircraft_carrier_06s.jpg`, the trials indexed with `preprocessed_eeg_test[1,:,:,:]` to the image `/00002_antelope/antelope_01b.jpg`, and so on.
## Metadata: image concepts
The training and test image concepts, their mapping to the original THINGS concepts order, and the image file names are found in the file `image_metadata.npy`, a Python dictionary with the following keys:
* `train_img_concepts`: list of strings containing the concept names of the 16,540 training images, ordered alphabetically and prefixed with zero-padded numbers ranging from 1 to 1,654.
* `train_img_files`: list of strings containing the filenames of the 16,540 training images.
* `train_img_concepts_THINGS`: list of strings containing the concept names of the 16,540 training images, ordered alphabetically and prefixed with the original THINGS concept numbers, which range from 1 to 1,854.
* `test_img_concepts`: list of strings containing the 200 test image concept names, ordered alphabetically and prefixed with zero-padded numbers ranging from 1 to 200.
* `test_img_files`: list of strings containing the filenames of the 200 test images.
* `test_img_concepts_THINGS`: list of strings containing the 200 test image concept names, ordered alphabetically and prefixed with the original THINGS concept numbers, which range from 1 to 1,854.
As an example, the first test image (with filename `aircraft_carrier_06s.jpg`) is labeled `00001_aircraft_carrier` in the `test_img_concepts` variable, and `00010_aircraft_carrier` in the `test_img_concepts_THINGS` variable, indicating that it corresponds to image concept 10 of the THINGS database.
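Since `image_metadata.npy` stores a pickled Python dictionary, loading it requires `allow_pickle=True`. A sketch, with the load call commented out because the file path depends on where the dataset is stored, and a small helper for recovering the THINGS concept number from a label (the helper name is hypothetical):

```python
import numpy as np

def things_concept_number(label):
    """Extract the original THINGS concept number from a *_THINGS
    label such as '00010_aircraft_carrier'."""
    return int(label.split("_", 1)[0])

# metadata = np.load("image_metadata.npy", allow_pickle=True).item()
# things_concept_number(metadata["test_img_concepts_THINGS"][0])  # aircraft carrier
```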
Additional metadata, such as the 27 high-level image categories, can be found in the [THINGS database][things_osf].
## Additional info
[Here][colab] you can find an interactive Colab tutorial on how to load and visualize the stimulus images and link them to the corresponding EEG data.
[things]: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0223792
[raw_eeg]: https://osf.io/crxs4/
[prepr_eeg]: https://osf.io/anp5v/
[things_osf]: https://osf.io/xtafs/
[colab]: https://colab.research.google.com/drive/1i1IKeP4cK3ViscP4b4kNOVo4kRoL8tf6?usp=sharing