This dataset is a subset of Ecoset (Mehrer et al., PNAS 2021), created to allow rapid prototyping of neural networks while still providing many images per class (for both training and testing) and a diverse set of classes.
The images are 64 x 64 pixels RGB.
There are 100 classes in the dataset. Each class has four splits: 2350 training, 50 validation, 50 test, and 250 "testplus" images.
The test images come from the Ecoset test set, whereas "testplus" consists of those test images plus 200 images from the Ecoset train set that are not in the MiniEcoset train set. This split was created for in-depth analysis of networks trained on MiniEcoset.
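As a quick sanity check, the per-class counts above can be multiplied out to get the total size of each split. The sketch below also estimates the raw in-memory size of each split, assuming (this is not stated above) that pixels are stored as 8-bit unsigned integers. The split labels and the names `IMAGES_PER_CLASS`, `split_totals` and `split_bytes` are illustrative and need not match the keys in the .h5 file.

```python
# Per-class image counts, taken from the description above.
# The dictionary keys are descriptive labels, not necessarily HDF5 keys.
IMAGES_PER_CLASS = {
    "training": 2350,
    "validation": 50,
    "test": 50,
    "testplus": 250,
}
NUM_CLASSES = 100

# Assumption: one byte per channel (uint8), 64 x 64 pixels, 3 channels (RGB).
BYTES_PER_IMAGE = 64 * 64 * 3


def split_totals():
    """Return the total number of images in each split across all classes."""
    return {name: count * NUM_CLASSES for name, count in IMAGES_PER_CLASS.items()}


def split_bytes():
    """Return the raw size in bytes of each split under the uint8 assumption."""
    return {name: total * BYTES_PER_IMAGE for name, total in split_totals().items()}


for name, total in split_totals().items():
    size_gib = split_bytes()[name] / 2**30
    print(f"{name}: {total} images, about {size_gib:.2f} GiB as uint8")
```

Under these assumptions the training split is the only one that takes more than a few hundred megabytes, which is worth keeping in mind before reading it into memory in one go.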
There are 20 classes each from the following superordinate categories:
- natural-animate-mammals (e.g. human, cats)
- natural-animate-rest (e.g. reptiles, fish)
- natural-inanimate (e.g. plants, fruits)
- artificial-small (e.g. camera, pizza)
- artificial-large (e.g. bicycle, piano)
Classes are arranged in the same order as the superordinate categories listed above.
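Because the classes follow the order of the superordinate categories, with 20 classes each, a class index can be mapped to its superordinate category by integer division. The following is a minimal sketch, assuming that class indices are 0-based and follow the order of the stored class list; `superordinate_of` is a hypothetical helper, not part of the dataset.

```python
# Superordinate categories in the order listed above.
SUPERORDINATE_CATEGORIES = [
    "natural-animate-mammals",
    "natural-animate-rest",
    "natural-inanimate",
    "artificial-small",
    "artificial-large",
]
CLASSES_PER_CATEGORY = 20
NUM_CLASSES = CLASSES_PER_CATEGORY * len(SUPERORDINATE_CATEGORIES)


def superordinate_of(class_index):
    """Map a class index to its superordinate category.

    Assumption: indices are 0-based, so classes 0-19 belong to the first
    category, 20-39 to the second, and so on.
    """
    if not 0 <= class_index < NUM_CLASSES:
        raise IndexError(f"class index {class_index} is outside 0..{NUM_CLASSES - 1}")
    return SUPERORDINATE_CATEGORIES[class_index // CLASSES_PER_CATEGORY]


print(superordinate_of(0))
print(superordinate_of(57))
```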
The fasttext embeddings (Mikolov et al. LREC 2018) for the 100 classes are also included.
The .h5 file can be found here. You can access the data inside it in Python as follows:
import h5py
# dataset_path is the path to the downloaded .h5 file
with h5py.File(dataset_path, "r") as f:
    # list the top-level keys to explore further
    print(list(f.keys()))
    # an example of extracting data
    train_images = f['train']['data'][()]
    # an example of extracting all classes
    classes = f['categories'][()]
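h5py commonly returns string datasets as `bytes` objects rather than `str`, so the class names read from `f['categories']` may need decoding before they can be printed or compared with ordinary strings. Whether this applies to this particular file is an assumption. The helper below, `decode_class_names`, is a hypothetical sketch that handles both cases.

```python
def decode_class_names(raw_names, encoding="utf-8"):
    """Return class names as a list of str.

    Entries that are bytes (including NumPy byte strings, which subclass
    bytes) are decoded; anything else is converted with str().
    """
    return [
        name.decode(encoding) if isinstance(name, bytes) else str(name)
        for name in raw_names
    ]


# Example with a mix of bytes and str entries, standing in for the
# array read from the file.
example = [b"cat", "piano"]
print(decode_class_names(example))
```

With the snippet above, this would be called as `decode_class_names(classes)`.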
<hr>
If you use MiniEcoset, please cite the paper that introduced it:
Thorat, Sushrut, Adrien Doerig, and Tim C. Kietzmann. "Characterising representation dynamics in recurrent neural networks for object recognition." arXiv preprint arXiv:2308.12435 (2023).