# Kvasir-Capsule

This is the official OSF repository for the Kvasir-Capsule dataset, which is the largest publicly released VCE dataset. In total, the dataset contains 47,238 labeled images and 117 videos, covering anatomical landmarks as well as pathological and normal findings. The result is more than 4,741,621 images and video frames altogether.

Some users experience problems with downloading the data from OSF. All data is also available here: and here as a zip file.

## Dataset Details

The dataset can be split into three distinct parts: labeled image data, labeled video data, and unlabeled video data. Each part is further described below.

**Labeled images**

In total, the dataset contains 47,238 labeled images stored in the PNG format. The images can be found in the images folder. The class of each image corresponds to the folder in which it is stored. For example, the 'polyp' folder contains all polyp images, and the 'Angiectasia' folder contains all images of Angiectasia. The number of images per class is not balanced, which is a common challenge in the medical field because some findings occur more often than others. This poses an additional challenge for researchers, since methods applied to the data should also be able to learn from a small amount of training data. The labeled images represent 14 different classes of findings. Furthermore, the labeled image data includes bounding box coordinates, which can be found in the *metadata.csv* file.

**Labeled videos**

The dataset contains a total of 43 labeled videos containing different findings and landmarks. This corresponds to approximately 19 hours of video.
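
As an aside on the labeled images described above: since each class is a folder of PNG files, the class imbalance can be inspected with a few lines of Python. The following is a minimal sketch and not part of the official repository. The function names `count_images_per_class` and `inverse_frequency_weights` are hypothetical, and the path passed in is assumed to be wherever the images folder was extracted locally. Inverse-frequency weighting is only one common way to compensate for imbalance and is used here purely for illustration.

```python
from collections import Counter
from pathlib import Path


def count_images_per_class(images_dir, pattern="*.png"):
    """Count image files in each class folder directly under images_dir.

    Assumes the layout described above: one sub-folder per class,
    with the folder name acting as the class label.
    """
    counts = Counter()
    for class_dir in sorted(Path(images_dir).iterdir()):
        if class_dir.is_dir():
            # Folder name is the label, e.g. "polyp" or "Angiectasia".
            counts[class_dir.name] = sum(1 for _ in class_dir.glob(pattern))
    return counts


def inverse_frequency_weights(counts):
    """Return total / (n_classes * class_count) for every non-empty class.

    Rare classes receive weights above 1, frequent classes below 1.
    """
    non_empty = {name: n for name, n in counts.items() if n > 0}
    total = sum(non_empty.values())
    n_classes = len(non_empty)
    return {name: total / (n_classes * n) for name, n in non_empty.items()}
```

Calling `count_images_per_class("images")` on an extracted copy would return one count per class folder, and passing that result to `inverse_frequency_weights` gives per-class weights that could be supplied to a weighted loss or sampler.
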
Each video has been manually assessed by a medical professional working in the field of gastroenterology, resulting in a total of 47,238 annotated findings and more than 2 million video frames that can be converted to images if needed.

**Unlabeled videos**

In total, the dataset contains 74 unlabeled videos, which corresponds to approximately 25 hours of video and 2,785,829 video frames.

**Terms of use**

The data is released fully open under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. All work, documents, and papers that use or refer to the dataset, or that report experimental results based on Kvasir-Capsule, must include a reference to the article describing the dataset:
