## **From dozens to thousands: important lessons when scaling up structural MRI processing using CAT** ##
**Felix Hoffstaedter**, Research Centre Jülich, Institute of Neuroscience and Medicine
**Relevant work:** The FAIRly big [journal article][1]
**Software requirements:** [DataLad][2], [Computational Anatomy Toolbox][3], [FAIRly Big Workflow][4], [SPM][5], [ENIGMA toolbox][6]
**AOMIC datasets:** [ID1000][7], [PIOP1][8], [PIOP2][9]
**Modalities:** T1w
**Slides:** [found here][10]
A typical MRI processing pipeline built for 20–50 subjects is rarely feasible with 200–1,000 subjects. This tutorial starts with a theoretical component explaining why an explicit planning phase, provenance tracking, and automated quality control (QC) are mandatory when working with large datasets, and why containers and a job scheduler are highly desirable. It then demonstrates the DataLad-based (www.datalad.org) Computational Anatomy Toolbox (CAT) preprocessing pipeline on an open-data example, and concludes by discussing important aspects of large statistical group models in CAT.
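As a rough sketch of what a provenance-tracked, containerized processing step looks like with DataLad: the commands below clone an AOMIC dataset without downloading all content, fetch a single T1w image, and run a containerized command while DataLad records inputs, outputs, and the exact invocation in the dataset history. The OpenNeuro GitHub mirror URL, the container name `cat`, and the `cat_standalone.sh` entry point are illustrative assumptions, not prescribed by the tutorial materials; adapt them to your own setup.

```shell
# Clone the raw dataset; file content is not downloaded yet (git-annex),
# which keeps the clone cheap even for thousands of subjects.
# (Repository URL is an assumed OpenNeuro mirror of AOMIC ID1000.)
datalad clone https://github.com/OpenNeuroDatasets/ds003097.git aomic-id1000
cd aomic-id1000

# Fetch only the file(s) a given job actually needs.
datalad get sub-0001/anat/sub-0001_T1w.nii.gz

# Run the containerized pipeline step; DataLad saves the command,
# its inputs, and its outputs into the dataset history (provenance).
# Requires the datalad-container extension and a registered container
# named "cat"; the entry-point command is a placeholder.
datalad containers-run \
  --container-name cat \
  --input sub-0001/anat/sub-0001_T1w.nii.gz \
  --output derivatives/cat/sub-0001 \
  "cat_standalone.sh {inputs} {outputs}"
```

In the FAIRly big workflow, many such jobs run in parallel under a scheduler, each in a throwaway clone, and only the results and their provenance records are pushed back to a central dataset.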
[1]: https://doi.org/10.1038/s41597-022-01163-2
[2]: http://datalad.org
[3]: http://www.neuro.uni-jena.de/cat/
[4]: https://github.com/psychoinformatics-de/fairly-big-processing-workflow
[5]: https://www.fil.ion.ucl.ac.uk/spm/
[6]: https://enigma-toolbox.readthedocs.io/en/latest/
[7]: https://openneuro.org/datasets/ds003097
[8]: https://openneuro.org/datasets/ds002785/versions/2.0.0
[9]: https://openneuro.org/datasets/ds002790/versions/2.0.0
[10]: https://slides.com/felix_h/building-from-dozens-to-thousands-important-lessons-when-scaling-up-structural-mri-processing-using-cat-15a81e/fullscreen