Code related to “A common computational and neural anomaly across mouse models of autism”, by Noel et al. Questions should be referred to jpn5@nyu.edu.
This README documents how to download and process the behavioral and neural data. We host everything on OSF, rather than GitHub, so that we can additionally upload the summary data.
Code to reproduce the figures is given within this directory and should be self-explanatory (each script is labeled according to the figure it produces).
Code to fit behavioral models is available here: https://github.com/int-brain-lab/ibl-changepoint
Code to fit encoding models is available here: https://github.com/BalzaniEdoardo/PGAM
Code for demixed PCA is available here: https://github.com/machenslab/dPCA
Code for Bayesian Adaptive Direct Search (BADS) is available here: https://github.com/lacerbi/bads
Code for cbrewer is available here: https://www.mathworks.com/matlabcentral/fileexchange/58350-cbrewer2
Code for Spikes is available here: https://github.com/cortex-lab/spikes
Code for NPY reading with MATLAB is available here: https://github.com/cortex-lab/npy_matlab
If any of these techniques are used, please cite the original sources.
Current pipeline
1. Download session data. We provide 4 lists of sessions. Under the folder labeled data_behavior, there are: behavior_list_CSP, behavior_list_FMR, behavior_list_NYU, and behavior_list_SH. These list the animal name and session date. With this information, you can download all the data associated with a particular session using get_behavior_all.m. This script has been redacted so as not to include password information; you will need to request a generic password from the IBL. An alternative is to download the data using ONE, following the IBL documentation here: https://int-brain-lab.github.io/ONE/
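For convenience, here is a minimal MATLAB sketch of looping over one of the session lists. It assumes the lists are .mat files containing per-session animal names and dates; the field names 'subject' and 'date' below are placeholders, so adapt them to the actual contents of the list files.

    % Minimal sketch (not a replacement for get_behavior_all.m): loop over one
    % of the session lists and build the identifiers needed to fetch each session.
    listFile = fullfile('data_behavior', 'behavior_list_NYU.mat');   % assumed .mat format
    L = load(listFile);                      % inspect contents with: whos('-file', listFile)
    for k = 1:numel(L.subject)               % assumed field: one entry per session
        subjectName = L.subject{k};          % animal name (assumed field name)
        sessionDate = L.date{k};             % session date, e.g. 'yyyy-mm-dd' (assumed field name)
        fprintf('Session %d: %s on %s\n', k, subjectName, sessionDate);
        % download / process this session here (see get_behavior_all.m, or the
        % IBL ONE documentation linked above)
    end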
2. Behavioral summary data can be created using code/pre-processing/behavior_step1.m (for training times) and code/pre-processing/behavior_step2.m (for the rest). The summary data are also directly provided here: data_behavior/summary_training_times.mat and data_behavior/summary_behavior.mat.
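If you only need the provided summaries, they can be loaded directly; the variable names inside are not documented here, so inspect the files first.

    % Inspect and load the provided summary files
    whos('-file', fullfile('data_behavior', 'summary_behavior.mat'))
    behaviorSummary = load(fullfile('data_behavior', 'summary_behavior.mat'));
    trainingTimes   = load(fullfile('data_behavior', 'summary_training_times.mat'));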
3. Neural summary data can be created using code/pre-processing/neurons_step1.m. This script does all the heavy lifting. The other scripts (neurons_step2.m and neurons_step3.m) simply coalesce neurons across layers (though layer information is maintained within the files) and prune the data to keep only areas with a representative number of neurons.
Running code/pre-processing/neurons_step1.m will result in .mat files organized as follows (a usage sketch follows the field descriptions below):
Each .mat is for a given session/animal.
In the naming of files, "VISp", for example, indicates that the data come from the primary visual area of the mouse, as defined by the Allen Institute CCF (Common Coordinate Framework).
The name of the animal indicates whether it's a control animal (NYU) or a 'mouse model of autism' (SH, CSP, or FMR).
Each .mat has a dat structure. This contains:
-b.choice: -1 if choice left, 1 if choice right, 0 if no choice
-b.probLeft: The experimentally imposed prior, takes value .5, .2, or .8, meaning there is either a 50%, 20% or 80% chance the stimuli will appear on the left.
-b.contrastL: Contrast on the left side (1, 0.25, 0.125, 0.0625, 0, or NaN). NaN if the Gabor appears on the other side; 0 if the contrast is null but this side is still the 'correct' one given the prior.
-b.contrastR: As above, for the right
-b.t: structure with all the timings:
- feedback: auditory tone + water if correct
- response: when the response was logged (when the Gabor reached the center of the screen)
- movement: first movement, can be thought of as reaction time
- go cue: I believe this is when the stimulus appeared (given the contrast is not zero), but I may be wrong
- go cue trigger: same as above; one is software-defined and the other is hardware-defined.
- Stim Off: self explanatory
- Stim On: self explanatory
- b.feedback: '1' if correct trial, '-1' if incorrect
- b.reward_volume: how much water was delivered, in microL
- b.estimated_prior: This is the estimated prior (i.e., "subjective prior") of the animal. The models are akin to these: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006681
- b.params: parameters of the best fitting behavioral model
- "w" (for "wheel") - Here you have more 'analog' data about the wheel position. Fields are self explanatory
- "c" (for "camera"). We have 3 cameras (at different sampling rates), either looking at the animal from the top ("body camera") or left or right (aiming at the paws and face). For each of dat.c.left or .right or .body you will find:
.info.VariablesNames will give you the body parts we are tracking, as output by DLC (DeepLabCut). Here are some examples:
"nose_tip_x"
"nose_tip_y"
"nose_tip_likelihood"
"pupil_top_r_x"
"pupil_top_r_y"
"pupil_top_r_likelihood"
"pupil_right_r_x"
"pupil_right_r_y"
"pupil_right_r_likelihood"
"pupil_bottom_r_x"
"pupil_bottom_r_y"
"pupil_bottom_r_likelihood"
"pupil_left_r_x"
"pupil_left_r_y"
"pupil_left_r_likelihood"
"paw_l_x"
"paw_l_y"
"paw_l_likelihood"
"paw_r_x"
"paw_r_y"
"paw_r_likelihood"
"tube_top_x"
"tube_top_y"
"tube_top_likelihood"
"tube_bottom_x"
"tube_bottom_y"
"tube_bottom_likelihood"
"tongue_end_l_x"
"tongue_end_l_y"
"tongue_end_l_likelihood"
"tongue_end_r_x"
"tongue_end_r_y"
"tongue_end_r_likelihood"
.d: this will give you the 'data', in the same order as .info.VariablesNames (time x variable name)
.t: time
- "probe", this will give you meta data about the probe, x (medio lateral), y (anterior posterior), z (depth location), etc. Not super important as we have this info per neuron as well.
- "n" (for "neuron").
This will be a 1 x N structure, with N being the number of neurons recorded from simultaneously.
- .st: spike times, on the same time axis as the behavioral events (stim on, etc.)
- .id: an ID for the particular neuron, as given by Kilosort (the spike-sorting algorithm)
- .amp: mean amplitude of the neuron
- .fr: mean firing rate (computed by the sorting algorithm before any epoching, so it can differ from what you compute once the data are epoched)
- .brain_region: a bit more detailed than what you get from the file name, as it includes the layer
- .brain_region_id: ID in the Allen CCF; doesn't really matter
- .x, .y, .z: location of the specific electrode that picked up that neuron.
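To make the structure above concrete, here is a minimal MATLAB usage sketch for one session file. The file name and the stimulus-onset field name are placeholders (assumptions, not taken from the scripts), so check the actual names in your files.

    % Minimal usage sketch for one session file produced by neurons_step1.m.
    d   = load('example_session_VISp.mat');   % placeholder file name
    dat = d.dat;

    % Behavior: proportion correct and choice counts
    fprintf('Proportion correct: %.2f\n', mean(dat.b.feedback == 1));
    fprintf('Left / right / no-choice: %d / %d / %d\n', ...
        sum(dat.b.choice == -1), sum(dat.b.choice == 1), sum(dat.b.choice == 0));

    % Neurons: spike counts around stimulus onset for the first neuron
    stimOn = dat.b.t.stimOn;                  % assumed field name; check dat.b.t
    win    = [-0.2 0.5];                      % window around stimulus onset, in s
    st     = dat.n(1).st;
    counts = zeros(numel(stimOn), 1);
    for tr = 1:numel(stimOn)
        counts(tr) = sum(st >= stimOn(tr) + win(1) & st <= stimOn(tr) + win(2));
    end
    fprintf('Mean spike count in window: %.2f (neuron %d, %s)\n', ...
        mean(counts), dat.n(1).id, dat.n(1).brain_region);

    % Camera / DLC: pull the nose x-position from the left camera
    vars  = dat.c.left.info.VariablesNames;   % field name as documented above
    noseX = dat.c.left.d(:, strcmp(vars, 'nose_tip_x'));
    tCam  = dat.c.left.t;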
4. For dPCA, run code/pre-processing/dPCA_step1.m.
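dPCA_step1.m handles the actual preparation; purely as an illustration of the kind of trial-averaged input demixed PCA operates on, a neurons x prior-condition x time array could be built from the dat structure like this (reusing dat and stimOn, with the same assumed field names as in the sketch above):

    % Rough illustration only (dPCA_step1.m is the real pipeline).
    edges  = -0.2:0.02:0.5;                       % 20-ms bins around stimulus onset
    priors = [0.2 0.5 0.8];
    nBins  = numel(edges) - 1;
    X = nan(numel(dat.n), numel(priors), nBins);  % neurons x prior condition x time
    for c = 1:numel(priors)
        trials = find(dat.b.probLeft == priors(c));
        for iN = 1:numel(dat.n)
            psth = zeros(1, nBins);
            for tr = trials(:)'
                rel  = dat.n(iN).st - stimOn(tr); % spike times relative to stim onset
                psth = psth + histcounts(rel, edges);
            end
            X(iN, c, :) = psth / numel(trials) / 0.02;   % mean firing rate in Hz
        end
    end
    % X (or its single-trial counterpart) is the kind of input the dPCA code
    % linked above expects; see that repository's documentation for details.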
5. For pGAM, we generate a list of fits to be undertaken and provide a .m function to format the data: code/pre-processing/GAM_get_list.m and GAM_Step1.m, respectively. The actual fits are run via SLURM on the NYU cluster; see the pGAM repository linked above.
6. For pupil diameter, run code/pre-processing/Pupil_step1.m and then code/pre-processing/Pupil_step2.m.
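Those scripts do the real processing; purely as an illustration, a rough pupil diameter can be read off the DLC pupil markers described above (this is not necessarily the method used in Pupil_step1.m, and it reuses the dat structure loaded in the sketch above):

    % Illustration only: rough pupil diameter from the four DLC pupil markers on
    % the left camera, dropping frames where any marker has a low DLC likelihood
    % (the 0.9 threshold is arbitrary).
    vars = dat.c.left.info.VariablesNames;
    col  = @(name) dat.c.left.d(:, strcmp(vars, name));
    dVert = hypot(col('pupil_top_r_x')  - col('pupil_bottom_r_x'), ...
                  col('pupil_top_r_y')  - col('pupil_bottom_r_y'));
    dHorz = hypot(col('pupil_left_r_x') - col('pupil_right_r_x'), ...
                  col('pupil_left_r_y') - col('pupil_right_r_y'));
    diam  = (dVert + dHorz) / 2;              % in pixels
    lik   = [col('pupil_top_r_likelihood'),  col('pupil_bottom_r_likelihood'), ...
             col('pupil_left_r_likelihood'), col('pupil_right_r_likelihood')];
    diam(any(lik < 0.9, 2)) = NaN;
    tCam  = dat.c.left.t;                     % camera time base, for plotting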
7. For pGAM results, run code/utils/get_gam_result.m and code/utils/get_gam_results_by_prior.m