Scripts for Managing fMRI Data


Goals

This document describes a suite of scripts that take an SPM batch file created for one subject and generate equivalent batch files for many subjects. These batch files are then submitted to the cluster via bsub or sbatch, which removes the need for the GUI and speeds up analyzing (or reanalyzing) many subjects at once. To use the scripts, create your desired batch (preproc or level1) in the SPM interface and use the ‘save as script’ option when saving the batch. This creates a file called batchname_job.m, which serves as the template for the scripts below.

Prerequisites

System requirements: All scripts must be run on the cluster. They were tested using Matlab 2012a with SPM version 8.4667 and Python version 2.7.3. Older versions of Matlab are unsupported, and the scripts are known to fail with Matlab 2007. To access the scripts, use the default environment specified on the FAQ site (http://cbs.fas.harvard.edu/science/core-facilities/neuroimaging/information-investigators/faq). Please also see the FAQ for detailed information on how to select your Matlab and SPM versions.

Directory structure: The scripts rely on the following directory structure, which is created for you automatically by the first script, getSubjectSPM.py.

|-mystudydir   
|---120418_mysubject
|------RAW
|---------dicom files...
|------analysisdir
|---------paradigms
|------------run001,run002,...
|---------------cond1.txt, cond2.txt...
|---------spm files from first level analysis...
|------batch
|---------spm batches...
|------preproc
|---------converted dicoms, preprocessed data...
|------output_files
|---------files containing error messages from scripts
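
Before running the later scripts, it can help to confirm that a subject directory matches this layout. The following is a minimal sketch, not part of the script suite: `missing_subdirs` is a hypothetical helper, and the default analysis directory name `analysis` is taken from the getSubjectSPM.py defaults described below.

```python
import os

# Subdirectories expected directly inside each subject directory,
# following the tree shown above.
EXPECTED_SUBDIRS = ["RAW", "batch", "preproc", "output_files"]


def missing_subdirs(subject_dir, analysis_dirs=("analysis",)):
    """Return the expected subdirectories that are absent from subject_dir."""
    expected = list(EXPECTED_SUBDIRS)
    for analysis in analysis_dirs:
        expected.append(analysis)
        expected.append(os.path.join(analysis, "paradigms"))
    return [d for d in expected
            if not os.path.isdir(os.path.join(subject_dir, d))]
```

An empty list means every expected directory is present; otherwise the list names what is missing, relative to the subject directory.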

Summary

The following scripts have been written and should already be on your default path if you are using the default environment listed on the FAQ site:

Script Name	Details
`getSubjectSPM.py`	Pulls data using `cbsget` and creates a directory structure
`genPreprocBatches.py`	Generates preprocessing batches from template batch and executes batches
`genL1Batches.py`	Generates level 1 batches from template and executes them

`getSubjectSPM.py`

getSubjectSPM.py is a script that pulls data from the CBSCentral repository (or a directory), converts it to SPM format (img or nii), and creates a directory structure that will allow the use of the other scripts described here.

It takes the following inputs in any order. For help, type getSubjectSPM.py -h:

getSubjectSPM.py -s SUBJECTID -b BOLDRUNS -t STRUCTRUNS -m FIELDMAP -p DESTINATIONPATH -d DICOMPATH -n USENIFTI

Flag	Stands for	Description
-h	help	shows usage message and exits
-s	subjectid	the name of the subject
-b	boldruns	a list of run numbers for the BOLD scans, separated by spaces
-t	structrun	the run number for the structural scan
-m	fieldmap runs	the run numbers for the fieldmap; if present, needs two runs
-p	destination path	the FULL path to the location to place the output, no relative path (i.e. no ../ or ~)
-d	dicom path	the path to a single directory containing all of the unzipped dicoms (as opposed to using CBSCentral)
-n	use nifti	if present, uses the single-file NIFTI format (nii) instead of img/hdr
--analysis-dir	analysis directory	a list of directories for future analyses. Default is analysis. Useful if you collected multiple experiments in one session.
--vmem	virtual memory	amount of memory requested; required if using sbatch; default is 1024MB
--time	time	amount of time the script will need; required if using sbatch; default is 2 hours
--run-with	run with	what to execute the script with; choices are bsub, sbatch, or dry (dry just outputs the call to getsubject.m); default is sbatch
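
The requirement that the destination path be a full path can be checked before submitting a job. This is a minimal sketch, assuming POSIX-style paths as used on the cluster; `is_full_path` is a hypothetical helper and is not part of getSubjectSPM.py.

```python
import posixpath


def is_full_path(path):
    """True if path is absolute and contains no '~' or '..' components."""
    parts = path.split("/")
    return posixpath.isabs(path) and "~" not in parts and ".." not in parts
```

With this check, `/ncf/mylab/myspace/myexp/` is accepted, while `../myexp`, `~/myexp`, and `/ncf/mylab/../myexp` are rejected.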

For example:

~getSubjectSPM.py -s 120418_spmtest -b 6 10 11 -t 3 -m 6 7 -p /ncf/mylab/myspace/myexp/ -n --analysis-dir analysis_univariate analysis_mvpa --run-with bsub~

To use DICOMs from a local directory instead of CBSCentral (make sure the path to the DICOMs has no trailing /):

~getSubjectSPM.py -s 120418_spmtest -b 6 10 11 -t 3 -m 6 7 -p /ncf/mylab/myspace/myexp/ -d /ncf/mylab/myspace/myexp/dicoms/subj1~

This creates the following directory tree:

|-myexp   
|----120418_spmtest
|-------RAW
|-------analysis
|----------paradigms
|-------------run001,run002,...
|-------batch
|-------preproc
|-------output_files

Within the RAW directory is a tarball (subjectid.tar.gz) containing the compressed DICOMs. The preproc directory contains the SPM-converted files, either .nii or .img and .hdr.

The files have also been renamed. Because they are already in the subject directory, the subject ID has been stripped from their names, and they are renamed as follows:

File name	Description
f-run001-006.img	Image 6 of the first BOLD run
s-struct.img	The structural image for the subject
s-fieldmap-mag-01.img	The magnitude of the fieldmap (if provided)
s-fieldmap_phase.img	The phase of the fieldmap
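
If you need to pick out runs or images programmatically, the BOLD naming scheme can be parsed with a regular expression. This is a sketch only: it assumes the three-digit zero padding seen in the single example above, and `parse_bold_name` is a hypothetical helper.

```python
import re

# Matches renamed BOLD images such as "f-run001-006.img" (or .hdr / .nii).
# Assumption: run and image numbers are always zero-padded to three digits.
BOLD_NAME = re.compile(r"^f-run(\d{3})-(\d{3})\.(img|hdr|nii)$")


def parse_bold_name(filename):
    """Return (run, image) as integers, or None if the name does not match."""
    match = BOLD_NAME.match(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

For example, `parse_bold_name("f-run001-006.img")` gives `(1, 6)`, while structural and fieldmap names return `None`.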

Errors

If there is a problem with the script, the output goes to the screen (standard out) for debugging. The most likely issues are a missing config file for cbsget (see FAQ), wrong run numbers for your BOLD runs, or a subject with the same name as the data you are trying to unpack already existing.

`genPreprocBatches.py`

The goal of this script is to take a batch file created to perform preprocessing on a single subject and use it to analyze many subjects. This is done by saving your batch via the ‘save as script’ command in SPM. This creates a batchname_job.m file, which will serve as your template batch. This batch will be applied to all of the subjects provided, which can include the original subject that was used to create the template. This script has been tested with fieldmap, slice time correction, motion correction, indirect spatial normalization, and smoothing. If you use any additional steps, you should check that the generated batches are correct by comparing the ones created to the original.

genPreprocBatches.py -t TEMPLATE -p PATH -s SUBJECT1 SUBJECT2
or
genPreprocBatches.py -t TEMPLATE -p PATH -f SUBJECTFILE

Flag	Stands for	Description
-h	help	provides usage message and then exits
-t	template batch	the full path to, and name of the template batch created in the SPM GUI via a “save batch as script” command, that ends in _job.m
-p	path	the path to the directory that contains all of your subjects
-s	subjid	a subjid to create and execute the batch on, can be a list separated by spaces
-f	subject file	a file containing your subjectids, with each ID on its own line, which can be used instead of -s flag

For example:

~genPreprocBatches -t /ncf/mylab/myspace/myexp/subject1/batch/preproc_job.m -p /ncf/mylab/myspace/myexp/ -s subject1 subject2~

This will create a batch file for each subject provided and save it in subjid/batch. It then submits the created batch with bsub. You can check that your submitted jobs are running via the bjobs command (see FAQ for instructions).
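
Conceptually, applying a template to many subjects amounts to rewriting the subject-specific parts of the template text. The sketch below illustrates the idea under a strong assumption: that the template refers to the original subject only through its ID in file paths. Both function names are hypothetical, and the real scripts may do considerably more than this.

```python
def read_subject_file(path):
    """Read subject IDs from a file with one ID per line, skipping blanks."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def make_subject_batch(template_text, template_subject, subject):
    """Return the template batch text with the template subject's ID swapped.

    Assumption: a plain text substitution is enough. This breaks if one
    subject ID is a prefix of another (e.g. subject1 and subject10).
    """
    return template_text.replace(template_subject, subject)
```

Because the substitution is purely textual, it is worth comparing a generated batch against the template, as recommended above for any preprocessing step outside the tested set.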

Errors

If there is a problem with converting the template batch for each subject, the error messages will be placed in the study directory mystudy, with the name errors_preproc followed by the date and time (to the min).

For example: errors_preproc2012_07_06_10h_23m

The output from the running of the batch (that comes via the bsub output) will be stored in subjid/output_files, with the name output_preproc followed by the date and time (to the min). This is where errors thrown by matlab or SPM will show up.

For example: output_preproc2012_06_20_11h_41m
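
The date and time suffix in these names follows a fixed pattern, which is handy to know when searching for the log of a particular run. A minimal sketch that reproduces the pattern from the examples above; `log_name` is a hypothetical helper, not a function from the scripts.

```python
from datetime import datetime


def log_name(prefix, when=None):
    """Build a name such as 'errors_preproc2012_07_06_10h_23m'."""
    when = when or datetime.now()
    # Year_month_day_hour'h'_minute'm', matching the examples in this document.
    return prefix + when.strftime("%Y_%m_%d_%Hh_%Mm")
```

Since the timestamp only goes to the minute, two runs started within the same minute would produce the same name.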

`genL1Batches.py`

The goal of this script is to take a batch file created to perform first level analysis on a single subject and use it to analyze many subjects. This is done by saving your batch via the ‘save as script’ command in SPM. This creates a batchname_job.m file, which will serve as your template batch. This batch will be applied to all of the subjects provided, which can include the original subject that was used to create the template. To run this script, you need to have your paradigm files constructed.

Creating batch

There are a few quirks in how you can create your level1 batch:

- Don’t use the ‘replicate Subject/Session’ option in fMRI model specification.
- The names you use for your conditions must match the names of the text files containing your stimulus onset values (see below), so don’t put spaces in the names.
- When you make your contrasts in Contrast Manager, you can use either the T- and F-contrasts or the T-contrast (cond/sess based) options. Do not use the Replicate option. The cond/sess method is preferred, as it is harder to make errors. However, you will still need to build your F-contrast by hand.

Running script

genL1Batches -t TEMPLATE -p PATH -s SUBJECT1 SUBJECT2
or
genL1Batches -t TEMPLATE -p PATH -f SUBJECTFILE

Flag	Stands for	Description
-h	help	provides usage message and then exits
-t	template batch	the full path to, and name of the template batch created in the SPM GUI via a “save batch as script” command, that ends in _job.m
-p	path	the path to the directory that contains all of your subjects
-s	subjid	a subjid to create and execute the batch on, can be a list separated by spaces
-f	subject file	a file containing your subjectids, with each ID on its own line, which can be used instead of -s

For example:

~genL1Batches -t /ncf/mylab/myspace/myexp/subject1/batch/L1_job.m -p /ncf/mylab/myspace/myexp/ -s subject1 subject2~

This will create a batch file for each subject provided and save it in subjid/batch. It then submits the created batch with bsub. You can check that your submitted jobs are running via the bjobs command (see FAQ for instructions).

Stimulus onset files

Within the analysis directory is a paradigms directory, with a directory for each run (e.g. run001). For the first level analysis, each condition should have its own onset text file, with each row being a single onset time. The name of the file should be the name given to the condition within the SPM batch, followed by the .txt extension (e.g. cond1.txt). Therefore, if you have 3 runs, you will end up with three text files for cond1. They will all be called cond1.txt, but placed in separate run directories: run001, run002, and run003. If your stimulus is presented 4 times per run, then each of those files will have 4 rows, with each row giving the time in seconds (or TRs, depending on what you specify in your batch) when your stimulus was presented. These files can be created in Matlab or any text editor.
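
The onset files can also be generated with a short script. The following is a sketch under the layout described above (runNNN directories containing one .txt file per condition, one onset per row); `write_onsets` and the shape of its input are assumptions made for illustration.

```python
import os


def write_onsets(paradigm_dir, onsets_by_run):
    """Write one <condition>.txt per run, with one onset per row.

    onsets_by_run maps a run number to {condition_name: [onset, ...]}.
    Returns the list of files written.
    """
    written = []
    for run, conditions in sorted(onsets_by_run.items()):
        run_dir = os.path.join(paradigm_dir, "run%03d" % run)
        if not os.path.isdir(run_dir):
            os.makedirs(run_dir)
        for name, onsets in sorted(conditions.items()):
            path = os.path.join(run_dir, name + ".txt")
            with open(path, "w") as handle:
                for onset in onsets:
                    handle.write("%s\n" % onset)
            written.append(path)
    return written
```

For instance, `write_onsets(paradigm_dir, {1: {"cond1": [0, 12.5, 30, 48]}})` writes run001/cond1.txt with four rows. The condition names passed in must match the names used in the SPM batch.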

Errors

If there is a problem with converting the template batch for each subject, the error messages will be placed in the study directory mystudy, with the name errors_L1 followed by the date and time (to the min).

For example: errors_L12012_07_06_10h_23m

The output from the running of the batch (that comes via the bsub output) will be stored in subjid/output_files, with the name output_L1 followed by the date and time (to the min). This is where errors thrown by matlab or SPM will show up.

For example: output_L12012_06_20_11h_41m

`genArtFiles.py`

The goal of this script is to set up files and parameters for rerunning your level1 analysis with ART. Currently, the global_mean type is hard coded to be type 1, or standard, and the motion_file_type is set to 0, for an SPM .txt file.

genArtFiles -p PATH -s SUBJECT1 SUBJECT2 -gt GLOBALTHRESHOLD -mt MOTIONTHRESHOLD -g DIFFGLOBAL -m DIFFMOTION -n NORMS
or
genArtFiles -p PATH -f SUBJECTFILE -gt GLOBALTHRESHOLD -mt MOTIONTHRESHOLD -g DIFFGLOBAL -m DIFFMOTION -n NORMS

Flag	Stands for	Description
-h	help	provides usage message and then exits
-p	path	the path to the directory that contains all of your subjects
-s	subjid	a subjid to create and execute the batch on, can be a list separated by spaces
-f	subject file	a file containing your subjectids, with each ID on its own line, which can be used instead of -s
-gt	global mean threshold	threshold for excluding outliers, in stdev away from the mean
-mt	motion threshold	threshold for excluding outliers, in mm of movement
-g	global diff	1=yes, 0=no; whether to ‘Use Differences’ for the global mean threshold
-m	motion diff	1=yes, 0=no; whether to use movement differences rather than absolute movement from the first time point
-n	use norms	1=combine all movement directions (linear and angular), 0=no

For example:

~genArtFiles -p /ncf/mylab/myspace/myexp/ -s subject1 subject2 -gt 2 -mt .5 -g 0 -m 0 -n 1~

This will create a new directory called art_analysis at the same level as the original analysis directory. This directory will contain several files needed for ART or created by ART: art_config001.cfg, art_exec001.m, art_mask.hdr/img, art_mask_temporalfile.mat, and SPM_stats_file. It will also create new regression files for regressing out outliers with or without motion (art_regression_outliers_swrf-run001-001.mat or art_regression_outliers_and_movement_swrf-run001-001.mat). There will be one of each for every run.
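
To make the meaning of the global mean threshold and the ‘Use Differences’ option concrete, here is a simplified illustration of thresholding in standard deviations. It is not ART's actual computation; `flag_outliers` is a hypothetical function written only to show what the two settings control.

```python
def flag_outliers(global_means, threshold, use_diff=False):
    """Return indices of scans more than `threshold` SDs from the mean.

    With use_diff=True, scan-to-scan differences are thresholded instead,
    and index i refers to the change from scan i-1 to scan i.
    """
    values = list(global_means)
    offset = 0
    if use_diff:
        values = [b - a for a, b in zip(values, values[1:])]
        offset = 1
    n = len(values)
    if n == 0:
        return []
    mean = sum(values) / float(n)
    sd = (sum((v - mean) ** 2 for v in values) / float(n)) ** 0.5
    if sd == 0:
        return []
    return [i + offset for i, v in enumerate(values)
            if abs(v - mean) / sd > threshold]
```

A lower threshold flags more scans as outliers. Thresholding differences highlights sudden changes between consecutive scans rather than scans that sit far from the overall mean.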

Errors

If there is a problem creating the files for ART, the error messages will be placed in the study directory mystudy, with the name errors_ART followed by the date and time (to the min).

For example: errors_ART_2012_07_06_10h_23m

The output from the running of the batch (that comes via the bsub output) will be stored in subjid/output_files, with the name output_ART followed by the date and time (to the min). This is where errors thrown by matlab or SPM will show up.

For example: output_ART2012_06_20_11h_41m

`genL1ArtBatches.py`

The goal of this script is to take a batch file created to perform first level analysis using ART outlier exclusion on a single subject and use it to analyze many subjects. The usage and output are the same as for genL1Batches, except that the output goes into the art_analysis directory.

Running script

genL1ArtBatches -t TEMPLATE -p PATH -s SUBJECT1 SUBJECT2
or
genL1ArtBatches -t TEMPLATE -p PATH -f SUBJECTFILE

Flag	Stands for	Description
-h	help	provides usage message and then exits
-t	template batch	the full path to, and name of the template batch created in the SPM GUI via a “save batch as script” command, that ends in _job.m
-p	path	the path to the directory that contains all of your subjects
-s	subjid	a subjid to create and execute the batch on, can be a list separated by spaces
-f	subject file	a file containing your subjectids, with each ID on its own line, which can be used instead of -s

Errors

If there is a problem with converting the template batch for each subject, the error messages will be placed in the study directory mystudy, with the name errors_L1ART followed by the date and time (to the min).

For example: errors_L1ART2012_07_06_10h_23m

The output from the running of the batch (that comes via the bsub output) will be stored in subjid/output_files, with the name output_L1ART followed by the date and time (to the min). This is where errors thrown by matlab or SPM will show up.

For example: output_L1ART2012_06_20_11h_41m

Acknowledgments

These scripts were written by Alex Storer, Caitlin Carey and Stephanie McMains with additional assistance from David Dodell-Feder.
