Table of Contents
- Background
- Rationale
- Merging tools: ppmi_merger.py; IQT; Clinica
- Measuring tools: HCP SC GIF; NetMON; CBIG
- Modelling tools: KDE EBM; pySuStaIn; mechanistic-profiles; EuroPOND modelling software toolbox
This toolbox contains research code for Merging and harmonising data, Measuring in vivo brain circuit anatomy and activity, and Modelling neurodegenerative disease progression. We focus on tools that we developed (or helped to develop) and on those of our collaborators.
Ultimately we aim to provide a toolbox of research software (as opposed to code) that can facilitate (but not necessarily automate) your own end-to-end analyses of neurodegenerative diseases to reveal insight into disease biology/mechanisms and actionable information for medicine and healthcare.
These tools include scripts and pipelines for building circuit-based (a.k.a., network-based, connectome-based) quantitative signatures of disease progression from large Imaging Plus X multimodal data sets.
The toolbox was inspired by the EuroPOND modelling software toolbox. See also the didactic resources for Modelling provided by the new Disease Progression Modelling initiative.
We are interested in big, diverse data for maximising the generalisability and robustness of research findings. We therefore envisage including both high-quality data (typically from research studies) and the relatively low-quality data emerging from clinical acquisitions (such as those made in hospitals). Necessarily, this begins with an element of data harmonisation/cleaning, which we call Merging (including image quality improvement). Next, we provide Measuring tools for producing in vivo estimates of brain circuit anatomy/structure (e.g., connectomes) and activity/function. Finally, Modelling produces quantitative signatures of disease progression for unravelling the temporal and phenotypic heterogeneity seen in neurodegenerative diseases such as Alzheimer's and Parkinson's.
While developing the toolbox, we have the following sub-aims:
- Case Studies on publicly available data: provide useful research code, e.g., merging PPMI spreadsheets, that can facilitate/inspire others.
- Reproducibility and open science: provide transparency for our own analyses.
Tools to prepare large neuroimaging datasets for analysis, including harmonisation.
Expected to be useful for the following:
- Research analysing publicly available data from large neurodegenerative disease research data sharing initiatives such as ADNI, PPMI, AIBL, ADCS, OASIS, PDBP, PREVENT-AD, DIAN, etc.
- Neuroimaging research/analyses involving low-quality clinical data, in particular connectomic analyses using DTI
Data sharing is becoming the norm in neurodegenerative disease research, especially from studies supported by public resources. Large publicly available datasets typically provide neuroimaging data (either raw or pre-processed) plus a set of spreadsheets containing clinical data such as demographics, symptoms, and test scores from neuropsychological examinations. Very few public datasets provide a merged spreadsheet/table for multimodal analysis (ADNI being the obvious exception).
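As a rough illustration of the kind of merge described above, the sketch below joins per-visit tables on a subject identifier and a visit code, then attaches static demographics. It is not taken from ppmi_merger.py: the toy tables, the column names (`PATNO`, `EVENT_ID`, `MOTOR_TOTAL`, `COG_TOTAL`, `SEX`) and the helper `merge_visit_tables` are assumptions made for this example.

```python
import pandas as pd

# Hypothetical toy tables; all column names are assumptions for illustration.
demographics = pd.DataFrame({
    "PATNO": [1001, 1002, 1003],
    "SEX": ["F", "M", "F"],
})
motor_scores = pd.DataFrame({
    "PATNO": [1001, 1001, 1002],
    "EVENT_ID": ["BL", "V04", "BL"],
    "MOTOR_TOTAL": [12, 15, 20],
})
cognitive_scores = pd.DataFrame({
    "PATNO": [1001, 1002, 1002],
    "EVENT_ID": ["BL", "BL", "V04"],
    "COG_TOTAL": [28, 26, 25],
})


def merge_visit_tables(static, *visit_tables, id_col="PATNO", visit_col="EVENT_ID"):
    """Outer-join per-visit tables on (subject, visit), then add static columns."""
    merged = visit_tables[0]
    for table in visit_tables[1:]:
        # Outer join keeps visits that appear in only one of the tables.
        merged = merged.merge(table, on=[id_col, visit_col], how="outer")
    # Left join: static data is repeated on every visit row of a subject.
    merged = merged.merge(static, on=id_col, how="left")
    return merged.sort_values([id_col, visit_col]).reset_index(drop=True)


merged = merge_visit_tables(demographics, motor_scores, cognitive_scores)
print(merged)
```

The outer join is a deliberate choice in this sketch: a visit with a missing assessment shows up as a row with empty cells rather than disappearing, which keeps missing data visible for later modelling.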
- ppmi_merger.py is a python script to merge tables/spreadsheets of interest from the Parkinson's Progression Markers Initiative (PPMI) study.
- Image Quality Transfer is a machine learning tool for performing super-resolution of diffusion MRI data.
- Paper: Alexander et al., NeuroImage (2017)
- Code: IQT (MATLAB)
Image Quality Transfer (IQT) aims to bridge the technological gap that exists between bespoke and expensive experimental systems such as the Human Connectome Project (HCP) scanner and accessible commercial clinical systems using machine learning (ML). The technique learns mappings from low quality (e.g. clinical) to high quality (e.g. experimental) images exploiting the similarity of images across subjects, regions, modalities, and scales: image macro- and meso-structure is highly predictive of sub-voxel content. The mapping may then operate directly on low-quality images to estimate the corresponding high-quality images, or serve as a prior in an otherwise ill-posed image-reconstruction routine.
The current version provides a MATLAB implementation of IQT for super-resolution of diffusion tensor images (DTIs) using random forests (RFs).
Typical visualisation from test_rf.m illustrating results of 3x super-resolution with a 3x3x3 input patch on subject 117324. Note that no boundary completion was performed here.
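The patch-to-patch mapping idea behind IQT can be sketched in a few lines of Python. This is not the MATLAB implementation: it uses a synthetic scalar volume instead of diffusion tensor images, and it swaps the random forest for ordinary linear least squares to stay dependency-free. All function names here are hypothetical. Each 3x3x3 low-resolution neighbourhood is paired with the 3x3x3 high-resolution block underneath its centre voxel (3x super-resolution), and the learned mapping is compared against nearest-neighbour upsampling on a held-out volume.

```python
import numpy as np


def smooth_volume(shape, rng, n_waves=8):
    """Synthetic smooth 3D volume: a sum of random low-frequency sinusoids."""
    axes = [np.arange(s) for s in shape]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    vol = np.zeros(shape)
    for _ in range(n_waves):
        freq = rng.uniform(-0.25, 0.25, size=3)  # radians per voxel
        vol += np.sin(grid @ freq + rng.uniform(0, 2 * np.pi))
    return vol


def downsample(vol, k):
    """Block-average a volume by an integer factor k along each axis."""
    x, y, z = (s // k for s in vol.shape)
    blocks = vol[:x * k, :y * k, :z * k].reshape(x, k, y, k, z, k)
    return blocks.mean(axis=(1, 3, 5))


def make_pairs(high, k=3, r=1):
    """Pair low-res neighbourhoods with the high-res block under their centre."""
    low = downsample(high, k)
    inputs, targets = [], []
    for i in range(r, low.shape[0] - r):
        for j in range(r, low.shape[1] - r):
            for m in range(r, low.shape[2] - r):
                patch = low[i - r:i + r + 1, j - r:j + r + 1, m - r:m + r + 1]
                block = high[i * k:(i + 1) * k, j * k:(j + 1) * k, m * k:(m + 1) * k]
                inputs.append(patch.ravel())
                targets.append(block.ravel())
    return np.array(inputs), np.array(targets)


def fit_linear(inputs, targets):
    """Least-squares mapping with a bias term (stand-in for the random forest)."""
    design = np.hstack([inputs, np.ones((len(inputs), 1))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights


def predict(weights, inputs):
    design = np.hstack([inputs, np.ones((len(inputs), 1))])
    return design @ weights


rng = np.random.default_rng(0)
train_in, train_out = make_pairs(smooth_volume((36, 36, 36), rng))
test_in, test_out = make_pairs(smooth_volume((36, 36, 36), rng))

weights = fit_linear(train_in, train_out)
mse_learned = np.mean((predict(weights, test_in) - test_out) ** 2)

# Nearest-neighbour baseline: copy the centre low-res voxel into the whole block.
centre = test_in[:, [test_in.shape[1] // 2]]
mse_nearest = np.mean((centre - test_out) ** 2)
print(f"held-out MSE: learned {mse_learned:.5f}, nearest-neighbour {mse_nearest:.5f}")
```

A linear map cannot capture the nonlinear structure that motivates random forests in IQT; the point of the sketch is only the data layout (neighbourhood in, sub-voxel block out) and the train/apply split.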
Clinica is a general software platform for multimodal brain image analysis in clinical research studies, integrating a comprehensive set of processing tools for the main neuroimaging modalities: currently MRI (anatomical, functional, diffusion) and PET, with EEG/MEG planned for the future.
Tools to analyse neuroimaging data, with a focus on brain circuit/network analyses.
Third-party software is often involved. A future aim is to provide conda/Docker configurations.
- HCP Data (preprocessed images)
- MRtrix3 structural connectome using GIF parcellation.
- Code: HCP SC GIF
- ADNI/PPMI connectome pipeline (raw images)
- Includes the necessary preprocessing (but not IQT)
- Code: NetMON repository.
- Paper: Oxtoby et al., Frontiers in Neurology (2017)
- Thomas Yeo's Computational Brain Imaging Group tools
- fMRI preprocessing, etc.
- Code: CBIG
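For readers new to connectomes, the sketch below shows the data structure that the structural pipelines above ultimately produce: a symmetric region-by-region matrix of connection weights, here simple streamline counts. It does not call MRtrix3 or reproduce any of the listed pipelines. The input format (one pair of parcel indices per streamline) and the function name `build_connectome` are assumptions made for this example.

```python
import numpy as np


def build_connectome(endpoint_pairs, n_parcels):
    """Count streamlines between each pair of parcels (symmetric, zero diagonal)."""
    matrix = np.zeros((n_parcels, n_parcels), dtype=int)
    for a, b in endpoint_pairs:
        if a == b:
            continue  # self-connections are ignored in this sketch
        matrix[a, b] += 1
        matrix[b, a] += 1
    return matrix


# Hypothetical parcel indices at the two ends of six streamlines.
pairs = [(0, 1), (0, 1), (1, 2), (2, 3), (0, 3), (3, 3)]
sc = build_connectome(pairs, n_parcels=4)

# Two simple network summaries: node strength and edge density.
strength = sc.sum(axis=1)
n = sc.shape[0]
density = np.count_nonzero(np.triu(sc, k=1)) / (n * (n - 1) / 2)

print(sc)
print("strength:", strength)
print("density:", round(density, 3))
```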
Tools to analyse typically large neuroimaging datasets to understand disease progression.
Expected to be useful for the following:
- Patient stratification
- Clinical trial enrichment: decreasing variability (precision staging); identifying high-risk patients (prognostic enrichment); identifying likely responders (predictive enrichment).
- Discovery of data-driven disease progression subtypes
- And more
- The KDE EBM is an event-based model that uses kernel density estimation (KDE) mixture modelling under the hood. This generalises previous EBMs to allow direct inclusion of highly skewed data, such as clinical scores.
- Paper: Firth et al., Alzheimer's & Dementia (2020)
- Code: KDE EBM (python)
Toy examples of biomarker overlap (patients and controls) and non-Gaussianity that the KDE EBM can handle. From Firth et al., Alzheimer's & Dementia (2020).
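To make the event-based model idea concrete, here is a minimal, self-contained sketch. It is not the KDE EBM code and does not use its API. Two simplifications are assumed: the "normal" and "abnormal" densities are fitted by plain Gaussian KDE to labelled reference groups (the KDE EBM uses mixture modelling instead), and the best ordering is found by brute force over all permutations rather than by sampling. For an ordering of events, a subject at stage k is assumed to have the first k biomarkers abnormal and the rest normal, and the stage is marginalised with a uniform prior.

```python
import itertools

import numpy as np


def kde_pdf(samples):
    """1D Gaussian KDE with a normal-reference rule-of-thumb bandwidth."""
    samples = np.asarray(samples, dtype=float)
    bandwidth = 1.06 * samples.std() * len(samples) ** (-0.2)

    def pdf(x):
        z = (np.asarray(x, dtype=float)[:, None] - samples[None, :]) / bandwidth
        dens = np.exp(-0.5 * z ** 2).sum(axis=1)
        dens /= len(samples) * bandwidth * np.sqrt(2 * np.pi)
        return np.maximum(dens, 1e-12)  # floor to avoid log(0) later

    return pdf


def sequence_log_likelihood(p_normal, p_abnormal, sequence):
    """Log-likelihood of an event ordering, summing over stages (uniform prior)."""
    order = list(sequence)
    abn, nor = p_abnormal[:, order], p_normal[:, order]
    n_events = len(order)
    total = np.zeros(len(abn))
    for stage in range(n_events + 1):
        # The first `stage` events have occurred; the remaining ones have not.
        total += abn[:, :stage].prod(axis=1) * nor[:, stage:].prod(axis=1)
    return np.log(total / (n_events + 1)).sum()


rng = np.random.default_rng(42)
n_biomarkers = 3


def draw_normal(size):
    return rng.normal(0.0, 1.0, size)


def draw_abnormal(size):
    return 3.0 + rng.exponential(1.0, size)  # deliberately skewed


# Labelled reference groups used only to fit the two densities per biomarker.
controls = draw_normal((200, n_biomarkers))
end_stage = draw_abnormal((200, n_biomarkers))
pdf_normal = [kde_pdf(controls[:, i]) for i in range(n_biomarkers)]
pdf_abnormal = [kde_pdf(end_stage[:, i]) for i in range(n_biomarkers)]

# Mixed cohort simulated with true event order 0 -> 1 -> 2.
stages = rng.integers(0, n_biomarkers + 1, size=300)
occurred = stages[:, None] > np.arange(n_biomarkers)[None, :]
cohort = np.where(occurred, draw_abnormal(occurred.shape), draw_normal(occurred.shape))

p_normal = np.column_stack([pdf_normal[i](cohort[:, i]) for i in range(n_biomarkers)])
p_abnormal = np.column_stack([pdf_abnormal[i](cohort[:, i]) for i in range(n_biomarkers)])

scores = {
    seq: sequence_log_likelihood(p_normal, p_abnormal, seq)
    for seq in itertools.permutations(range(n_biomarkers))
}
best_sequence = max(scores, key=scores.get)
print("most likely event ordering:", best_sequence)
```

Brute-force search is only feasible for a handful of biomarkers, since the number of orderings grows factorially.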
Subtype and Stage Inference is an algorithm for discovery of data-driven groups or "subtypes" in chronic disorders.
Subtyping (left) and Staging (right) of ADNI data. From Young et al., Nature Communications (2018).
- pySuStaIn is the python implementation of SuStaIn, with the option to describe the subtype progression patterns using either an event-based model or a piecewise linear z-score model.
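The piecewise linear z-score idea can be illustrated as follows. This sketch does not use pySuStaIn and makes no claim about its API: the function name, its arguments and the example values are all hypothetical. It assumes a biomarker that starts at a z-score of 0, reaches a set of chosen z-score levels at given stages of a normalised disease timeline (0 to 1), ends at a maximum z-score, and is linearly interpolated in between.

```python
import numpy as np


def piecewise_linear_zscore(stage, event_stages, z_values, z_max):
    """Expected z-score at `stage` in [0, 1] for one biomarker.

    event_stages: increasing stages at which the biomarker reaches each
    z-score in z_values; z_max is the value reached at stage 1.
    """
    knots_x = np.concatenate([[0.0], event_stages, [1.0]])
    knots_y = np.concatenate([[0.0], z_values, [z_max]])
    return np.interp(stage, knots_x, knots_y)


# One hypothetical "subtype": biomarker A becomes abnormal early, B late.
stages = np.linspace(0.0, 1.0, 11)
biomarker_a = piecewise_linear_zscore(stages, [0.1, 0.3, 0.5], [1, 2, 3], z_max=5)
biomarker_b = piecewise_linear_zscore(stages, [0.6, 0.8, 0.9], [1, 2, 3], z_max=5)

for s, a, b in zip(stages, biomarker_a, biomarker_b):
    print(f"stage {s:.1f}: A z={a:.2f}, B z={b:.2f}")
```

A second subtype would simply use different event stages for the same biomarkers, which is what distinguishes subtype progression patterns in this simplified picture.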
A Topological Progression Profile (TPP) is a characteristic combination of topological descriptors that best describes the propagation of pathology in a particular disease. TPPs provide a method for understanding the relationship between brain circuits and spatiotemporal patterns of brain pathology in neurological diseases, at both the cohort and the individual level.
By combining data-driven disease progression modelling with learning a weighted combination of topological descriptors, TPPs explain observed pathology better than profiles estimated from end-stage data only, and better than single topological descriptors (as used when testing disease mechanism hypotheses independently: Zhou...Seeley, Neuron (2012)).
TPPs suggest new insights into the biological mechanisms underlying pathology propagation in neurological diseases.
Topological Progression Profiles in Alzheimer's, Primary Progressive Multiple Sclerosis, and Healthy Ageing. From Garbarino et al., eLife (2019).
- mechanistic-profiles is the MATLAB implementation of TPPs.
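The core idea of learning a weighted combination of topological descriptors can be sketched with synthetic data. This is not the mechanistic-profiles method: the two descriptors (node strength and proximity to a hypothetical epicentre), the random connectome, and the plain least-squares fit are assumptions chosen to keep the example short, and the progression-modelling component is omitted entirely.

```python
import numpy as np


def shortest_path_lengths(connectome):
    """All-pairs shortest paths (Floyd-Warshall), edge length = 1 / weight."""
    with np.errstate(divide="ignore"):
        dist = np.where(connectome > 0, 1.0 / connectome, np.inf)
    np.fill_diagonal(dist, 0.0)
    for k in range(len(dist)):
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    return dist


def topological_descriptors(connectome, epicentre):
    """Two illustrative per-region descriptors, each scaled to a maximum of 1."""
    strength = connectome.sum(axis=1)
    proximity = 1.0 / (1.0 + shortest_path_lengths(connectome)[epicentre])
    descriptors = np.column_stack([strength, proximity])
    return descriptors / descriptors.max(axis=0)


rng = np.random.default_rng(1)
n_regions = 20

# Random symmetric connectome with zero diagonal (synthetic, for illustration).
upper = np.triu(rng.random((n_regions, n_regions)), k=1)
connectome = upper + upper.T

descriptors = topological_descriptors(connectome, epicentre=0)

# Synthetic regional pathology generated from known weights plus small noise.
true_weights = np.array([0.7, 0.3])
pathology = descriptors @ true_weights + rng.normal(0.0, 0.005, n_regions)

# Recover the weighting of descriptors that best explains the pathology.
estimated_weights, *_ = np.linalg.lstsq(descriptors, pathology, rcond=None)
print("estimated descriptor weights:", np.round(estimated_weights, 3))
```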
Other Modelling tools available in the EuroPOND modelling software toolbox:
- EBM and pyEBM (Event Based Model): pathophysiological cascades
- LeaSP (LEArning Spatiotemporal Patterns): Disease Course (trajectory) Mapping
- Deformetrica: Brain Shape progression modelling and statistics on shapes and neuroimages
- DIVE (Data-driven Inference of Vertexwise Evolution): learn clusters of probabilistic trajectories of cumulative abnormality across the brain surface (cortex) over the full time course of a disease.
- GPPM (Gaussian process progression model): simultaneously learn group-level probabilistic trajectories and a reparameterised disease time (age + disease-related time-shift), over the full time course of a disease.