Estimating longitudinal analytic flexibility in the UK Biobank with genetic and imaging phenotypes

Poster No:

1939 

Submission Type:

Abstract Submission 

Authors:

Brent McPherson1, Shadi Zabad1, Yue Li1, Jean-Baptiste Poline1

Institutions:

1McGill University, Montreal, Quebec

First Author:

Brent McPherson, PhD  
McGill University
Montreal, Quebec

Co-Author(s):

Shadi Zabad  
McGill University
Montreal, Quebec
Yue Li  
McGill University
Montreal, Quebec
Jean-Baptiste Poline  
McGill University
Montreal, Quebec

Introduction:

A key goal of clinical neuroscience is to reliably map the health trajectory of an individual at risk of developing a disease. Unfortunately, models developed on cross-sectional data are increasingly failing to reproduce the trends present in longitudinal samples [1,2]. Further, recruiting a sufficient number of individuals to a study is out of reach for all but the largest labs [1]. These challenges have led modern neuroscience to increasingly form large, collaborative, and longitudinally sampled population cohorts that capture comprehensive multimodal phenotypes of individuals [3,4]. The transition to these large population samples has changed how the quality of an individual observation is assessed. Many large datasets, like the UK Biobank (UKB), have begun generating a series of image-derived phenotypes (IDPs) and inspecting their distributions for outliers to determine an observation's quality. To evaluate the analytic flexibility of these IDPs, we have reproduced many of them using an equivalent pipeline. Additionally, we explore improving these summaries by incorporating longitudinal observations and additional genetic data as part of the evaluation [5,6].

Methods:

Our goal is to evaluate the analytic flexibility of the IDPs by replicating them with an equivalent pipeline. We then explore their longitudinal stability and how integrating genetic features into their evaluation may improve their interpretation. The data used for the original quality assessments are available as part of the IDPs released by the UKB. The UKB currently has genetic information available for the majority of the sample (~488k), neuroimaging data at time one for roughly half of the participants who will eventually be scanned (~48k), and an initial release of neuroimaging data at time two (~4k). Using TractoFlow [7], we have produced many of the same IDPs distributed by the UKB. Additionally, we incorporate VIPRS [8], a summary-statistics-based polygenic risk score method, to estimate the genetic features associated with change in the IDPs over time.
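
As an illustration only, the two quantities combined in this step can be sketched as follows: a polygenic score as a weighted sum of allele dosages, and the change in an IDP between two imaging visits. This is a minimal sketch under stated assumptions, not the pipeline used here: it does not call VIPRS or TractoFlow, the function names are hypothetical, the per-variant effect sizes are assumed to have been estimated already, and annualizing the change by the time between visits is an assumption rather than something stated in the abstract. All values are toy numbers.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], effect_sizes: Sequence[float]) -> float:
    """Weighted sum of allele dosages (0, 1, or 2 copies) by per-variant effect size.

    The effect sizes are assumed to come from a summary-statistics-based
    model; estimating them is outside the scope of this sketch.
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


def annualized_change(idp_t1: float, idp_t2: float, years_between: float) -> float:
    """Change in one IDP between two visits, per year (an assumed definition)."""
    if years_between <= 0:
        raise ValueError("years_between must be positive")
    return (idp_t2 - idp_t1) / years_between


# Hypothetical toy values for one participant and three variants.
toy_effect_sizes = [0.5, -0.25, 0.125]
toy_dosages = [2, 1, 0]
score = polygenic_score(toy_dosages, toy_effect_sizes)

# Hypothetical IDP values at time one and time two, 2.5 years apart.
change = annualized_change(0.5, 0.375, 2.5)
```

In this framing, each participant with both imaging visits contributes one change value per IDP, and the score is the genetic predictor evaluated against it.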

Results:

The comparison of the IDPs reveals that the majority of the features are quite similar (R² > 0.90). This indicates that these features are largely stable and suitable for use in quality assessment. To further explore the quality of these features, we have modeled each IDP with VIPRS to estimate the genetic contribution to these features. The VIPRS model takes GWAS summary statistics and estimates a polygenic risk score (PRS) for an individual imaging phenotype. A key outcome of this analysis is an estimate of the heritability of each IDP and of the variance explained by its PRS prediction. Additionally, we specifically modeled the change over time of the IDPs to emphasize the genetic contributions associated with longitudinal changes. We believe this will better emphasize the genetic features associated with disease by highlighting their presence in individuals with greater change.
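
The agreement check described above can be sketched as follows, assuming that the reported R² is the squared Pearson correlation between the original and the replicated values of one IDP across participants (the abstract does not define it). The function name and the toy values are hypothetical. The same quantity, computed between a PRS and an IDP, is one common way to express variance explained.

```python
from math import fsum
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation between two equally long samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = fsum(x) / n
    mean_y = fsum(y) / n
    sxy = fsum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = fsum((a - mean_x) ** 2 for a in x)
    syy = fsum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r_squared is undefined for a constant sample")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical toy values: one IDP for five participants, as released
# and as reproduced by a second pipeline.
original_idp = [0.40, 0.45, 0.50, 0.55, 0.60]
replicated_idp = [0.41, 0.44, 0.51, 0.56, 0.59]

agreement = r_squared(original_idp, replicated_idp)
is_similar = agreement > 0.90  # threshold taken from the text above
```

Applying the function to every IDP gives one agreement value per feature, and the fraction of features above the threshold summarizes how closely the two pipelines agree.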

Conclusions:

Large population samples are key to facilitating the reliable translation of research findings to clinical practice. Maximizing the information available to researchers across the modalities of these samples will improve the assessment and validation of data within these databases. We have worked to replicate the existing UKB IDPs and have begun exploring their longitudinal stability. Further, we evaluate the performance of VIPRS in estimating polygenic risk scores for these longitudinal IDPs. Modeling more complex, longitudinal IDPs with VIPRS estimates should better target features that reliably capture the variability in our samples. Following this approach, we can better generalize findings to new samples and make better predictions in clinical cohorts. This new information can expand our discussion beyond the generalizability of the features to begin exploring the most effective multimodal features for validating our data.

Genetics:

Genetic Association Studies

Lifespan Development:

Aging

Modeling and Analysis Methods:

Methods Development 1

Neuroinformatics and Data Sharing:

Databasing and Data Sharing

Novel Imaging Acquisition Methods:

Multi-Modal Imaging 2

Keywords:

Computational Neuroscience
Informatics
Machine Learning
Modeling
MRI
Open-Source Software
Phenotype-Genotype
Statistical Methods
STRUCTURAL MRI
WHITE MATTER IMAGING - DTI, HARDI, DSI, ETC

1|2 Indicates the priority used for review

References:

Argiris, G., et al. (2021). Quantifying age-related changes in brain and behavior: a longitudinal versus cross-sectional approach. Eneuro, 8(4).
Sha, Z., et al. (2023). Genetic architecture of the white matter connectome of the human brain. Science Advances, 9(7).
Poldrack, R. A., et al. (2017). Scanning the horizon: towards transparent and reproducible neuroimaging research. Nature reviews neuroscience, 18(2), 115-126.
Bycroft, C., et al. (2018). The UK Biobank resource with deep phenotyping and genomic data. Nature, 562(7726), 203-209.
Smith, S. M., Douaud, G., Chen, W., Hanayik, T., Alfaro-Almagro, F., Sharp, K., & Elliott, L. T. (2021). An expanded set of genome-wide association studies of brain imaging phenotypes in UK Biobank. Nature neuroscience, 24(5), 737-745.
Yang, X., et al. (2023). Developing and sharing polygenic risk scores for 4,206 brain imaging-derived phenotypes for 400,000 UK Biobank subjects not participating in the imaging study. medRxiv, 2023-04.
Theaud, G., et al. (2020). TractoFlow: A robust, efficient and reproducible diffusion MRI pipeline leveraging Nextflow & Singularity. Neuroimage, 218, 116889.
Zabad, S., et al. (2023). Fast and accurate Bayesian polygenic risk modeling with variational inference. The American Journal of Human Genetics, 110(5), 741-761.