An Association Test for Integrating Transcriptome and Genomic Data
Combining expression quantitative trait loci (eQTL) information across tissues can improve the ability to identify potential disease-predisposing genes and help elucidate gene regulation mechanisms. The MultiXcan method fits, for each gene, a multivariate regression on the principal components of its predicted expression across multiple tissues. However, because each gene contributes only a small amount to a complex trait, this approach is likely to be inefficient when the sample size is limited. Here we propose a method that integrates the predicted expression of all genes across different tissues to detect gene associations. Principal component analysis is first performed on the predicted expression of each gene across tissues. The Lasso is then used to select the principal components that are relevant to the phenotype. Finally, a multivariate linear regression model is fitted to the selected principal components, and the association between each gene and the phenotype is assessed with a Wald test followed by false discovery rate (FDR) correction. Numerical results show that our method outperforms MultiXcan in most cases, suggesting that integrating gene expression data across tissues does improve the ability to identify associated genes. In particular, the sample size our method requires to achieve high power is much smaller than that required by MultiXcan.
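The steps described above (per-gene PCA, Lasso selection, joint regression, gene-level Wald test, FDR correction) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names (`gene_pcs`, `lasso_cd`, `chi2_sf`, `bh_adjust`, `joint_gene_test`), the penalty `lam`, and the variance threshold `var_kept` are all hypothetical. The sketch further assumes that the PCs kept per gene are those explaining a fixed share of variance, that the Wald statistic is referred to a chi-square distribution, that FDR control uses the Benjamini-Hochberg procedure, and that genes with no selected component receive a p-value of 1.

```python
import math
import numpy as np

def gene_pcs(expr, var_kept=0.95):
    """PC scores of one gene's predicted expression (samples x tissues), via SVD."""
    centered = expr - expr.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(explained, var_kept)) + 1  # assumed cut-off rule
    return u[:, :k] * s[:k]

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso by cyclic coordinate descent: (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X**2).sum(axis=0) / n
    resid = y.copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]           # remove column j's contribution
            rho = X[:, j] @ resid / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid -= X[:, j] * beta[j]           # add the updated contribution back
    return beta

def chi2_sf(x, df):
    """Chi-square survival function for integer df (closed-form recurrence)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        q, k = math.exp(-x / 2), 2
    else:
        q, k = math.erfc(math.sqrt(x / 2)), 1
    while k < df:
        q += math.exp((k / 2) * math.log(x / 2) - x / 2 - math.lgamma(k / 2 + 1))
        k += 2
    return q

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    ranked = p[order] * m / np.arange(1, m + 1)
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]  # enforce monotonicity
    adj = np.empty(m)
    adj[order] = np.minimum(ranked, 1.0)
    return adj

def joint_gene_test(expr_by_gene, y, lam=0.1, var_kept=0.95):
    """expr_by_gene maps gene -> (n_samples x n_tissues) predicted expression."""
    genes = list(expr_by_gene)
    blocks = [gene_pcs(expr_by_gene[g], var_kept) for g in genes]
    owner = np.concatenate([np.full(b.shape[1], i) for i, b in enumerate(blocks)])
    Z = np.hstack(blocks)
    Z = Z / Z.std(axis=0)                        # put all PCs on a common scale
    yc = np.asarray(y, dtype=float) - np.mean(y)
    keep = np.flatnonzero(lasso_cd(Z, yc, lam) != 0)
    pvals = np.ones(len(genes))                  # assumption: unselected genes get p = 1
    if keep.size:
        X = Z[:, keep]
        n, p = X.shape
        xtx_inv = np.linalg.inv(X.T @ X)
        beta = xtx_inv @ X.T @ yc                # joint least-squares refit
        resid = yc - X @ beta
        cov = (resid @ resid / (n - p - 1)) * xtx_inv
        for i in np.unique(owner[keep]):
            idx = np.flatnonzero(owner[keep] == i)
            b = beta[idx]
            wald = b @ np.linalg.solve(cov[np.ix_(idx, idx)], b)
            pvals[i] = chi2_sf(wald, len(idx))   # one Wald test per gene
    return dict(zip(genes, bh_adjust(pvals)))
```

Calling `joint_gene_test(expr_by_gene, y)` returns a dictionary of FDR-adjusted p-values keyed by gene. In practice the penalty `lam` would be tuned, for example by cross-validation, rather than fixed as it is here.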