RNA Sequencing Helps Identify Functional Variants from GWAS
Adapted from a story that originally appeared on the Alzheimer Research Forum.
October 2, 2013. For Alzheimer’s and other complex disorders, mining the genome for disease-associated variants is no longer the obstacle. The challenge nowadays is figuring out how the identified loci relate to disease. As reported last month in Nature and its associated journals, advances in high-throughput RNA sequencing are providing new tools for understanding how disease loci influence gene expression—a starting point for understanding their connection to pathogenesis.
In a September 15 Nature paper, researchers led by Emmanouil Dermitzakis and Tuuli Lappalainen at the University of Geneva, Switzerland, report the largest ever RNA-sequencing study of multiple human populations with sequenced genomes. Their analysis suggests it may now be possible to predict which of the scores of disease-associated loci actually drive up disease risk rather than merely correlate with it. In a companion paper published September 15 in Nature Biotechnology, the Geneva scientists, with collaborators elsewhere, outlined quality-control measures that enable RNA sequencing to be done reliably across multiple labs. That reproducibility was critical for their recent effort and will be for future large-scale studies. And in the September 8 Nature Genetics, a meta-analysis by Lude Franke of the University of Groningen, the Netherlands, and colleagues shows it is feasible, with a large enough dataset, to identify a special class of polymorphisms called trans expression quantitative trait loci (eQTLs). These variants associate with gene expression levels but have been hard to map because they act from unpredictably long distances, often from different chromosomes. Together, the new studies highlight the power of RNA sequencing to interpret data from genomewide association studies (GWAS).
The use of gene expression profiles to analyze genomewide datasets is not new. However, prior efforts had limited coverage because they used microarrays that contain a limited number of known single-nucleotide polymorphisms (SNPs) (e.g., Stranger et al., 2007; Montgomery et al., 2011). Others that did analyze sequencing data had a small sample size of 60-70 people at most (e.g., Pickrell et al., 2010; Montgomery et al., 2010). For the Nature study, first author Lappalainen (who is now at Stanford University School of Medicine in Palo Alto, California) and colleagues in the Genetic European Variation in Health and Disease (GEUVADIS) Consortium used similar methods but beefed up both quantity and quality. The team sequenced mRNA and microRNA (miRNA) of lymphoblastoid cell lines from 462 people with fully sequenced genomes. The samples came from Finnish, British, Italian, Nigerian, and U.S. cohorts in the 1000 Genomes dataset, 89-95 per group. Seven labs performed the RNA sequencing, and analyses reported in the Nature Biotechnology paper showed the methods were reproducible across sites.
The study found that the human genome is chock full of eQTLs. “Over half the genes we measured have an eQTL,” Lappalainen told Alzforum. Moreover, 16 percent of known GWAS variants are themselves eQTLs. These statistics indicate there is a good chance that some variants will correlate with gene expression without a functional connection to the disease. “You may find many associating polymorphisms, but probably only one is the causal variant. One of our goals here was to try and tease them apart,” Lappalainen said. She said the current study shows how eQTL data can help map causal variants—something that was not possible in prior studies that used microarrays covering only a subset of common variants.
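For readers unfamiliar with what an eQTL test involves, the core idea can be sketched in a few lines of Python: regress a gene's expression level on the number of copies of a variant allele each person carries. This is only an illustration under simplifying assumptions, not the GEUVADIS analysis. The function name `eqtl_association` and the toy numbers are invented for this example, and real studies also adjust for covariates and population structure and correct for the many variant-gene pairs tested.

```python
from math import sqrt


def eqtl_association(dosages, expression):
    """Regress expression on allele dosage (0, 1, or 2 copies of an allele).

    Returns (slope, r): slope is the estimated change in expression per
    additional allele copy, and r is the Pearson correlation.
    Hypothetical helper for illustration only.
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need matched values for at least 3 samples")

    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((e - mean_e) ** 2 for e in expression)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))

    if sxx == 0 or syy == 0:
        raise ValueError("genotype or expression shows no variation")

    slope = sxy / sxx
    r = sxy / sqrt(sxx * syy)
    return slope, r


# Made-up toy data: six people, one variant, one gene.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]

slope, r = eqtl_association(dosages, expression)
print(f"expression change per allele copy: {slope:.2f}, correlation: {r:.3f}")
```

A strong slope and correlation like this would flag the variant as a candidate eQTL for that gene. As Lappalainen notes, such an association alone does not show that the variant is the causal one.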
On the whole, the methods and general approach can be useful for neurodegenerative disease research, Carlos Cruchaga of Washington University School of Medicine, St. Louis, Missouri, commented in an email to Alzforum. However, he said it is unclear how well the results themselves will translate because the analyses focused on gene expression in transformed blood cells, not brain tissue. Mark Cookson of the National Institute on Aging, Bethesda, Maryland, expressed a similar concern. “While some of the genes you would be interested in from an AD perspective are also expressed in blood,” he said, “there are more difficult cases. I don’t think there is much tau expression in blood. It may be hard to reliably assess what is going on at the tau locus from a lymphoblastoid cell line.”
Then how about doing this work in brain? The brain’s regional heterogeneity, as well as postmortem artifacts such as acidosis, make it “noisy” for genotyping and sequencing analyses, Cookson said. Nevertheless, he and other scientists have teamed up to analyze gene expression in human brain tissue collected by the North American Brain Expression Consortium (NABEC) and the UK Brain Expression Consortium (UKBEC) (see Colantuoni et al., 2011). The scientists have used the datasets to look at genetic and age effects on epigenetics and gene expression. Several studies have directly compared blood and brain and found some similar and some distinct effects (see Hernandez et al., 2012; Kumar et al., 2013). So far, the brain analyses have been array-based but they are “moving into sequencing-based platforms,” Cookson said.—Esther Landhuis.
Lappalainen T, Sammeth M, Friedländer MR, 't Hoen PA, Monlong J, Rivas MA, Gonzàlez-Porta M, Kurbatova N, Griebel T, Ferreira PG, Barann M, Wieland T, Greger L, van Iterson M, Almlöf J, Ribeca P, Pulyakhina I, Esser D, Giger T, Tikhonov A, Sultan M, Bertier G, MacArthur DG, Lek M, Lizano E, Buermans HP, Padioleau I, Schwarzmayr T, Karlberg O, Ongen H, Kilpinen H, Beltran S, Gut M, Kahlem K, Amstislavskiy V, Stegle O, Pirinen M, Montgomery SB, Donnelly P, McCarthy MI, Flicek P, Strom TM; Geuvadis Consortium, Lehrach H, Schreiber S, Sudbrak R, Carracedo A, Antonarakis SE, Häsler R, Syvänen AC, van Ommen GJ, Brazma A, Meitinger T, Rosenstiel P, Guigó R, Gut IG, Estivill X, Dermitzakis ET, Palotie A, Deleuze JF, Gyllensten U, Brunner H, Veltman J, Cambon-Thomsen A, Mangion J, Bentley D, Hamosh A. Transcriptome and genome sequencing uncovers functional variation in humans. Nature. 2013 Sep 15.
't Hoen PA, Friedländer MR, Almlöf J, Sammeth M, Pulyakhina I, Anvar SY, Laros JF, Buermans HP, Karlberg O, Brännvall M; The GEUVADIS Consortium, van Ommen GJ, Estivill X, Guigó R, Syvänen AC, Gut IG, Dermitzakis ET, Antonarakis SE, Brazma A, Flicek P, Schreiber S, Rosenstiel P, Meitinger T, Strom TM, Lehrach H, Sudbrak R, Carracedo A, 't Hoen PA, Pulyakhina I, Anvar SY, Laros JF, Buermans HP, van Iterson M, Friedländer MR, Monlong J, Lizano E, Bertier G, Ferreira PG, Sammeth M, Almlöf J, Karlberg O, Brännvall M, Ribeca P, Griebel T, Beltran S, Gut M, Kahlem K, Lappalainen T, Giger T, Ongen H, Padioleau I, Kilpinen H, Gonzàlez-Porta M, Kurbatova N, Tikhonov A, Greger L, Barann M, Esser D, Häsler R, Wieland T, Schwarzmayr T, Sultan M, Amstislavskiy V, den Dunnen JT, van Ommen GJ, Gut IG, Guigó R, Estivill X, Syvänen AC, Dermitzakis ET, Lappalainen T. Reproducibility of high-throughput mRNA and small RNA sequencing across laboratories. Nat Biotechnol. 2013 Sep 15.
Westra HJ, Peters MJ, Esko T, Yaghootkar H, Schurmann C, Kettunen J, Christiansen MW, Fairfax BP, Schramm K, Powell JE, Zhernakova A, Zhernakova DV, Veldink JH, Van den Berg LH, Karjalainen J, Withoff S, Uitterlinden AG, Hofman A, Rivadeneira F, 't Hoen PA, Reinmaa E, Fischer K, Nelis M, Milani L, Melzer D, Ferrucci L, Singleton AB, Hernandez DG, Nalls MA, Homuth G, Nauck M, Radke D, Völker U, Perola M, Salomaa V, Brody J, Suchy-Dicey A, Gharib SA, Enquobahrie DA, Lumley T, Montgomery GW, Makino S, Prokisch H, Herder C, Roden M, Grallert H, Meitinger T, Strauch K, Li Y, Jansen RC, Visscher PM, Knight JC, Psaty BM, Ripatti S, Teumer A, Frayling TM, Metspalu A, van Meurs JB, Franke L. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat Genet. 2013 Sep 8.