Schizophrenia Research Forum - A Catalyst for Creative Thinking

Newly Mapped DNA Elements Help Interpret GWAS

Adapted from a series that originally appeared on the Alzheimer Research Forum.

This is Part 2 of a two-part story. See also Part 1.

24 September 2012. With the advent of inexpensive genotyping technology, genomewide association studies (GWAS) have turned up thousands of point changes in DNA that can alter risk for disease. Most of those “hits” appear in genomic regions that code for no particular gene, leaving researchers puzzled about how they exert their influence. Now, scientists led by John Stamatoyannopoulos, University of Washington, Seattle, provide some clues. The researchers correlated newly catalogued functional regions in the human genome with published GWAS. They found not only that the majority of reported variants fall in regulatory DNA that controls gene expression, but also that this DNA is often active only in cells linked to pathology. That suggests these variants could indirectly influence coding genes that then go on to affect disease. "It basically says we've only been looking at the tip of the iceberg with these GWAS," said Stamatoyannopoulos. The findings, reported in the September 7 Science, come hot on the heels of a slew of related papers in Nature, revealing new insights into functional elements in the human genome (see Part 1). Together, the work could help researchers interpret some seemingly obscure links between genetic polymorphisms and neurodegenerative diseases.

GWAS of diseases, or clinical traits such as plasma cholesterol, look for genomic changes that could explain the phenotype at hand. Some studies reported GWAS hits in regulatory regions of the genome (see, e.g., Pomerantz et al., 2009 and Musunuru et al., 2010), but it was unclear whether the observation was specific to those particular diseases or true in general. "We showed that it is really across the board. Every single disease or trait we looked at shows the same phenomenon," said Stamatoyannopoulos.

To broadly examine regulatory regions, joint first authors Matthew Maurano, Richard Humbert, and Eric Rynes constructed an overall genomic map by analyzing 349 cell and tissue types (including fetal tissues, tumor cells, pluripotent cells, hematopoietic cells, and cultured cells) as part of the Encyclopedia of DNA Elements (ENCODE) Project and the Roadmap Epigenomics Program. Both of these projects probe the finer points of genomic function. Maurano and colleagues cut the DNA into pieces with the nuclease DNase I. This enzyme preferentially snips DNA that has been exposed, such as when regulatory proteins unwind the nucleic acid to activate gene transcription. After the DNA had been cut, the research group analyzed the hundreds of millions of DNA fragments to pinpoint the DNase I-hypersensitive sites (DHSs).

Almost four million DHSs pervade the genome, the team found. On average, about 200,000 DHSs were active in any one cell. Each type of cell had a unique DHS pattern, depending on which areas of DNA were active. With this DHS map in hand, the team looked at more than 5,500 single-nucleotide polymorphisms (SNPs) found in GWAS of hundreds of diseases and quantitative traits. About 75 percent of these were in or near DHSs, suggesting that a considerable number of non-coding GWAS hits are functional and exert effects on coding genes through regulation, the authors wrote.
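The overlap test described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the coordinates are invented toy data, the function names are hypothetical, and "near" is modeled as a simple distance window, which is an assumption (the study's criterion also counts SNPs that are highly correlated with a variant inside a DHS).

```python
from collections import defaultdict


def build_index(dhs_sites):
    """Group (chrom, start, end) DHS intervals by chromosome."""
    index = defaultdict(list)
    for chrom, start, end in dhs_sites:
        index[chrom].append((start, end))
    return index


def in_or_near_dhs(index, chrom, pos, window=0):
    """True if pos lies within `window` bases of a half-open DHS [start, end)."""
    # Linear scan for clarity; millions of real sites would call for an
    # interval tree or sorted-interval search instead.
    for start, end in index.get(chrom, []):
        if start - window <= pos < end + window:
            return True
    return False


def fraction_in_dhs(snps, dhs_sites, window=0):
    """Fraction of (chrom, pos) SNPs that fall in or near any DHS."""
    if not snps:
        return 0.0
    index = build_index(dhs_sites)
    hits = sum(in_or_near_dhs(index, chrom, pos, window) for chrom, pos in snps)
    return hits / len(snps)


# Toy data (hypothetical coordinates, not taken from the study).
toy_dhs = [("chr1", 100, 250), ("chr1", 900, 1050), ("chr2", 40, 90)]
toy_snps = [("chr1", 120), ("chr1", 500), ("chr2", 95), ("chr3", 10)]

print(fraction_in_dhs(toy_snps, toy_dhs))             # strict overlap
print(fraction_in_dhs(toy_snps, toy_dhs, window=10))  # allow 10 bases of slack
```

Widening the window raises the fraction, which is why the definition of "near" matters when comparing figures across analyses.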

Probing further, the authors found that specific SNP-containing genome regions often sported DHSs only in disease-relevant cells. As an example, a DHS popped up in a fetal heart cell (but not a brain cell) around a coronary heart disease-associated mutation. In a few hundred cases, DHSs harboring GWAS-related variants seemed to control distant genes, up to 500 kilobases—that is, several genes—away. In addition, seemingly unrelated SNPs that had previously been linked to individual diseases within a family of disorders (such as autoimmune disease) often struck elements that were recognized by the same transcription factor. "[That correlation] allows you to construct relationships among diseases in ways that nobody had previously anticipated," said Stamatoyannopoulos.

About 88 percent of non-coding SNPs lay in DHSs active in early fetal development. Most of the diseases or traits tied to these SNPs are thought to start in the womb or are influenced by development. The remaining SNPs, including some related to Alzheimer's disease, breast cancer, and lupus, occurred in DHSs found only in adult tissues. The majority of those diseases and traits have not previously been associated with any early causation. "This would suggest that pathology likely begins in the adult stage," said Stamatoyannopoulos. Researchers studying AD debate how early in life the disease begins to take hold. Though the researchers possess scant data so far on DHSs encompassing AD-linked SNPs, Stamatoyannopoulos hopes to sample adult brains in the coming year.

"These results are potentially very important to people carrying out GWAS in neurodegenerative disorders," wrote Peter Holmans, Cardiff University School of Medicine, U.K., in an e-mail to Alzforum. Not only will these findings help weed out true GWAS signals, but "they will have profound implications for pathway analyses, both on the way that GWAS SNPs are assigned to genes and how genes are grouped into pathways."

The study adds a level of support to an idea that scientists have held for some years. It is that several genes, each imparting a subtle effect on biology, may cause a disease, rather than one mutation causing its own profound effect, said Julie Williams, also at Cardiff University. "It also tells us that there's yet another layer of complexity, that [DNA] regions may actually affect genes that are some distance away, not necessarily the most proximal gene."

If this study can be validated, researchers may look specifically for mutations in the regulatory DNA in GWAS, increasing statistical power and the ability to detect disease-associated variants, wrote Christiane Reitz, Columbia University, New York, to Alzforum in an e-mail (see full comment below).

In addition to studying the human brain, the Seattle group will sample other tissues associated with human disease, expand the range of GWAS hits they explore, and continue to improve the technology. "The ultimate goal of this line of research is to have a complete map of the regulatory circuitry of the human genome," Stamatoyannopoulos said. With such a map, researchers could understand how the genome controls everything from normal developmental processes to disease.—Gwyneth Dickey Zakaib.


Maurano MT, Humbert R, Rynes E, Thurman RE, Haugen E, Wang H, Reynolds AP, Sandstrom R, Qu H, Brody J, Shafer A, Neri F, Lee K, Kutyavin T, Stehling-Sun S, Johnson AK, Canfield TK, Giste E, Diegel M, Bates D, Hansen RS, Neph S, Sabo PJ, Heimfeld S, Raubitschek A, Ziegler S, Cotsapas C, Sotoodehnia N, Glass I, Sunyaev SR, Kaul R, Stamatoyannopoulos JA. Systematic localization of common disease-associated variation in regulatory DNA. Science. 2012 Sep 7;337(6099):1190-5.

Schadt E, Chang R. A GPS for navigating DNA. Science. 2012 Sep 7;337(6099):1179-80.

Comments on News and Primary Papers

Primary Papers: Systematic localization of common disease-associated variation in regulatory DNA.

Comment by: Patrick Sullivan, SRF Advisor
Submitted 12 September 2012
Posted 12 September 2012

This Science paper has bearing on the genomic basis of complex traits, including schizophrenia, autism, and bipolar disorder (Maurano et al., 2012). A related paper in Nature will be of great interest to genomicists (ENCODE Project Consortium, 2012).

A major quandary in human genetics is how to understand the findings of genome-wide association studies for complex traits. This body of knowledge is now pretty huge: the NHGRI GWAS catalog (downloaded 22 June 2012, filtered for p < 1 × 10^-8 and keeping the smallest p value if there were multiple SNP-trait pairs) contains genome-wide significant results for 2,441 SNPs, 385 traits, and 2,968 SNP-trait pairs from 672 papers. These associations are common (median minor allele frequency 0.29) and of subtle effect (median genotypic relative risk 1.19).
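The catalog filtering just described can be sketched in a few lines. This is an illustration only: the record fields (`snp`, `trait`, `p`, `paper`) and the toy rows are hypothetical and do not reflect the actual catalog file layout.

```python
def summarize_catalog(records, threshold=1e-8):
    """Keep associations with p below threshold, retaining the smallest
    p value for each SNP-trait pair, then count what remains."""
    best = {}
    for rec in records:
        if rec["p"] >= threshold:
            continue  # not significant under the chosen cutoff
        key = (rec["snp"], rec["trait"])
        if key not in best or rec["p"] < best[key]["p"]:
            best[key] = rec  # keep the strongest report for this pair
    return {
        "snps": len({snp for snp, _ in best}),
        "traits": len({trait for _, trait in best}),
        "pairs": len(best),
        "papers": len({rec["paper"] for rec in best.values()}),
    }


# Toy rows (hypothetical identifiers, not real catalog entries).
toy_records = [
    {"snp": "snpA", "trait": "height", "p": 1e-9, "paper": "paper1"},
    {"snp": "snpA", "trait": "height", "p": 1e-12, "paper": "paper2"},
    {"snp": "snpA", "trait": "BMI", "p": 5e-9, "paper": "paper1"},
    {"snp": "snpB", "trait": "height", "p": 1e-7, "paper": "paper3"},
    {"snp": "snpC", "trait": "LDL", "p": 2e-10, "paper": "paper3"},
]

print(summarize_catalog(toy_records))
```

Because one SNP can be associated with several traits, the pair count can exceed the SNP count, as it does in the figures quoted above (2,968 pairs for 2,441 SNPs).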

The diseases with the greatest number of associations are: Crohn's disease (94), ulcerative colitis (56), type 2 diabetes mellitus (56), type 1 diabetes mellitus (47), coronary artery disease (47), prostate cancer (45), and multiple sclerosis (42). Schizophrenia has 9 in this database, but the current number is larger (see Sullivan et al., 2012), and work in progress will push this number much higher.

The continuous traits with the greatest numbers of associations are: height (134 SNP associations), lipids (cholesterol HDL=75, LDL=60, total=58, and triglycerides=57), and body mass index (55).

At the same time, searches for rare, more actionable exonic variants have not yielded many hits, per three sizable studies of autism in Nature earlier this year (Sanders et al., 2012; O’Roak et al., 2012; Neale et al., 2012), a recently published but smallish study for schizophrenia (Need et al., 2012), and, for what it's worth, unpublished results for T2DM. Several larger studies for schizophrenia are either in analysis or manuscript preparation, so the database for schizophrenia is not yet extensive.

One quandary with the GWAS results -- the major etiological clues for most complex disorders -- is that > 90% of these associations are not in regions that code for proteins. How does genetic variation act? What's the next step in the progression from DNA variation to ultimate disease phenotype?

The Science paper tells us that 76% of the GWAS hits above are either in or in high correlation with "DNase I hypersensitivity sites". Such sites tend to be stretches where DNA is more or less bare - unfolded, not pretzeled-up in histones, and open to transcription factors.

In addition, other work tells us that the GWAS hits tend to be "eQTLs" (expression quantitative trait loci, genetic variation that is strongly associated with the amounts of RNA from nearby genes).

Taken together, these studies provide a testable and now highly plausible general hypothesis: genomic regions implicated by GWAS act by regulating gene expression. If so, there is a distinct role for epigenetics.

Careful readers of SRF will have followed this debate (aka the rare "versus" common variant false dichotomy). The final word on this has yet to be written, but it is looking like the answers for complex traits like schizophrenia will be more related to subtle changes in expression in pathways than protein-killing mutations.



Bernstein BE, Birney E, Dunham I, Green ED, Gunter C, Snyder M. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74.

Sullivan PF, Daly MJ, O'Donovan M. Genetic architectures of psychiatric disorders: the emerging picture and its implications. Nat Rev Genet. 2012 Jul 10;13(8):537-51.

Sanders SJ, Murtha MT, Gupta AR, Murdoch JD, Raubeson MJ, Willsey AJ, Ercan-Sencicek AG, DiLullo NM, Parikshak NN, Stein JL, Walker MF, Ober GT, Teran NA, Song Y, El-Fishawy P, Murtha RC, Choi M, Overton JD, Bjornson RD, Carriero NJ, Meyer KA, Bilguvar K, Mane SM, Šestan N, Lifton RP, Günel M, Roeder K, Geschwind DH, Devlin B, State MW. De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature 2012 April 5.

O’Roak BJ, Vives L, Girirajan S, Karakoc E, Krumm N, Coe BP, Levy R, Ko A, Lee C, Smith JD, Turner EH, Stanaway IB, Vernot B, Malig M, Baker C, Reilly B, Akey JM, Borenstein E, Rieder MJ, Nickerson DA, Bernier R, Shendure J, Eichler EE. Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature 2012 April 5.

Neale BM, Kou Y, Liu L, Ma’ayan A, Samocha KE, Sabo A, Lin CF, Stevens C, Wang LS, Makarov V, Polak P, Yoon S, Maguire J, Crawford EL, Campbell NG, Geller ET, Valladares O, Schafer C, Liu H, Zhao T, Cai G, Lihm J, Dannenfelser R, Jabado O, Peralta Z, Nagaswamy U, Muzny D, Reid JG, Newsham I, Wu Y, Lewis L, Han Y, Voight BF, Lim E, Rossin E, Kirby A, Flannick J, Fromer M, Shakir K, Fennell T, Garimella K, Banks E, Poplin R, Gabriel S, DePristo M, Wimbish JR, Boone BE, Levy SE, Betancur C, Sunyaev S, Boerwinkle E, Buxbaum JD, Cook Jr EH, Devlin B, Gibbs RA, Roeder K, Schellenberg GD, Sutcliffe JS, Daly MJ. Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature 2012 April 5.

Need AC, McEvoy JP, Gennarelli M, Heinzen EL, Ge D, Maia JM, Shianna KV, He M, Cirulli ET, Gumbs CE, Zhao Q, Campbell CR, Hong L, Rosenquist P, Putkonen A, Hallikainen T, Repo-Tiihonen E, Tiihonen J, Levy DL, Meltzer HY, Goldstein DB. Exome sequencing followed by large-scale genotyping suggests a limited role for moderately rare risk factors of strong effect in schizophrenia. Am J Hum Genet. 2012 Aug 10;91(2):303-12.
