In SRF's schizophrenia genetics overview, writer Pat McCaffrey surveys the range of experimentation and opinion in the field in a five-part series.
See Part 1, Linkage; Part 3, CNVs; Part 4, Bigger Genetics; Part 5, From Genes to Biology…and Therapies. Read a PDF of the entire series.
16 March 2010. The study of schizophrenia genetics, like all of human genetics, has been driven forward in large part by successive technological advances. While early linkage techniques were sufficient to locate the single-gene causes of Mendelian disorders, more power was needed to parse the causes of diseases with more complex patterns of inheritance. The Human Genome Project opened the door to high-throughput, factory-style sequencing and genetic analysis. After its completion in 2003, and the first haplotype map in 2005, the floodgates opened to a torrent of new data on human genetic variation.
These new methods gave researchers exceptional power to investigate the genetic variation that underlies individual risk for complex diseases. Family-based studies were out; association studies using case-control designs and hundreds to thousands of unrelated subjects were in. The assumption underlying genomewide association studies (GWAS) is that common gene variants (that is, those present in more than about 5 percent of people) contribute in small but significant ways to risk of disease in a population (the "common disease/common variant" hypothesis as proposed by Reich and Lander, 2001). The variants are marked by single nucleotide polymorphisms (SNPs); by testing enough SNPs in enough people, these common, low-risk variants can be found.
These days, a typical GWAS involves genotyping thousands of patients and controls using chips covering half a million to a million common SNPs. Then, the search is on for genotypes that differ in frequency between control and disease groups. Because of the massive number of comparisons that are done, stringent statistical treatment is required to weed out false positives, and the accepted threshold for genomewide significance in SNP frequency is a P value less than 5 × 10⁻⁸. Effect sizes in these studies tend to be small: in other complex diseases like diabetes or heart disease, highly significant SNPs increase risk of disease by a few percent, and finding these small effects has required study sizes of 8,000 to 20,000 cases, plus their controls (for example, see reviews by Arking and Chakravarti, 2009 on cardiovascular disease, or Kronenberg, 2008 on diabetes, atherosclerosis, and cancer).
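To make the statistics concrete, here is a minimal Python sketch of the two steps described above: comparing allele frequencies between cases and controls at one SNP, and correcting the significance cutoff for the number of SNPs tested. It is an illustration, not the method of any particular study. The function names, the allele counts, and the choice of a Bonferroni correction over one million independent tests are all assumptions made for the example.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Returns the chi-square statistic and its P value.
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = (
        (case_alt + case_ref)
        * (ctrl_alt + ctrl_ref)
        * (case_alt + ctrl_alt)
        * (case_ref + ctrl_ref)
    )
    chi2 = numerator / denominator
    # For 1 degree of freedom, the upper tail of the chi-square
    # distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def bonferroni_threshold(family_wise_alpha, n_tests):
    """Per-test P value cutoff that holds the overall false-positive rate at alpha."""
    return family_wise_alpha / n_tests


# Assumption: roughly one million independent tests at an overall alpha of 0.05.
threshold = bonferroni_threshold(0.05, 1_000_000)

# Hypothetical allele counts for a single SNP (not real data).
chi2, p_value = allelic_chi_square(60, 40, 40, 60)
print(f"threshold={threshold:.1e}, chi2={chi2:.2f}, significant={p_value < threshold}")
```

Under these assumptions, dividing 0.05 by one million tests gives the 5 × 10⁻⁸ cutoff quoted above. The made-up SNP would pass a conventional 0.05 threshold but falls far short of genomewide significance, which is why small samples rarely yield hits.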
The early returns: little in common?
For diseases of unknown etiology like schizophrenia, GWAS are attractive because they are agnostic. Unlike candidate gene studies, GWAS do not depend on prior knowledge or guesses regarding which genes might be important. In contrast to linkage studies, which require collecting data from affected families, GWAS can be done in a case/control design with unrelated subjects taken from the general population.
Nonetheless, the early GWAS data in schizophrenia were disappointing. In the first seven genomewide studies, ranging in size from a few hundred to a few thousand patients, not one reported any statistically significant SNPs for schizophrenia alone (for a thorough review of the studies preceding the most recent GWAS, see Owen et al., 2009). Furthermore, promising SNPs in one study often failed to replicate in other studies.
In the largest study (total of 6,286 cases and 12,993 controls), one SNP in the zinc finger gene ZNF804A, a putative transcription factor, did achieve significance when schizophrenia and bipolar patients were analyzed together (O’Donovan et al., 2008; see SRF related news story). As expected, the effect size was quite small (odds ratio 1.09, or about a 9 percent increase in risk). However, coauthor Michael Owen of Cardiff University, Wales, said that this study established the proof of principle of GWAS in schizophrenia. “There are many more risk genes that we’ll find, and that’s been proven by this approach,” he told SRF.
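As a rough illustration of what an odds ratio of 1.09 means for an individual, the Python sketch below converts an odds ratio into an absolute risk. The 1 percent baseline risk is an assumed round figure for the example, and the function name is hypothetical. Strictly, an odds ratio scales odds rather than risk, but for an uncommon condition the two are nearly the same.

```python
def risk_from_odds_ratio(baseline_risk, odds_ratio):
    """Absolute risk for carriers, given the baseline risk and an odds ratio."""
    baseline_odds = baseline_risk / (1 - baseline_risk)
    carrier_odds = baseline_odds * odds_ratio
    return carrier_odds / (1 + carrier_odds)


# Assumption: a baseline risk of 1 percent, chosen only for illustration.
baseline = 0.01
carrier_risk = risk_from_odds_ratio(baseline, 1.09)
relative_increase = carrier_risk / baseline - 1
print(f"carrier risk: {carrier_risk:.4%}, relative increase: {relative_increase:.1%}")
```

With these inputs, the carrier's absolute risk rises only slightly above the 1 percent baseline. This is why a single SNP of this size says little about any one person's chance of illness.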
Then, in July 2009, a trio of long-awaited GWAS, each analyzing between 2,600 and 3,300 cases, was published online in Nature. None of the three detected any genomewide significant SNPs, but one group’s meta-analyses of the top SNPs in 8,000 patients and more than 19,000 controls pointed to a region of chromosome 6. The region, which includes genes of the major histocompatibility locus, has been previously implicated in linkage studies (Lewis et al., 2003). The studies also found evidence to support the ZNF804A SNP identified by O’Donovan and colleagues.
The response to the studies was mixed. Some called the exercise a failure; others saw it as an advance. In comments to SRF, Daniel Weinberger of the National Institute of Mental Health in Bethesda, Maryland, termed the outcome “decidedly disappointing,” while in a press conference, study coauthor Michael O’Donovan of Cardiff University, Wales, called the work “a significant increment in knowledge” over the zinc finger gene SNP (see SRF related news story and complete text of Weinberger comment, as well as other comments on the subject). The mixed reviews reflect an ongoing and complex controversy, which stems as much from funding questions—should large sums be directed to GWAS at the expense of other types of research?—as from scientific debate about the contribution of common gene variants to schizophrenia and other mental illnesses. (Part 4 of this series will take up that question in more detail.)
A new picture
What was clear from those studies was that, with thousands of patients recruited and billions of SNPs genotyped, no single gene or allele was likely to contribute a great deal to the risk of disease in a large number of people. “If there was a truly common variant, the technique would have found it,” said David Porteous, of the University of Edinburgh in the United Kingdom. He added, “If you were looking for a single, defining marker in the general population of individuals with schizophrenia, there isn’t one. That is for certain now.”
Instead, the evidence points to many genes (hundreds and perhaps even thousands) that each contribute a small amount to the cause of schizophrenia. The data to date suggest that common variants account for just a fraction of the risk of schizophrenia in a population. Estimates of that fraction range from a low of 4 percent (Purcell et al., 2009) to a generous 30 percent, says Porteous; there is no wide agreement on the number. Writing about the GWAS published in Nature in 2009, Kevin Mitchell, Trinity College, Dublin, opined on SRF that, “Based on the meager haul of common variants dredged up by these three studies and their forerunners, this [common variants] hypothesis should clearly now be resoundingly rejected” (see full text of Mitchell’s comment on SRF related news story).
Richard Straub of the National Institute of Mental Health, Bethesda, Maryland, put it more gently. “The common disease, common variant hypothesis has really not panned out all that well, probably due in large part to the lack of any serious attempts in GWAS publications to examine haplotypes and to model gene-gene and gene-environment interactions, which we know are tremendously important in the architecture of risk,” he said. But, he allowed, “The complexity is deep—it is a puzzle where you don’t know how best to solve it overall until you’ve learned from solving some of the components.”
Glass half empty or half full?
Others argue that GWAS has already led to significant insights. Kenneth Kendler, of Virginia Commonwealth University in Richmond, thinks the failure to find truly common risk factors turns out to be one of the more informative results from GWAS, because it reveals something about the genetic architecture of schizophrenia. “It is telling us about a deep background polygene signal,” he said. That means that estimating individual risk will not be a matter of adding up five or 10 risk SNPs into a genetic profile. “The polygene background is a completely different ballgame,” Kendler said. He likes the approach of Purcell and colleagues, who calculated that the top half of all positive SNPs in aggregate explain about 30 percent of the risk of schizophrenia. Although these are extremely early days, Kendler said, that type of aggregation appears to do a much better job at describing risk.
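The idea of aggregating many weak SNPs into one measure can be sketched in a few lines of Python. This is one common way to build such a score, offered as an assumption rather than a description of the exact procedure Purcell and colleagues used: each SNP's risk-allele count (0, 1, or 2) is weighted by the natural log of its odds ratio, and the products are summed. The function name, odds ratios, and genotypes below are invented for illustration.

```python
import math


def polygenic_score(risk_allele_counts, odds_ratios):
    """Sum of risk-allele counts, each weighted by the log of that SNP's odds ratio."""
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("need one odds ratio per SNP")
    return sum(
        count * math.log(odds_ratio)
        for count, odds_ratio in zip(risk_allele_counts, odds_ratios)
    )


# Hypothetical odds ratios for three SNPs (not taken from any study).
example_odds_ratios = [1.1, 1.2, 1.05]

# One person carrying 2, 1, and 0 copies of the respective risk alleles.
score = polygenic_score([2, 1, 0], example_odds_ratios)
print(f"polygenic score: {score:.3f}")
```

In practice a score like this would span thousands of SNPs, and what matters is how the distribution of scores differs between cases and controls, not the value for any single marker.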
In a written comment on SRF, Nick Craddock, Cardiff University, Wales, and colleagues O’Donovan and Owen outlined major gains that they believe have accrued from GWAS in the last two years. These include evidence for at least four loci in schizophrenia (ZNF804A, MHC, NRGN, TCF4) (O’Donovan et al., 2008a; ISC, 2009; Shi et al., 2009; Stefansson et al., 2009) and two novel loci in bipolar disorder, ANK3 and the calcium channel gene CACNA1C (Ferreira et al., 2008). At least one of the latter (CACNA1C) also reportedly confers risk of schizophrenia (Green et al., 2009).
About these findings, Craddock, O’Donovan, and Owen write, “This is obviously a small part of the picture, but it is certainly better than no picture at all.” They add that these results offer “a much more secure foundation than the earlier findings upon which to build follow-up studies,” such as brain imaging, cognitive phenotype (Esslinger et al., 2009), and even candidate gene studies. “We would not regard the first convincing evidence that altered calcium channel function is a primary etiological event in at least some forms of psychosis as a trivial gain in knowledge,” Craddock and colleagues write.
Adding to the list, the researchers cite other insights gleaned from GWAS, including the discovery of an important role in schizophrenia for copy number variations, rare structural changes that affect gene dosage. In addition, GWAS, along with other genetic and epidemiological studies, have generated clear evidence for a high degree of genetic risk-sharing between schizophrenia and bipolar disorder, and to a lesser extent, schizophrenia and autism (see SRF related news story; SRF news story; SRF news story). “As clinicians, we do not regard this knowledge as a trivial achievement,” Craddock, O’Donovan, and Owen write.
The reason that GWAS had not turned up more genes, some argued, was simply that the right experiments had not yet been done. The ability to detect common variants depends on sample size: the smaller the effect size, the larger the number of cases and controls needed to get a statistically compelling result. When they published their study fingering ZNF804A, O’Donovan and colleagues estimated that it may take a study of 30,000 patients and an equal number of controls to capture the majority of low-effect risk alleles for psychiatric disorders (Owen et al., 2009). After seeing the 2009 GWAS data, David Collier of the Institute of Psychiatry, London, United Kingdom, upped the number to 100,000 of each to capture statistical significance for many of the hundreds (if not thousands) of common schizophrenia susceptibility alleles of small effect (see SRF related news story).
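The link between effect size and sample size can be illustrated with a textbook power approximation for comparing allele frequencies in two groups. The Python sketch below is not the calculation used by any of the researchers quoted here. The function name, the 30 percent control allele frequency, the 80 percent power target, and the simple allelic test (two alleles per person, equal numbers of cases and controls) are all assumptions.

```python
import math
from statistics import NormalDist


def cases_needed(control_freq, odds_ratio, alpha=5e-8, power=0.8):
    """Approximate cases (with an equal number of controls) needed to detect a risk allele.

    Uses the normal approximation for a two-sided comparison of two
    allele frequencies; each person contributes two alleles.
    """
    if odds_ratio == 1:
        raise ValueError("an odds ratio of 1 means there is no effect to detect")
    # Allele frequency in cases implied by the odds ratio.
    case_odds = odds_ratio * control_freq / (1 - control_freq)
    case_freq = case_odds / (1 + case_odds)
    mean_freq = (control_freq + case_freq) / 2

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)

    null_sd = math.sqrt(2 * mean_freq * (1 - mean_freq))
    alt_sd = math.sqrt(
        control_freq * (1 - control_freq) + case_freq * (1 - case_freq)
    )
    alleles_per_group = (
        (z_alpha * null_sd + z_power * alt_sd) / (case_freq - control_freq)
    ) ** 2
    return math.ceil(alleles_per_group / 2)


# Assumed inputs: a risk allele at 30 percent frequency in controls.
for odds_ratio in (1.2, 1.1, 1.05):
    print(odds_ratio, cases_needed(0.3, odds_ratio))
```

Under these assumptions, an odds ratio of 1.1 calls for roughly 20,000 cases and as many controls at the genomewide threshold, and the requirement climbs steeply as the odds ratio approaches 1. Rarer alleles or smaller effects would push the numbers higher still.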
Figure 1. Psychiatric GWAS Consortium Overview, March 2010. Image credit: P. Gejman.
Bigger studies, more genes
Where is the field in relation to those numbers? In 2007, Patrick Sullivan of the University of North Carolina, Chapel Hill, and other researchers in the United States spearheaded the formation of the Psychiatric GWAS Consortium to pool samples for meta-analysis (for an overview, see Psychiatric GWAS Consortium Coordinating Committee et al., 2009). At last report, the consortium included 165 investigators from 68 institutions in 19 countries who are studying schizophrenia, bipolar disorder, major depressive disorder, autism, and attention-deficit hyperactivity disorder (see Figure 1 for overview). Pablo Gejman of North Shore University Health System and Northwestern University, Evanston, Illinois, presented the first results of the consortium’s schizophrenia genomewide analysis last November at the World Congress of Psychiatric Genetics in San Diego.
In that effort, boosting the sample size (to 12,200 cases and 9,300 controls) increased the yield of statistically significant SNPs to 129, spread over six regions of the genome (see SRF related news story). Gejman stressed that the results are preliminary, but they did replicate associations in the HLA locus on chromosome 6 and identified five other loci. The effect sizes (odds ratios) fall in the 1.1-1.2 range. Gejman said replication is ongoing with a new set of 20,000 samples from the SGENE Consortium, a group of European labs who have pooled their cohorts of schizophrenia subjects for GWAS analysis (Stefansson et al., 2009).
The Psychiatric GWAS Consortium results raised confidence among some researchers that conducting large GWAS is paying off. After seeing the consortium data, Anil Malhotra of Zucker Hillside Hospital in Glen Oaks, New York, told SRF, “I was not an initial believer that this was going to work out that well, but the HLA findings and ZNF804A findings are more convergent than I would have originally imagined. They are replicating in many samples, and the samples are very large, albeit with relatively modest effect sizes.” Malhotra credited a combination of improved technology and larger samples for the progress.
Building on that progress, researchers are continuing to recruit more patients. Pamela Sklar, of the Broad Institute of the Massachusetts Institute of Technology and Harvard University, said she is involved in collaborations with a goal of genotyping 10,000 to 20,000 additional patients with schizophrenia or bipolar disorder, as well as matched controls. But what is the ultimate goal? At the World Congress, Sullivan presented a power calculation in which a total sample size of 50,000 (that is, 25,000 cases and 25,000 controls) would allow for identification of more than 95 percent of all reported GWAS findings across all biomedical disorders.
However, Sullivan told SRF, “How many GWAS cases is enough is exceptionally complex, and the answer you get depends crucially on your assumptions. There is no unique and definitive answer. Using other medical disorders as a crude guide, sample sizes in excess of 50,000 subjects were needed before sensible and consistent results emerged for common variant associations. Schizophrenia could well be more complicated than other medical disorders. Other scenarios could require 100,000 subjects, particularly if you want to make a serious effort to detect gene-environment interactions.”
Sullivan pointed out that rapid technological development adds an additional level of complexity. “The next generation of GWAS arrays and genome and exome sequencing technologies will allow unprecedented genomic resolution. The technology and pricing are evolving rapidly. If GWAS pricing continues its sharp decline, costs per subject could be far lower than for functional MRI or even for a panel of endocrine biomarkers.”
Others say they have seen enough GWAS to know that more and larger sample sizes are not the answer. “One reason why the GWAS [approach] has not been very informative to date in schizophrenia may be that it is the wrong strategy,” said Straub’s National Institute of Mental Health colleague Daniel Weinberger. He sees two key faults with the approach: it does not take into account the biologic complexity of the disease or the diversity of pathology that falls under the clinical diagnosis of schizophrenia.
“The GWAS analytic model assumes fixed, predictable relationships between genetic risk and illness, but simple relationships between genetic risk and complex pathophysiological mechanisms are unlikely,” Weinberger said. In addition, the approach assumes that the diagnosis of schizophrenia represents a single biological entity, but most clinical diagnoses are more complicated. “Other approaches, e.g., family studies, studies of smaller but much better characterized samples, and studies of genetic interactions in these samples, will be necessary to understand the variable genetic architectures of such biologically complex and heterogeneous disorders.”
Malhotra thinks that the GWAS approach has nearly run its course in schizophrenia. “In an interesting twist, it appears that the much anticipated GWAS era will be quite short-lived, with perhaps a lifespan of less than five years in psychiatric genetics,” he said. “Our group and many others are now focusing on next-generation genome sequencing, first aimed at the exome, but then rapidly transitioning to the entire genome. I don't think it will be too long before we start to see the first data from some of these studies.”—Pat McCaffrey.