Schizophrenia Genetics 4: Big Genetics Gets Bigger—Complete Sequencing Debuts
In SRF's schizophrenia genetics overview, writer Pat McCaffrey surveys the range of experimentation and opinion in the field in a five-part series.
23 March 2010. In the wake of the first round of genomewide association studies (GWAS), the field of schizophrenia genetics faces some new questions. If common variants account for one-third (or less) of the heritable risk of schizophrenia, spread over hundreds or more genes, as the International Schizophrenia Consortium concluded (Purcell et al., 2009), where does the field go from here? Do the results dictate more, and larger, GWAS to identify additional genes of even smaller effect? What about the remaining unexplained variance, 70 percent or more by some estimates? How much of that is explained by copy number variants or other rare events? How will researchers find what some call the “dark matter” of inherited schizophrenia risk?
When will GWAS yield to whole genomes?
For many researchers, whole genome sequencing is the obvious answer to those last questions. But even as complete sequencing moves beyond the genomes of celebrity scientists and a few anonymous individuals around the globe, some researchers argue for continuing the GWAS effort. They believe that identifying additional common variants, even if they account for only a tiny bit of risk, will be valuable in helping to illuminate the underlying biological pathways of disease. As Markus Nöthen, Bonn University, Germany, explains it, even if the odds ratio for a common allele is 1.001, “If it can be robustly shown to be associated with the disorder, and the variant is the marker for a specific gene or affects regulation of a specific gene, then obviously this gene contributes in an important manner to a pathway that is not redundant, and that is a crucial step in the development of the disorder.” Maybe the effect is small because that mutation confers a subtle functional difference, Nöthen notes. “For example, you have a variant…that leads to increased gene expression from 100 percent to 110 percent: the risk is relatively small, but that is simply reflecting the factual nature of the mutation and not the importance of this gene pertaining to disease development,” he added.
That has been borne out by experience in type 2 diabetes, according to Michael Owen of Cardiff University, Wales. Type 2 diabetes resembles schizophrenia in its degree of heritability and the importance of environmental factors. Common variants in a dozen genes that have been associated with the disease all have small effect sizes (odds ratios around 1.2, reviewed in Wolfs et al., 2009). Nonetheless, two genes with low-impact variants (PPARG and KCNJ11) are targets for successful therapies.
On the other hand, David Porteous, University of Edinburgh, United Kingdom, for one, would like to see more of an emphasis on biology now and less focus on large-scale, statistically driven genetic studies. “With scale-up, you can get solid statistical evidence, but that does not tell you how significant the effect is in terms of biology. We should also look in much more detail at smaller samples, down to single families at the sequence level, rather than skim the surface of much larger samples. We should ask the question, What is driving the signals we are seeing and how do we relate that to the biology?”
Daniel Weinberger of the NIMH agrees that larger GWAS will yield more single nucleotide polymorphism (SNP) associations that achieve statistical significance, but he asks, What will they mean? “Of the genes that survive this kind of rigorous statistical struggle to become significant, we don’t know how useful they ultimately will be in teasing out the basic biology of illness.” Because they are by definition common, they explain very little about any individual’s risk, he argues. Weinberger compares the situation to finding that having a driver’s license is a common factor predicting the risk of being in a car crash: true, but not very helpful.
Pamela Sklar, at the Broad Institute, Cambridge, Massachusetts, said that it is too early to pass judgment on the utility of GWAS in psychiatric disease. First, she said, only a small amount of the genomewide data that have been obtained for psychiatric disease has been published in the literature. “I think what is most important for our field is to avoid premature closure in the absence of a great deal of high-quality data. I think that we are just really scratching the surface,” Sklar said. Owen agrees, saying, “We need to see this genetic project through before we start rushing into hundreds and hundreds of functional experiments.”
The large studies are expensive, but it is an investment that will pay off in more than SNP associations, write Owen and colleagues in a recent comment on SRF. “Like that of other common familial diseases, the genetics of schizophrenia and bipolar disorder is a ‘mixed economy’ of common alleles of small effect and rare alleles of large and small effects,” they write. “Those who are concerned at the cost of collecting large samples for GWAS studies must bear in mind that the robust identification of both types of mutation will require similarly large samples; we will just have to get used to that fact if we want to make progress.” It is expensive, they write, but “those costs are trivial compared with the human and economic costs of psychotic disorders.”
And once the DNA is in the bank, and the data in the computer, they are available forever. “Critics should bear in mind that the GWAS data are not just there for the ‘headline’ genomewide findings, but that the data will be available to mine for years to come. The findings reported to date are based on only the simplest analyses,” Owen and colleagues write. Uncovering the genetic basis of schizophrenia is like restoring an old work of art, Owen said. “You can clean off a few little bits and pieces, but what we want to see is the whole picture.” For that, he said, “more genetics is what we need. Let’s finish cleaning the painting. We need to do the groundwork now, with more genetic studies, and that will set up biological psychiatry for the next 50 years.”
In the end, there is only one way to get the complete picture, and that is to get the complete sequence. Ongoing resequencing studies that analyze genes of interest in many patients are a start, according to Nöthen. Looking at loci associated to date can reveal whether there are highly penetrant mutations present in these genes that confer higher risk. “They might be rare, but are still very important to understand the biology of the disorder,” he said. Deep sequencing of candidate genes in hundreds or thousands of individuals will identify rare disease alleles, whether they contain large or small deletions or insertions, point mutations, or rare SNPs.
In the longer term, projects like the 1000 Genomes Project will reveal most of the uncommon SNPs (occurring in 1 to 5 percent of the population) and allow researchers to start association studies with rare SNPs. But the ultimate goal will be to sequence completely the genomes of several thousand people with schizophrenia and an equal number of controls.
How long will it take for the field to transform from genome scanning to wholesale sequencing? Thomas Lehner, chief of the Genomics Research Branch, National Institute of Mental Health, Bethesda, Maryland, told SRF that the process is already underway. With some of its share of the $10 billion that the National Institutes of Health received from the American Recovery and Reinvestment Act (see SRF related news story), the NIMH has awarded two grants in the area of deep sequencing of mental illness genes. David Goldstein of Duke University, Durham, North Carolina, received $1.56 million to sequence the genomes of 100 people with schizophrenia and their relatives to identify rare, highly penetrant variants. Sklar is the principal investigator on a $3 million grant with the aim of collecting and analyzing whole genome sequences from 150 patients with schizophrenia, 150 with bipolar disorder, and 100 controls.
The projects are just a start, said Lehner. “I’m very excited because this is the time now that whole-genome sequencing for the first time at sufficient coverage is becoming economically feasible.” The different types of genome analysis will ultimately converge in whole-genome sequencing, which Lehner predicts will become a routine part of clinical medical research within a decade. “We can’t predict how quickly the cost will come down, but the writing is on the wall,” he said.
Drinking from the firehose
But will researchers be able to make sense of the flood of new data? Kenneth Kendler of Virginia Commonwealth University in Richmond fears they will not. “What scares me is that we’re ahead of our ability to analyze the data with GWAS. And now, when thinking about whole-genome sequencing, we’re taking a leap from a million tests to 3.2 billion. Do you realize the sample size you’ll need to test billions of p values and have a posterior probability of a 5 percent finding?” he said. “A hundred thousand cases and controls? People don’t want to think about that.”
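To give a rough sense of the arithmetic behind Kendler's concern, the sketch below compares the per-test significance threshold for a million tests with that for 3.2 billion, and estimates how many cases (with an equal number of controls) a simple allele-frequency comparison would need. Everything here is an illustrative assumption and not a method described in the article: a Bonferroni correction at an overall 0.05 level stands in for Kendler's posterior-probability framing, and the 80 percent power, the 30 percent control allele frequency, the odds ratio of 1.2, and the function names are all hypothetical choices.

```python
import math
from statistics import NormalDist


def bonferroni_threshold(alpha: float, n_tests: float) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def cases_needed(odds_ratio: float, control_freq: float,
                 p_threshold: float, power: float = 0.8) -> int:
    """Approximate number of cases (with equal controls) for a two-sided
    allelic test, using the normal approximation for two proportions.
    Assumed model: each person contributes two independent alleles."""
    p0 = control_freq
    odds1 = odds_ratio * p0 / (1 - p0)       # allele odds among cases
    p1 = odds1 / (1 + odds1)                 # allele frequency among cases
    z_alpha = -NormalDist().inv_cdf(p_threshold / 2)
    z_beta = NormalDist().inv_cdf(power)
    alleles_per_group = ((z_alpha + z_beta) ** 2
                         * (p0 * (1 - p0) + p1 * (1 - p1))
                         / (p1 - p0) ** 2)
    return math.ceil(alleles_per_group / 2)  # two alleles per person


if __name__ == "__main__":
    # Hypothetical scenarios: ~1 million SNP tests versus one test per base.
    for label, n_tests in [("GWAS", 1e6), ("whole genome", 3.2e9)]:
        threshold = bonferroni_threshold(0.05, n_tests)
        n_cases = cases_needed(1.2, 0.30, threshold)
        print(f"{label}: threshold {threshold:.2e}, about {n_cases} cases")
```

Under these assumptions the stricter threshold raises the required sample only moderately for a common variant. The required sample grows much faster as the variant becomes rarer or the odds ratio approaches 1, which is the regime that whole-genome sequencing is meant to explore.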
“We’re going to go there, because that’s how science works. Whether anyone wanted to do it or not, people are going to collect the data,” Kendler said. “How much of a mess are we going to get into because we don’t understand how to interpret it? Probably a fair amount.”
Of course, while Kendler and his colleagues wade about in the big genetics flood waters, there will still be progress in other areas. Starting with DISC1 and other genes identified by the various methods we have discussed in our previous installments, multifaceted research efforts will attempt to turn significant findings into biological clarity, and perhaps even therapies, as discussed in the next, and final, article of this series.—Pat McCaffrey.