Schizophrenia Research Forum - A Catalyst for Creative Thinking

Largest GWAS Analysis to Date Offers Only Two New Candidate Genes

3 July 2009. Three papers appearing in this week's issue of Nature present the much anticipated results of genomewide association studies (GWAS) of schizophrenia, as well as meta-analyses of the three studies together. There are no break-out candidate genes, though there is support for previous linkage findings, several new candidates, as well as statistical modeling that supports the notion of genetic overlap between schizophrenia and bipolar disorder.

Perhaps surprisingly, none of the studies alone identified any genetic marker with significant association to the disease (by the commonly applied genomewide significance benchmark of p < 5 × 10⁻⁸). The meta-analyses pointed to three regions, particularly a large swath on the short arm of chromosome 6, which houses the major histocompatibility complex genes, among others. This complex region, fingered in linkage studies (see Lewis et al., 2003), has also been highlighted by various single candidate gene association studies (see SchizophreniaGene's Chromosome 6 compendium).
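
The genomewide significance benchmark is conventionally explained as a Bonferroni-style correction of a 0.05 error rate for roughly one million independent tests of common variants. The short Python sketch below only illustrates that arithmetic. The round figure of one million tests and the function name are assumptions made for the example and are not taken from the three papers.

```python
# Bonferroni-style reasoning behind the conventional genomewide threshold.
# Assumption: about one million independent common-variant tests (a round
# illustrative figure, not a number reported in the three Nature papers).
family_wise_alpha = 0.05
independent_tests = 1_000_000

per_test_threshold = family_wise_alpha / independent_tests  # about 5e-8


def is_genomewide_significant(p_value, threshold=per_test_threshold):
    """Return True when a single-marker p-value falls below the threshold."""
    return p_value < threshold
```

Under this reasoning, a marker with p = 1 x 10^-6 would look striking in a single candidate gene study but would not count as genomewide significant.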

The lack of a large new crop of gene candidates is certain to rekindle the debate about the value of expensive large-scale GWAS versus other approaches (see below for historical notes), and while some observers will take these results as a failure of GWAS to deliver, Michael O'Donovan of Cardiff University in Wales, and coauthor of the International Schizophrenia Consortium paper, argues for a different interpretation. "Before today, you could count on the thumb of one hand the number of common variants that have been reliably identified in schizophrenia, so this is a significant increment in knowledge," he said at a press conference Wednesday at the World Conference of Science Journalism (WCSJ) in London. What is needed to follow up on these results is larger samples, O'Donovan argued. While this will be fairly expensive, "that's peanuts compared with the human and economic cost of ignorance about this disease," he said.

Daniel Weinberger of the National Institute of Mental Health, Bethesda, Maryland, sees it differently: "While we had hoped this brute force, clinically agnostic strategy would have been more fruitful, it is clear that the assumption of loading more cases on an already exhausted strategy is likely to only add a few more very small effect genes to the already too small list of very small effect genes," Weinberger wrote in an e-mail to SRF. "We have to critically consider the realistic possibility that the genetic and pathophysiologic heterogeneity of the condition we call schizophrenia may not be well suited for this strategy which assumes a quasi-unitary disease entity as part of its basic experimental logic."

Although the three separate studies came to the same major conclusion about signals in the chromosome 6p21 region, they did report some different results. The study conducted by the SGENE consortium also found significant association in the combined sample near the neurogranin gene (NRGN) on 11q24.2 and for an intronic SNP in transcription factor 4 (TCF4) on 18q21.2.

And in their paper, the International Schizophrenia Consortium (ISC) deployed a modeling approach to try to estimate the number of genes involved in the disease. Their analysis, they report, supports the polygenic model of the disease, whereby common variation in many hundreds or even thousands of genes contributes to the disease, and a significant number of these genes are shared with bipolar disorder.

Difficult birth
In October 2005 (on the very weekend SRF was launched!), the World Congress on Psychiatric Genetics in Boston saw a contentious series of exchanges between the proponents of putting most of the genetic funding eggs into a couple of large GWAS baskets, on the one hand, and others who suggested a combination approach that also utilized smaller, multifaceted studies of candidate genes. These latter veterans of psychiatric genetics argued that their approaches had already yielded a considerable number of strong candidate genes found by fine-mapping positional candidate genes under linkage peaks, and they raised concerns that large GWAS would wash out signals at work in smaller populations.

The large-scale GWAS approach was ultimately selected by funders around the world, and great—and rapid—results were anticipated by many in the schizophrenia community. Thus, there was disappointment when the first round of reports failed to deliver either a few genes of large effect or many confirmed genes of small to medium effect (see SRF related news story; SRF news story; SRF related news story).

The first apparent success of the GWAS era was the suggestion that copy number variations (CNVs), major disruptions in one or more genes in one or a few individuals, could account for schizophrenia in myriad different ways. However, not everyone agreed that this evidence was so clear, or that CNVs, if they were a factor, accounted for a substantial percentage of schizophrenia risk (see the lively discussion at SRF related news story; SRF news story; SRF related news story). Thus, the community was left anticipating the GWAS described in these papers, which looked at the contribution of common SNPs across the genome.

Combining datasets
In one of the papers in the current issue of Nature, the ISC, a multinational collaboration led by Pamela Sklar of the Broad Institute, Cambridge, Massachusetts, found no genes with genomewide significance in their sample of 3,322 cases and 3,587 controls of European origin. A similar result was obtained by the SGENE consortium (2,663 cases, 13,498 controls of European origin), led by Kari Stefansson of deCODE Genetics in Reykjavik, Iceland, and by the Molecular Genetics of Schizophrenia (MGS) study (European: 2,681 cases, 2,653 controls; African American: 1,286 cases, 973 controls), led by Pablo Gejman of NorthShore University HealthSystem and Northwestern University, Evanston, Illinois.

According to the papers, when the three groups combined the European samples of more than 8,000 cases and 19,000 controls, and applied varying methods of statistical analysis, their findings all converged on a swath at chromosome 6p21. Originally tagged as a region of interest more than 30 years ago (Smeraldi et al., 1976) and later supported by linkage studies (see Lewis et al., 2003), this stretch of DNA contains a host of major histocompatibility complex genes, coding for proteins involved in immune functions. This stretch also contains many genes with other functions (see Sanger Institute overview of chromosome 6).

The results on chromosome 6p21 might tantalize with the possibility of linking genetics to previous epidemiologic findings regarding schizophrenia and season of birth effects, autoimmune disease, and prenatal infection (see SRF related news story; SRF news story). However, as the Associated Press reported, Stefansson noted the presence of other, non-immune genes in this region, and warned, "It's guilt by association; it's not really a link."

Some different approaches and results
The ISC attempted to glean some information by combining the tens of thousands of markers that had even nominal (i.e., not statistically significant) association with the disease. With the understanding that this "polygenic score" would probably contain a vast majority of false positives, the statistics team led by Shaun Purcell of the Broad Institute nonetheless hoped it would allow them to make some observations about the overall common genetic landscape of the disorder. In particular, they wanted to test whether the data supported the polygenic theory that hundreds or even thousands of genes can influence the risk of schizophrenia, notably advanced by Gottesman and Shields (1967). According to the authors, the resulting modeling strongly supports "a polygenic basis to schizophrenia that 1) involves common SNPs, 2) explains at least one-third of the total variation in liability, 3) is substantially shared with bipolar disorder, and 4) is largely not shared with several non-psychiatric diseases."
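
For readers unfamiliar with the idea, a polygenic score of this general kind can be sketched in a few lines of Python: each person's count of a scored allele is weighted by the log odds ratio estimated in a separate discovery sample, and the weighted counts are added up. This is a simplified illustration, not the ISC's actual pipeline. The marker names, odds ratios, and genotypes are hypothetical values invented for the example.

```python
import math

# Hypothetical discovery-sample results: marker -> odds ratio for the scored
# allele. These values are invented for illustration only.
discovery_odds_ratios = {"snp_a": 1.10, "snp_b": 0.95, "snp_c": 1.05}


def polygenic_score(genotype, odds_ratios):
    """Sum log(odds ratio) * allele count (0, 1, or 2) over called markers."""
    total = 0.0
    for marker, odds_ratio in odds_ratios.items():
        count = genotype.get(marker)
        if count is None:  # skip markers with no genotype call
            continue
        total += math.log(odds_ratio) * count
    return total


# Hypothetical target-sample individuals (allele counts per marker).
person_1 = {"snp_a": 2, "snp_b": 0, "snp_c": 1}
person_2 = {"snp_a": 0, "snp_b": 2, "snp_c": 0}

score_1 = polygenic_score(person_1, discovery_odds_ratios)
score_2 = polygenic_score(person_2, discovery_odds_ratios)
```

In a real analysis the scores would be computed over many thousands of markers and then compared between cases and controls in an independent sample. Any single weight is likely to be noise, which is why the approach relies on the aggregate.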

The question of a genetic link between bipolar disorder and schizophrenia has been debated since the disease categories were created, and O'Donovan and colleagues at Cardiff University have applied a particular focus on this question (see SRF Live Discussion). Their research agenda was supported earlier this year by a large Swedish epidemiology study that left little doubt about the shared heritability of the disorders (see SRF related news story).

Outside the MHC regions, the SGENE group identified significant results in their meta-analyses for an SNP just upstream from the neurogranin gene (NRGN) on 11q24.2, coding for a synaptic protein, and an intronic SNP in transcription factor 4 (TCF4) on 18q21.2.

The MGS group's paper is notable in that it is the first report of an African American schizophrenia GWAS sample. Although none of the genes identified reached genomewide significance, it will be interesting to follow this line of research as researchers try to determine whether different SNPs and/or genes are at play in different populations.

Where to now?
If 8,000 subjects were not enough to identify more than a handful of markers indicating nearby schizophrenia risk loci (most of which were in a region already suspect), what would it take to find a substantial number of the many common variants presumably affecting disease risk? David Collier of the Institute of Psychiatry in London, and a member of the SGENE collaboration, told SRF at the WCSJ in London that he thought genomewide statistical significance might not begin to emerge until samples of 100,000 cases and more than 100,000 controls had been collected. He said that assembling such samples was a challenge, but not impossible, especially since groups like the Wellcome Trust are assembling large control samples to study a range of diseases.

Thus, the field is left with some important questions, beginning with, Is this the time for a course correction in regard to studying the role of common variants, i.e., to steer the ship away from massive GWAS samples and toward other approaches that incorporate endophenotypes (see SRF live discussion) or that attempt to tease apart the complex and varying phenotype of the disease itself (see SRF related news story)? Or is this the time for a steady hand on the GWAS ship's wheel—in for a penny, in for a pound, to mix some metaphors?

Furthermore, how much effort should be spent on probing the contribution of copy number variation, or on resequencing the many strong candidate genes already existing to find new common or rare variants? As Collier told SRF, GWAS of the type reported this week will explain only the risk due to common polymorphisms at the population level, which the ISC estimates at one-third or more of the contribution to the disease, though this number may be more or less. The remainder of genetic variation, said Collier, will come from some combination of CNVs and rare variants.

How best to probe gene effects that only emerge under the influence of variation in other genes (epistasis)? Are family studies going to yield more fruit than case-control studies? The schizophrenia community awaits a consensus response to these questions from the genetics community, and SRF invites readers to begin this discussion.—Hakon Heimer (with additional reporting by Peter Farley).

References:
The International Schizophrenia Consortium; Manuscript preparation, Purcell SM, Wray NR, Stone JL, Visscher PM, O'Donovan MC, Sullivan PF, Sklar P; Data analysis, Purcell SM (Leader), Stone JL; GWAS analysis subgroup, Sullivan PF, Ruderfer DM, McQuillin A, Morris DW, O'Dushlaine CT, Corvin A, Holmans PA, O'Donovan MC, Sklar P; Polygene analyses subgroup, Wray NR, Macgregor S, Sklar P, Sullivan PF, O'Donovan MC, Visscher PM; Management committee, Gurling H, Blackwood DH, Corvin A, Craddock NJ, Gill M, Hultman CM, Kirov GK, Lichtenstein P, McQuillin A, Muir WJ, O'Donovan MC, Owen MJ, Pato CN, Purcell SM, Scolnick EM, St Clair D, Stone JL, Sullivan PF, Sklar P (Leader); Cardiff University, O'Donovan MC, Kirov GK, Craddock NJ, Holmans PA, Williams NM, Georgieva L, Nikolov I, Norton N, Williams H, Toncheva D, Milanova V, Owen MJ; Karolinska Institutet/University of North Carolina at Chapel Hill, Hultman CM, Lichtenstein P, Thelander EF, Sullivan P; Trinity College Dublin, Morris DW, O'Dushlaine CT, Kenny E, Quinn EM, Gill M, Corvin A; University College London, McQuillin A, Choudhury K, Datta S, Pimm J, Thirumalai S, Puri V, Krasucki R, Lawrence J, Quested D, Bass N, Gurling H; University of Aberdeen, Crombie C, Fraser G, Leh Kuan S, Walker N, St Clair D; University of Edinburgh, Blackwood DH, Muir WJ, McGhee KA, Pickard B, Malloy P, Maclean AW, Van Beck M; Queensland Institute of Medical Research, Wray NR, Macgregor S, Visscher PM; University of Southern California, Pato MT, Medeiros H, Middleton F, Carvalho C, Morley C, Fanous A, Conti D, Knowles JA, Paz Ferreira C, Macedo A, Helena Azevedo M, Pato CN; Massachusetts General Hospital, Stone JL, Ruderfer DM, Kirby AN, Ferreira MA, Daly MJ, Purcell SM, Sklar P; Stanley Center for Psychiatric Research and Broad Institute of MIT and Harvard, Purcell SM, Stone JL, Chambert K, Ruderfer DM, Kuruvilla F, Gabriel SB, Ardlie K, Moran JL, Daly MJ, Scolnick EM, Sklar P.
Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. 2009 Jul 1. Abstract

Shi J, Levinson DF, Duan J, Sanders AR, Zheng Y, Pe'er I, Dudbridge F, Holmans PA, Whittemore AS, Mowry BJ, Olincy A, Amin F, Cloninger CR, Silverman JM, Buccola NG, Byerley WF, Black DW, Crowe RR, Oksenberg JR, Mirel DB, Kendler KS, Freedman R, Gejman PV. Common variants on chromosome 6p22.1 are associated with schizophrenia. Nature. 2009 Jul 1. Abstract

Stefansson H, Ophoff RA, Steinberg S, Andreassen OA, Cichon S, Rujescu D, Werge T, Pietiläinen OP, Mors O, Mortensen PB, Sigurdsson E, Gustafsson O, Nyegaard M, Tuulio-Henriksson A, Ingason A, Hansen T, Suvisaari J, Lonnqvist J, Paunio T, Børglum AD, Hartmann A, Fink-Jensen A, Nordentoft M, Hougaard D, Norgaard-Pedersen B, Böttcher Y, Olesen J, Breuer R, Möller HJ, Giegling I, Rasmussen HB, Timm S, Mattheisen M, Bitter I, Réthelyi JM, Magnusdottir BB, Sigmundsson T, Olason P, Masson G, Gulcher JR, Haraldsson M, Fossdal R, Thorgeirsson TE, Thorsteinsdottir U, Ruggeri M, Tosato S, Franke B, Strengman E, Kiemeney LA, Group, Melle I, Djurovic S, Abramova L, Kaleda V, Sanjuan J, de Frutos R, Bramon E, Vassos E, Fraser G, Ettinger U, Picchioni M, Walker N, Toulopoulou T, Need AC, Ge D, Lim Yoon J, Shianna KV, Freimer NB, Cantor RM, Murray R, Kong A, Golimbet V, Carracedo A, Arango C, Costas J, Jönsson EG, Terenius L, Agartz I, Petursson H, Nöthen MM, Rietschel M, Matthews PM, Muglia P, Peltonen L, St Clair D, Goldstein DB, Stefansson K, Collier DA; Genetic Risk and Outcome in Psychosis (GROUP), Kahn RS, Linszen DH, van Os J, Wiersma D, Bruggeman R, Cahn W, de Haan L, Krabbendam L, Myin-Germeys I. Common variants conferring risk of schizophrenia. Nature. 2009 Jul 1. Abstract

Comments on News and Primary Papers
Comment by: Todd Lencz, Anil Malhotra (SRF Advisor)
Submitted 3 July 2009
Posted 3 July 2009

The three companion papers published in Nature provide important new evidence for a role of the MHC complex and common variation across the genome in risk for schizophrenia. These studies have exploited the availability of comprehensive genotyping technologies, coupled with large cohorts of cases and controls, to identify candidate loci for disease susceptibility.

A notable feature of these papers is the clear willingness of each of the groups to share its data, and to provide overlapping presentations of each other's results. The combination of datasets permitted the statistical significance of the MHC findings to emerge, thereby increasing confidence in results. The implication that immune processes may interact with genetic risk to influence schizophrenia risk is consistent with several lines of evidence, including our own small GWAS (Lencz et al., 2007) implicating cytokine receptors in schizophrenia susceptibility.

Perhaps most intriguing is the finding from the International Schizophrenia Consortium demonstrating that a score test—combining information from many thousands of common variants—can reliably differentiate patients and controls across multiple psychiatric cohorts. These results indicate that hundreds, if not thousands, of genes of small effect may contribute to schizophrenia risk. Moreover, these same genes were shown to contribute to bipolar risk (but not risk for non-psychiatric disorders such as diabetes).

Much more work remains to be done in psychiatric genetics. While the score test accounted for about 3 percent of the observed case-control variance, statistical modeling suggested that common variation could explain as much as one-third or more of the total risk. Nevertheless, there remains a substantial proportion of genetic dark matter (unexplained variance), given the high heritability of a disorder such as schizophrenia. Complementary approaches are needed to further parse the source of the common genetic variance, as well as to identify rare yet highly penetrant mutations. Additional techniques, such as pharmacogenetic studies and endophenotypic research, will help to explicate the functionality and clinical significance of observed risk alleles.

Comment by: Daniel Weinberger, SRF Advisor
Submitted 3 July 2009
Posted 3 July 2009

The three Nature papers reporting GWAS results in a large sample of cases of schizophrenia and controls from around Western Europe and the U.S. are decidedly disappointing to those expecting this strategy to yield conclusive evidence of common variants predicting risk for schizophrenia. Why has this extensive and very costly effort not produced more impressive results? There are likely to be many explanations for this, involving the usual refrains about clinical and genetic heterogeneity, diagnostic imprecision, and technical limitations in the SNP chips. But the likely, more fundamental problem in psychiatric genetics involves the biologic complexity of the conditions themselves, which renders them especially poorly suited to the standard GWAS strategy. The GWA analytic model assumes fixed, predictable relationships between genetic risk and illness, but simple relationships between genetic risk and complex pathophysiological mechanisms are unlikely. Many biologic functions show non-linear relationships, and depending on the biologic context, more of a potential pathogenic factor can make things worse or it can make them better. Studies of complex phenotypes in model systems illustrate that individual gene effects depend upon non-linear interactions with other genes (Toma et al., 2002; Shao et al., 2008). Similar observations are beginning to emerge in human disorders, e.g., in risk for cancer (Lo et al., 2008) and depression (Pezawas et al., 2008).

The GWA approach also assumes that diagnosis represents a unitary biological entity, but most clinical diagnoses are syndromal and biologically heterogeneous, and this is especially true in psychiatric disorders. Type 2 diabetes is the clinical expression of changes in multiple physiologic processes, including in pancreatic function, in adipose cell function, as well as in eating behavior. Likewise, hypertension results from abnormalities in many biologic processes (e.g., vascular reactivity, kidney function, CNS control of blood pressure, metabolic factors, sodium regulation), and even a large effect on any specific process within a subset of individuals will seem small when measured in large unrelated samples (Newton-Cheh et al., 2009). In the case of the cognitive and emotional problems associated with psychiatric disorders, the biologic pathways to clinical manifestations are probably much more heterogeneous. While the results of GWAS in disorders like type 2 diabetes and hypertension have been more informative than the schizophrenia results so far, they, too, have been disappointing, considering all the fanfare that preceded them. But given the pathophysiologic realities of diabetes, hypertension, or psychiatric disorders, how could the effect of any common genetic variant acting on only one of the diverse pathophysiological mechanisms implicated in these disorders be anything other than small when measured in large pathophysiologically heterogeneous populations? Other approaches, e.g., family studies, studies of smaller but much better characterized samples, and studies of genetic interactions in these samples, will be necessary to understand the variable genetic architectures of such biologically complex and heterogeneous disorders.

References:

Toma DP, White KP, Hirsch J and Greenspan RJ: Identification of genes involved in Drosophila melanogaster geotaxis, a complex behavioral trait. Nature Genetics 2002; 31: 349-353. Abstract

Shao H, Burrage LC, Sinasac DS et al.: Genetic architecture of complex traits: Large phenotypic effects and pervasive epistasis. PNAS 2008; 105: 19910-19914. Abstract

Lo S-W, Chernoff H, Cong L, Ding Y, and Zheng T: Discovering interactions among BRCA1 and other candidate genes associated with sporadic breast cancer. PNAS 2008; 105: 12387-12392. Abstract

Pezawas L, Meyer-Lindenberg A, Goldman AL, et al.: Biologic epistasis between BDNF and SLC6A4 and implications for depression. Mol Psychiatry 2008;13:709-716. Abstract

Newton-Cheh C, Larson MG, Vasan RS: Association of common variants in NPPA and NPPB with circulating natriuretic peptides and blood pressure. Nat Gen 2009; 41: 348-353. Abstract

Comment by: Irving Gottesman, SRF Advisor
Submitted 3 July 2009
Posted 3 July 2009
  I recommend the Primary Papers

The synthesis and extraction of the essence of the 3 Nature papers by Heimer and Farley represents science reporting at its best. Completion of the task while the ink was still wet shows that SRF is indeed in good hands. Congratulations on being concise, even-handed, non-judgmental, and challenging under the pressure of time.

Comment by: Christopher Ross, Russell L. Margolis
Submitted 6 July 2009
Posted 6 July 2009

Schizophrenia Genetics: Glass Half Full?
While it may be disappointing that the GWAS described above did not identify more genes, they nevertheless represent a landmark in psychiatric genetics and suggest a dual approach for the future: continued large-scale genetic association studies along with alternative genetic approaches leading to the discovery of new genetic etiologies, and more functional investigations to identify pathways of pathogenesis—which may themselves suggest new etiologies.

The consistent identification of an association with the MHC locus reinforces (without proving, as pointed out in the SRF news story) long-standing interest in the involvement of infectious or immune factors in schizophrenia pathogenesis (Yolken and Torrey, 2008). Epidemiologic and neuropathological studies that include patients selected for the presence or absence of immunologic genetic risk variants could potentially clarify etiology; cell and mouse model studies could clarify pathogenesis (Ayhan et al., 2009). It is striking that a major genetic finding in schizophrenia serves to reinforce the concept of environmental risk factors.

The two specific genes identified by the SGENE consortium, NRGN and TCF4, offer intriguing new leads into schizophrenia. This should foster a number of further genetic and neurobiological studies. Deep resequencing (and CNV analysis) can detect rare causative mutations, as exemplified by TCF4 mutations leading to Pitt-Hopkins syndrome. Neurogranin already has clear connections to interesting signaling pathways related to glutamate transmission. A hope is that further studies of both gene products and their interactions will identify pathogenic pathways.

The ISC used common genetic variants en masse to generate a polygene score from discovery samples of patients; that score was able to predict case status in test populations. The success of this approach provides very strong evidence that a portion of schizophrenia risk status is attributable to common genetic variants acting in concert and that schizophrenia shares genetic factors with bipolar disorder, but not with other diseases. This analysis has multiple practical implications for the direction of research. First, since polygenic factors explain only a portion of the genetic risk, the search for other genetic factors, including rare mutations of major effect detectable by deep sequencing, CNVs, variations in tandem repeats (Bruce et al., 2009, in press), and other genomic lesions, takes on new importance. Second, a meaningful integration of polygenic factors in a way that facilitates understanding of schizophrenia pathogenesis and the discovery of therapeutic targets will require identification of relevant pathways. Examination of patient-derived material, such as neurons differentiated from induced pluripotent stem cells taken from well-characterized patient populations, may be of great value.

The remarkable overlap between the genetic factors of schizophrenia and bipolar disorder suggests the need for further and more inclusive clinical studies—not just of endophenotypes, but also of the phenotypes themselves, together, rather than in isolation (Potash and Bienvenu, 2009). For instance, it is only within the past few years that the importance of cognitive dysfunction in schizophrenia has been appreciated. Cognition in bipolar disorder is even less well studied.

How much is really known about the longitudinal course of both disorders? Do genetic factors predict disease outcome? It is only recently that studies have focused intensively on the early course of schizophrenia and its prodrome. Much more is still to be learned, and even less is known about bipolar disorder. In conjunction with this greater understanding of clinical phenotype, it will clearly be necessary to refine the approach to phenotype by establishing the biological framework for these diseases and by establishing biomarkers, such as disruption in white matter (Karlsgodt et al., 2009) or abnormalities in functional networks (Demirci et al., 2009), that cut across current nosological categories. In turn, longitudinal study of clinical, imaging, and functional outcomes of schizophrenia and bipolar disorders should facilitate both focused candidate genetic studies and GWAS of large populations.

References:

Yolken RH, Torrey EF. Are some cases of psychosis caused by microbial agents? A review of the evidence. Mol Psychiatry. 2008 May;13(5):470-9. Abstract

Ayhan Y, Sawa A, Ross CA, Pletnikov MV. Animal models of gene-environment interactions in schizophrenia. Behav Brain Res. 2009 Apr 18. Abstract

Potash JB, Bienvenu OJ. Neuropsychiatric disorders: Shared genetics of bipolar disorder and schizophrenia. Nat Rev Neurol. 2009 Jun;5(6):299-300. Abstract

Karlsgodt KH, Niendam TA, Bearden CE, Cannon TD. White matter integrity and prediction of social and role functioning in subjects at ultra-high risk for psychosis. Biol Psychiatry. 2009 May 6. Epub ahead of print. Abstract

Demirci O, Stevens MC, Andreasen NC, Michael A, Liu J, White T, Pearlson GD, Clark VP, Calhoun VD. Investigation of relationships between fMRI brain networks in the spectral domain using ICA and Granger causality reveals distinct differences between schizophrenia patients and healthy controls. Neuroimage. 2009 Jun;46(2):419-31. Abstract

Bruce HA, Sachs NA, Rudnicki DD, Lin SG, Willour VL, Cowell JK, Conroy J, McQuaid D, Rossi M, Gaile DP, Nowak NJ, Holmes SE, Sklar P, Ross CA, DeLisi LE, Margolis RL. Long tandem repeats as a form of genomic copy number variation: structure and length polymorphism of a chromosome 5p repeat in control and schizophrenia populations. Psychiatric Genetics, in press.

Comment by: David Collier
Submitted 6 July 2009
Posted 6 July 2009
  I recommend the Primary Papers

This report is unnecessarily negative, from my point of view. The three studies show not only that GWAS can identify susceptibility alleles for schizophrenia, but that the majority of risk comes from common variants of small effect. These can be found, but as in other complex traits and diseases, such as obesity and height, considerable power is needed, because effect sizes are small, meaning larger sample sizes. This approach works: there are now almost 60 variants influencing height (Hirschhorn et al., 2009; Soranzo et al., 2009; Sovio et al., 2009). Furthermore, the genes identified so far from traditional mapping, CNV analysis, and GWAS point to two biological pathways, the integrity of the synapse (neurexin 1, neurogranin, etc.) and the wnt/GSK3β signaling pathway (DISC1, TCF4, etc.), which is involved in functions such as neurogenesis in the brain. The identification of disease pathways for schizophrenia has major implications and should not be underestimated. It would be daft to lose nerve now and turn away from GWAS just as they are bearing fruit.

I would like to correct/expand on my comments to Peter Farley, to say that while statistical significance for some markers may be reached sooner, significance for many of the hundreds if not thousands of common schizophrenia susceptibility alleles of small effect might not emerge until samples of 100,000 cases and more than 100,000 controls have been collected. Another point is that organizations such as the Wellcome Trust are already assembling case samples for schizophrenia as well as control samples.

Also, I would like to clarify that I believe the remainder of genetic variation, after common variation has been taken into account, will come from some combination of rare CNVs, other rare variants such as SNPs and other types of genetic marker such as variable number of tandem repeats (VNTRs) and of course the much neglected contribution from gene-environment interactions, in which main genetic effects may be obscured.

References:

Hirschhorn JN, Lettre G. Progress in genome-wide association studies of human height. Horm Res. 2009 Apr 1;71 Suppl 2:5-13. Abstract

Soranzo N, Rivadeneira F, Chinappen-Horsley U, Malkina I, Richards JB, Hammond N, Stolk L, Nica A, Inouye M, Hofman A, Stephens J, Wheeler E, Arp P, Gwilliam R, Jhamai PM, Potter S, Chaney A, Ghori MJ, Ravindrarajah R, Ermakov S, Estrada K, Pols HA, Williams FM, McArdle WL, van Meurs JB, Loos RJ, Dermitzakis ET, Ahmadi KR, Hart DJ, Ouwehand WH, Wareham NJ, Barroso I, Sandhu MS, Strachan DP, Livshits G, Spector TD, Uitterlinden AG, Deloukas P. Meta-analysis of genome-wide scans for human adult stature identifies novel Loci and associations with measures of skeletal frame size. PLoS Genet. 2009 Apr 1;5(4):e1000445. Abstract

Sovio U, Bennett AJ, Millwood IY, Molitor J, O'Reilly PF, Timpson NJ, Kaakinen M, Laitinen J, Haukka J, Pillas D, Tzoulaki I, Molitor J, Hoggart C, Coin LJ, Whittaker J, Pouta A, Hartikainen AL, Freimer NB, Widen E, Peltonen L, Elliott P, McCarthy MI, Jarvelin MR. Genetic determinants of height growth assessed longitudinally from infancy to adulthood in the northern Finland birth cohort 1966. PLoS Genet. 2009 Mar 1;5(3):e1000409. Abstract

Comment by: Michael O'Donovan (SRF Advisor), Nick Craddock, Michael Owen (SRF Advisor)
Submitted 9 July 2009
Posted 9 July 2009

Some commentators in their reflections take a rather negative view on what has been achieved through the application of GWAS technology to schizophrenia and psychiatric disorders more generally. We strongly disagree with this position. Below, we give examples of a number of statements that can be made about the aetiology of schizophrenia and bipolar disorder that could not be made at high levels of confidence even two years ago that are based upon evidence deriving from the application of GWAS.

1. We know with confidence that the role of rare copy number variants in schizophrenia is not limited to 22q11DS (VCFS) (reviewed recently in O'Donovan et al., 2009). We do not yet know how large a contribution they make, but we know the identity of an increasing number of these. Most span multiple genes, so it may prove problematic, as it has in 22q11DS, to identify the relevant molecular mechanisms. However, for one locus, the CNVs are limited to a single gene: neurexin 1 (Kirov et al., 2008; Rujescu et al., 2009). Genetic findings are merely the start of the journey to a deeper biological understanding, but no doubt many neurobiological researchers have already embarked on that journey in respect of neurexin 1.

2. Although we have argued in this forum that some of the major pre-GWAS findings in schizophrenia very likely reflect true susceptibility genes (DTNBP1, NRG1, etc.), we now have at least four novel loci where the evidence is more definitive (ZNF804A, MHC, NRGN, TCF4) (O'Donovan et al., 2008a; ISC, 2009; Shi et al., 2009; Stefansson et al., 2009) and two novel loci (Ferreira et al., 2008) in bipolar disorder (ANK3 and CACNA1C), at least one of which (CACNA1C) additionally confers risk of schizophrenia (Green et al., 2009). This is obviously a small part of the picture, but it is certainly better than no picture at all. These findings also offer a much more secure foundation than the earlier findings upon which to build follow-up studies, for example, brain imaging and cognitive phenotypes (Esslinger et al., 2009), and even candidate gene studies. We would not regard the first convincing evidence that altered calcium channel function is a primary aetiological event in at least some forms of psychosis as a trivial gain in knowledge.

3. We can say with confidence that common alleles of small effect are abundant in schizophrenia, and that they contribute to a substantial part of the population risk (ISC, 2009). Identifying any one of these at stringent levels of statistical significance may be challenging in terms of sample sizes. As we have pointed out before, merging multiple datasets may indeed obscure some true associations because of sometimes unpredictable relationships between risk alleles and those assayed indirectly in GWAS (Moskvina and O'Donovan, 2007). Nevertheless, that many of the same alleles are overrepresented in multiple independent GWAS datasets from different countries (ISC, 2009) means that larger samples offer the prospect of identifying many more of these. This is not to say that large samples are the only approach; genetic heterogeneity may well underpin some aspects of clinical heterogeneity (Craddock et al., 2009a). However, with the exception of individual large pedigrees, it is not yet evident which type of clinical sample one should base a small-scale study on. It should also be self-evident that the analysis of multiple samples, each with a different phenotypic structure, will pose major problems in respect of multiple testing and subsequent replication. Moreover, ascertaining special samples that represent putative subtypes of the clinical (and endophenotypic) spectrum of psychosis will first require large samples to be carefully assessed and the relevant subjects extracted. Subsequently, downstream evaluation of specific genotype-phenotype relationships will require the remainder of the clinical population to be genotyped in a suitably powered way to show that those effects are specific to some clinical features of the disorder. Increasingly, it is ascertainment and assessment that dominate the cost of GWAS, so it is not clear this approach will achieve any economies.
We must also remember that after a GWAS study, there remains the opportunity to look in a controlled manner for relatively specific associations in the context of the heterogeneous clinical picture. For example we are aware of a number of papers in development that will exploit the sorts of multi-locus tests reported by the ISC to refine diagnostics, and no doubt many other applications of this will emerge in the next year or so.

Critics should bear in mind that the GWAS data are not just there for the headline genome-wide findings, but that the data will be available to mine for years to come. The findings reported to date are based on only the simplest analyses.

4. If it were the case that the thousands of SNPs of small effect were randomly distributed across biological systems, none being of more relevance to pathophysiology than another, identifying them would probably be a pointless endeavour. However, there is no reason to believe this will be the case. We have recently shown that in bipolar disorder, the GWAS signals are enriched in particular biological pathways (Holmans et al., 2009) and we also published strong evidence for a relatively selective involvement of the GABAergic system in schizoaffective disorder (Craddock et al., 2009b). We are aware of an as-yet unpublished independent sample with similar findings. We would not regard the first convincing evidence that altered GABA function is a primary aetiological event in at least some forms of psychosis as a trivial gain in knowledge.

Incidentally, it is a common misconception that the identification of risk alleles of small effect necessarily confers no useful insights into pathogenesis and possible drug targets. For example, common alleles in PPARG and KCNJ11 have been robustly shown to confer risk to Type 2 diabetes (T2D) but with odds ratios in the region of only 1.14 (of similar magnitude to those revealed by GWAS of schizophrenia). PPARG encodes the target for the thiazolidinedione class of drugs used to treat T2D. KCNJ11 encodes part of the target for another class of diabetes drug, the sulphonylureas (Prokopenko et al., 2008). Moreover, studies of novel associated variants identified in T2D GWAS in healthy, non-diabetic populations have demonstrated that for most, the primary effect on T2D susceptibility is mediated through deleterious effects on insulin secretion, rather than insulin action (Prokopenko et al., 2008). Further examples of insights into the biology of common diseases coming from the identification of loci of small effect are the implication of the complement system in age-related macular degeneration and autophagy in Crohn's disease (Hirschhorn, 2009). Already, efforts are under way to translate the new recognition of the role of autophagy in Crohn's disease into new therapeutic leads (Hirschhorn, 2009). Of course, many of the loci identified in GWAS implicate genes whose functions are as yet largely or completely unknown, and determining those functions is a prerequisite of translating those findings. Nevertheless, we believe that the greatest benefits that will accrue from the continued discovery of risk loci through GWAS will come from the assembly of that information into novel disease pathways leading to novel therapeutic targets.

5. We can say with confidence that bipolar disorder and schizophrenia substantially overlap, at least in terms of polygenic risk (ISC, 2009). As clinicians, we do not regard that knowledge as a trivial achievement.

6. We can say with confidence from studies of CNVs that schizophrenia and autism share at least some risk factors (O'Donovan et al., 2009). We believe that is also an important insight.

The above are major achievements in what we expect to be a long but accelerating process of picking apart the origins of schizophrenia and other psychotic disorders. We do not think that any other research discipline in psychiatry has done more to advance that knowledge in the past 100 years.

Like that of other common familial diseases, the genetics of schizophrenia and bipolar disorder is a mixed economy of common alleles of small effect and rare alleles of large and small effects, including CNVs. Those who are concerned at the cost of collecting large samples for GWAS must bear in mind that the robust identification of both types of mutation will require similarly large samples; we will just have to get used to that fact if we want to make progress. Collecting samples like this may be expensive, but as clinicians, we know those costs are trivial compared with the human and economic costs of psychotic disorders.

The question of phenotype definition is one which we have repeatedly addressed (Craddock et al., 2009a). Unquestionably, if we knew the true pathophysiological basis of these disorders, we could do better. The fact is that we don't. Given that, it must be extremely encouraging that despite the problems, risk loci can be robustly identified by GWAS using samples defined by current diagnostic criteria. Moreover, armed with GWAS data in these heterogeneous populations, additional risk genes can be identified through strategies aimed at refining the phenotype that are not constrained by the current dichotomous view of the functional psychoses. We have shown at least one way in which this might be achieved without imposing a further burden of multiple testing (Craddock et al., 2009b), and have little doubt that others will emerge. We agree that approaches to phenotyping that more directly index underlying pathophysiology are highly appealing, and will ultimately be necessary for understanding the mechanistic relationships between gene and disorder. However, the two cardinal assumptions upon which the use of endophenotypes is predicated for gene discovery are questionable. First, there is little good evidence that putative endophenotypes are substantially simpler genetically than exophenotypes (Flint and Munafò, 2007). Second, there is rarely good evidence that the current crop of popular putative endophenotypes lie on the disease pathway; indeed, there seems to be substantial pleiotropy in the genetics of complex traits, psychosis included (Prokopenko et al., 2008; O'Donovan et al., 2008b).

Finally, we reiterate that while only small parts of the heritability of any complex disorder have been accounted for, large-scale genetic approaches have been extremely successful in studies of non-psychiatric diseases (Manolio et al., 2008) and have led to substantial advances in our understanding of pathogenesis, even for diseases like Crohn's disease where there was already prior knowledge of pathogenesis from other research methods (Mathew, 2008).

Psychiatry starts from a situation in which there is no robust prior knowledge of pathogenesis for the major phenotypes. Recent findings suggest that mental illness may be the medical field that will actually benefit most over the coming years from application of these powerful molecular genetic technologies.

References:
Craddock N, O'Donovan MC, Owen MJ. (2009a) Psychosis Genetics: Modeling the Relationship between Schizophrenia, Bipolar Disorder, and Mixed (or "Schizoaffective") Psychoses. Schizophrenia Bulletin 35(3):482-490. Abstract

Craddock N, Jones L, Jones IR, Kirov G, Green EK, Grozeva D, Moskvina V, Nikolov I, Hamshere ML, Vukcevic D, Caesar S, Gordon-Smith K, Fraser C, Russell E, Norton N, Breen G, St Clair D, Collier DA, Young AH, Ferrier IN, Farmer A, McGuffin P, Holmans PA, Wellcome Trust Case Control Consortium (WTCCC), Donnelly P, Owen MJ, O'Donovan MC. Strong genetic evidence for a selective influence of GABAA receptors on a component of the bipolar disorder phenotype. Molecular Psychiatry advanced online publication 1 July 2008; doi:10.1038/mp.2008.66. (b) Abstract

Esslinger C, Walter H, Kirsch P, Erk S, Schnell K, Arnold C, Haddad L, Mier D, Opitz von Boberfeld C, Raab K, Witt SH, Rietschel M, Cichon S, Meyer-Lindenberg A. (2009) Neural mechanisms of a genome-wide supported psychosis variant. Science 324(5927):605. Abstract

Ferreira MAR, O'Donovan MC, Meng YA, Jones IR, Ruderfer DM, Jones L, Fan J, Kirov G, Perlis RH, Green EK, Smoller JW, Grozeva D, Stone J, Nikolov I, Chambert K, Hamshere ML, Nimgaonkar V, Moskvina V, Thase ME, Caesar S, Sachs GS, Franklin J, Gordon-Smith K, Ardlie KG, Gabriel SB, Fraser C, Blumenstiel B, Defelice M, Breen G, Gill M, Morris DW, Elkin A, Muir WJ, McGhee KA, Williamson R, MacIntyre DJ, McLean A, St Clair D, VanBeck M, Pereira A, Kandaswamy R, McQuillin A, Collier DA, Bass NJ, Young AH, Lawrence J, Ferrier IN, Anjorin A, Farmer A, Curtis D, Scolnick EM, McGuffin P, Daly MJ, Corvin AP, Holmans PA, Blackwood DH, Wellcome Trust Case Control Consortium (WTCCC), Gurling HM, Owen MJ, Purcell SM, Sklar P and Craddock NJ. (2008) Collaborative genome-wide association analysis of 10,596 individuals supports a role for Ankyrin-G (ANK3) and the alpha-1C subunit of the L-type voltage-gated calcium channel (CACNA1C) in bipolar disorder. Nature Genetics 40:1056-1058. Abstract

Flint J, Munafò MR. (2007) The endophenotype concept in psychiatric genetics. Psychological Medicine 37(2):163-180. Abstract

Green EK, Grozeva D, Jones I, Jones L, Kirov G, Caesar S, Gordon-Smith K, Fraser C, Forty L, Russell E, Hamshere ML, Moskvina V, Nikolov I, Farmer A, McGuffin P, Wellcome Trust Case Control Consortium, Holmans PA, Owen MJ, O'Donovan MC and Craddock N. (2009) Bipolar disorder risk allele at CACNA1C also confers risk to recurrent major depression and to schizophrenia. Molecular Psychiatry (in press).

Hirschhorn JN. (2009) Genomewide association studies--illuminating biologic pathways. New England Journal of Medicine 360(17):1699-1701. Abstract

Holmans P, Green E, Pahwa J, Ferreira M, Purcell S, Sklar P, Owen M, O'Donovan M, Craddock N. Gene ontology analysis of GWAS datasets provide insights into the biology of bipolar disorder. The American Journal of Human Genetics 2009 Jun 17 [Epub ahead of print].

International Schizophrenia Consortium. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 2009 Jul 1 [Epub ahead of print]. Abstract

Kirov G, Gumus D, Chen W, Norton N, Georgieva L, Sari M, O'Donovan MC, Erdogan F, Owen MJ, Ropers HH, Ullmann R. (2008) Comparative genome hybridization suggests a role for NRXN1 and APBA2 in schizophrenia. Human Molecular Genetics 17(3):458-465. Abstract

Manolio TA, Brooks LD, Collins FS. (2008) A HapMap harvest of insights into the genetics of common disease. Journal of Clinical Investigation 118(5):1590-1605. Abstract

Mathew CG. (2008) New links to the pathogenesis of Crohn disease provided by genome-wide association scans. Nature Reviews Genetics 9(1):9-14. Abstract

Moskvina V and O'Donovan MC. (2007) Detailed analysis of the relative power of direct and indirect association studies and the implications for their interpretation. Human Heredity 64(1):63-73. Abstract

O'Donovan MC, Craddock N, Norton N, Williams H, Peirce T, Moskvina V, Nikolov I, Hamshere M, Carroll L, Georgieva L, Dwyer S, Holmans P, Marchini JL, Spencer C, Howie B, Leung H-T, Hartmann AM, Möller H-J, Morris DW, Shi Y, Feng G, Hoffmann P, Propping P, Vasilescu C, Maier W, Rietschel M, Zammit S, Schumacher J, Quinn EM, Schulze TG, Williams NM, Giegling I, Iwata N, Ikeda M, Darvasi A, Shifman S, He L, Duan J, Sanders AR, Levinson DF, Gejman P, Molecular Genetics of Schizophrenia Collaboration, Cichon S, Nöthen MM, Gill M, Corvin A, Rujescu D, Kirov G, Owen MJ. (2008a) Identification of novel schizophrenia loci by genome-wide association and follow-up. Nature Genetics 40:1053-1055. Abstract

O'Donovan MC, Kirov G, Owen MJ. (2008b) Phenotypic variations on the theme of CNVs. Nature Genetics 40(12):1392-1393. Abstract

O'Donovan MC, Craddock N, Owen MJ. Genetics of psychosis; Insights from views across the genome. Human Genetics 2009 Jun 12 [Epub ahead of print]. Abstract

Prokopenko I, McCarthy MI, Lindgren CM. (2008) Type 2 diabetes: new genes, new understanding. Trends in Genetics 24(12):613-621. Abstract

Rujescu D, Ingason A, Cichon S, Pietiläinen OP, Barnes MR, Toulopoulou T, Picchioni M, Vassos E, Ettinger U, Bramon E, Murray R, Ruggeri M, Tosato S, Bonetto C, Steinberg S, Sigurdsson E, Sigmundsson T, Petursson H, Gylfason A, Olason PI, Hardarsson G, Jonsdottir GA, Gustafsson O, Fossdal R, Giegling I, Möller HJ, Hartmann AM, Hoffmann P, Crombie C, Fraser G, Walker N, Lonnqvist J, Suvisaari J, Tuulio-Henriksson A, Djurovic S, Melle I, Andreassen OA, Hansen T, Werge T, Kiemeney LA, Franke B, Veltman J, Buizer-Voskamp JE; GROUP Investigators, Sabatti C, Ophoff RA, Rietschel M, Nöthen MM, Stefansson K, Peltonen L, St Clair D, Stefansson H, Collier DA. (2009) Disruption of the neurexin 1 gene is associated with schizophrenia. Human Molecular Genetics 18(5):988-996. Abstract

Shi J, Levinson DF, Duan J, Sanders AR, Zheng Y, Pe'er I, Dudbridge F, Holmans PA, Whittemore AS, Mowry BJ, Olincy A, Amin F, Cloninger CR, Silverman JM, Buccola NG, Byerley WF, Black DW, Crowe RR, Oksenberg JR, Mirel DB, Kendler KS, Freedman R & Gejman PV. (2009) Common variants on chromosome 6p22.1 are associated with schizophrenia. Nature doi:10.1038/nature08192. Abstract

Stefansson H, Ophoff RA, Steinberg S, Andreassen OA, Cichon S, Rujescu D, Werge T, Pietiläinen OPH, Mors O, Mortensen PB, Sigurdsson E, Gustafsson O, Nyegaard M, Tuulio-Henriksson A, Ingason A, Hansen T, Suvisaari J, Lonnqvist J, Paunio T, Børglum AD, Hartmann A, Fink-Jensen A, Nordentoft M, Hougaard D, Norgaard-Pedersen B, Böttcher Y, Olesen J, Breuer R, Möller H-J, Giegling I, Rasmussen HB, Timm S, Mattheisen M, Bitter I, Réthelyi JM, Magnusdottir BB, Sigmundsson T, Olason P, Masson G, Gulcher JR, Haraldsson M, Fossdal R, Thorgeirsson TE, Thorsteinsdottir U, Ruggeri M, Tosato S, Franke B, Strengman E, Kiemeney LA, GROUP, Melle I, Djurovic S, Abramova L, Kaleda V, Sanjuan J, de Frutos R, Bramon E, Vassos E, Fraser G, Ettinger U, Picchioni M, Walker N, Toulopoulou T, Need AC, Ge D, Yoon JL, Shianna KV, Freimer NB, Cantor RM, Murray R, Kong A, Golimbet V, Carracedo A, Arango C, Costas J, Jönsson EG, Terenius L, Agartz I, Petursson H, Nöthen MM, Rietschel M, Matthews PM, Muglia P, Peltonen L, St Clair D, Goldstein DB, Stefansson K, Collier DA & Genetic Risk and Outcome in Psychosis (GROUP). (2009) Common variants conferring risk of schizophrenia. Nature doi:10.1038/nature08186. Abstract

Comment by: Kevin J. Mitchell
Submitted 9 July 2009
Posted 9 July 2009

GWAS Results: Is the Glass Half Full or 95 Percent Empty?
The publication of the latest schizophrenia GWAS papers represents the culmination of a tremendous amount of work and unprecedented cooperation among a large number of researchers, for which they should be applauded. In addition to the hope of finding new schizophrenia genes, GWAS have been described by some of the researchers involved as, more fundamentally, a stern test of the common variants hypothesis. Based on the meagre haul of common variants dredged up by these three studies and their forerunners, this hypothesis should clearly now be resoundingly rejected—at least in the form that suggests that there is a large, but not enormous, number of such variants, which individually have modest, but not minuscule, effects. There are no common variants of even modest effect.

However, Purcell and colleagues now argue for a model involving vast numbers of variants, each of almost negligible effect alone. The authors show that an aggregate score derived from the top 10-50 percent of a set of 74,000 SNPs from the association results in a discovery sample can predict up to 3 percent of the variance in a target group. Simply put, a set of putative risk alleles can be defined in one sample and shown, collectively, to be very slightly (though highly significantly in a statistical sense) enriched in the test sample, compared to controls. This is consistent across several different schizophrenia samples and even in two bipolar disorder samples. The authors go on to perform a set of control analyses that suggest that these results are not due to obvious population stratification or genotype rate effects (although effects at this level are obviously prone to cryptic artifacts).
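The aggregate-score procedure described above can be illustrated with a minimal sketch. It is not the ISC's actual pipeline: the SNP names, odds ratios, p-values, genotypes, and the 0.5 p-value cutoff are all hypothetical toy values, and the functions `select_snps` and `polygenic_score` are names invented for this illustration. The idea is only that alleles are selected and weighted in a discovery sample, then summed per person in an independent target sample.

```python
import math

def select_snps(discovery_results, p_threshold):
    """Keep SNPs whose discovery-sample p-value is below the threshold.

    discovery_results maps SNP id -> (odds_ratio, p_value).
    Returns SNP id -> log odds ratio, which is used as the weight.
    """
    return {
        snp: math.log(odds_ratio)
        for snp, (odds_ratio, p_value) in discovery_results.items()
        if p_value < p_threshold
    }

def polygenic_score(genotype, weights):
    """Mean weighted allele count for one person in the target sample.

    genotype maps SNP id -> copies (0, 1 or 2) of the allele that the
    discovery odds ratio refers to.
    """
    scored = [w * genotype[snp] for snp, w in weights.items() if snp in genotype]
    return sum(scored) / len(scored) if scored else 0.0

# Hypothetical discovery-sample results: (odds ratio, p-value) per SNP.
discovery = {
    "snp1": (1.10, 0.04),
    "snp2": (0.92, 0.30),
    "snp3": (1.05, 0.60),  # fails the cutoff below, so it is not scored
    "snp4": (1.20, 0.01),
}
weights = select_snps(discovery, p_threshold=0.5)

# Hypothetical target-sample genotypes for one case and one control.
case = {"snp1": 2, "snp2": 0, "snp3": 1, "snp4": 1}
control = {"snp1": 0, "snp2": 2, "snp3": 1, "snp4": 0}

print(polygenic_score(case, weights), polygenic_score(control, weights))
```

In a real analysis the scores of many cases and controls would be compared statistically; a liberal cutoff admits many SNPs with no true effect, which is the dilution issue taken up in the next paragraph.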

If taken at face value, what do these results mean? They imply some kind of polygenic effect on risk, but of what magnitude? The answer to that depends on the interpretation of the additional simulations performed by the authors. They argue that the risk allele set inevitably contains very many false positives, which dilute the predictive power of the real positives hidden among them. Based on this logic, if we only knew which were the real variants to look at, then the variance explained in the target group would be much greater.

To try and estimate the magnitude of the effect of the polygenic load of true risk alleles, the authors conducted a series of simulations, varying parameters such as allele frequencies, genotype relative risks, and linkage disequilibrium with genotyped markers. They claim that these analyses converge on a set of models that recapitulate the observed data and that all converge on a true level of variance explained of around 34 percent, demonstrating a large polygenic component to the genetic architecture of schizophrenia.

These simulations adopt a level of statistical abstraction that should induce a healthy level of skepticism or at least reserved judgment on their findings. Most fundamentally, they rely explicitly for their calculations of the true variance on a liability-threshold model of the genetic architecture of schizophrenia. In effect, the test of the model incorporates the assumption that the model is correct.

The liability-threshold model is an elegant statistical abstraction that allows the application of the powerful statistics of normal distributions. Unfortunately, it suffers from the fact that it has no support whatsoever and makes no biological sense. First, there is no justification for assuming a normal distribution of underlying liability, whatever that term is taken to mean. Second, as usual when it is invoked, the nature of this putative threshold is not explained, though it surreptitiously implies some form of very strong epistasis (to explain the difference in risk between someone with x liability alleles and someone else with x+1 alleles). If this model is not correct, then these simulations are fatally flawed.
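For readers unfamiliar with the model under discussion, the following is a minimal sketch of its textbook form, not of the simulations in the paper. It assumes total liability is standard normal, that a person is affected when liability exceeds a threshold set by population prevalence, and that liability splits into a genetic part and an independent normal residual. The prevalence of 1 percent and heritability of 0.8 are illustrative assumptions, and the function names are invented here.

```python
from statistics import NormalDist

def liability_threshold(prevalence):
    """Threshold on a standard normal liability that leaves `prevalence`
    of the population above it."""
    return NormalDist().inv_cdf(1.0 - prevalence)

def risk_given_genetic_liability(g, heritability, prevalence):
    """Probability of exceeding the threshold for a person whose genetic
    liability is g (in population SD units), assuming the non-genetic
    residual is normal with variance 1 - heritability."""
    t = liability_threshold(prevalence)
    residual_sd = (1.0 - heritability) ** 0.5
    return 1.0 - NormalDist(mu=g, sigma=residual_sd).cdf(t)

# Illustrative assumptions only: 1% prevalence, heritability 0.8.
for g in (0.0, 1.0, 2.0):
    print(g, risk_given_genetic_liability(g, heritability=0.8, prevalence=0.01))
```

The sketch makes the model's assumptions explicit: both the normal distribution and the threshold are inputs, not results, which is the point at issue in the paragraph above.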

Even if the model were correct, the calculations are far from convincing. From a starting set of 560 models, the authors arrive at seven that are consistent with the observed degree of prediction in the target samples. According to the authors, the fact that these seven models converge on a small range of values for the underlying variance explained by the markers is evidence that this value (around 34 percent) represents the true situation. What is not highlighted is the fact that the values for the actual additive genetic variance (taking into account incomplete linkage disequilibrium between the markers and the assumed causal variants) across these models range from 34 percent to 98 percent and that the number of SNPs assumed to be having an effect ranges from 4,625 to 74,062. This extreme variation in the derived models hardly inspires confidence in the authors' claim that their data "strongly support a polygenic basis to schizophrenia that (1) involves common SNPs, [and] (2) explains at least one-third of the total variation in liability." (italics added)

From a more theoretical perspective, it should be noted that a polygenic model involving thousands of common variants of tiny effect cannot explain and will not contribute to the observed heightened familial relative risks. Such risk can only be explained by a variant of large effect or by an oligogenic model involving at most two to three loci (Bodmer and Bonilla, 2008; Hemminki et al., 2008; Mitchell and Porteous, in preparation). It seems much more likely that the observed predictive power in the target samples represents a modest genetic background effect, which could influence the penetrance and expressivity of rare, causal mutations. However, if the point of GWAS is to find genetic variants that are predictive of risk or that shed light on the pathogenic mechanisms of the disease, then clearly, even if such variants can be found by massively increasing sample sizes, their identification alone would not achieve or even appreciably contribute to either of these goals.

References:

Hemminki K, Försti A, Bermejo JL. The common disease-common variant hypothesis and familial risks. PLoS ONE. 2008 Jun 18;3(6):e2504. Abstract

Bodmer W, Bonilla C. Common and rare variants in multifactorial susceptibility to common diseases. Nat Genet. 2008 Jun;40(6):695-701. Abstract

Comment by: David J. Porteous, SRF Advisor
Submitted 9 July 2009
Posted 10 July 2009
  I recommend the Primary Papers

Thumbs up or down on schizophrenia GWAS?
The triumvirate of schizophrenia GWAS studies just published in Nature gives cause for thought, and bears close scrutiny and reflection. To my reading, these three studies individually and collectively lead to an unambiguous conclusion—there is a lot of genetic heterogeneity and not one individual variant of common ancient origin accounts for a significant fraction of the genetic liability. To put it another way, there is no ApoE equivalent for schizophrenia. Strong past claims for ZNF804A and others look to have fallen by the statistical wayside. Putting the results of all three studies together does appear to provide support for a long known, pre-GWAS association with HLA, but otherwise it is hard to give a strong "thumbs up" to any specific result, not least because of the lack of replication between studies. The results are nevertheless important because the common disease, common variant model, on which GWAS are based and the associated cost justified, is strongly rejected as the main contributor to the genetic variance.

The ISC proposes a highly polygenic model with thousands of variants having an additive effect on both schizophrenia and bipolar disorder. I find no fault with their evidence, but its meaning and interpretation remain speculative. Simply consider the fact that SNPs carefully selected to tag half the genome account for about a third of the variance. It follows that the lion's share has gone undetected and will, by design and limitation, remain impervious to the GWAS strategy.

Part of the GWAS appeal is that the genotyping is technically facile and it is easier to collect lots of cases than it is families, but for as long as a diagnosis of schizophrenia or BP depends upon DSM-IV or ICD-10 classification, diagnostic uncertainty will have a major effect on the true power and validity of statistical association, whether positive or negative. Indeed, the longstanding evidence from variable psychopathology amongst related individuals, the recent epidemiological evidence for shared genetic risk for schizophrenia and BP, and the further evidence supporting this from the ISC GWAS, all suggest that we should be returning more to family-based studies as a strategy to reduce genetic heterogeneity and find explanatory genetic variants. Plainly, adding ever more uncertainty through ever larger sample sizes is neither smart nor efficient.

I would certainly give the thumbs up to the full and unencumbered release of the primary data to the community as a whole, as this could usefully recoup some of the GWAS investment. It would facilitate a range of statistical and bioinformatics analyses and, who knows, there might be hidden nuggets of statistical support for independent genetic and biological studies.

Comment by: Sagiv Shifman
Submitted 11 July 2009
Posted 11 July 2009

The main question that arises from the three large genomewide association studies published in Nature is, What should we do next?

One important way forward would be to follow up the association findings in the MHC region. We need to understand the biological mechanism underlying this association. If the association signal is indeed related to infectious diseases, this line of inquiry may lead to the highly desired development of a treatment that might prevent the disease in some cases.

One possible explanation for the association between schizophrenia and the MHC region (6p22.1) is that infection during pregnancy leads to disturbances of fetal brain development and increases the risk of schizophrenia later in life. A possible test for the theory of infectious diseases as risk factors for schizophrenia would be to study the associated SNPs in 6p22.1 in fathers and mothers of subjects with schizophrenia relative to parents of control subjects. If the 6p22.1 region is related to the tendency of mothers to be infected by viruses during pregnancy, we would expect the SNPs in 6p22.1 to be most strongly associated with being a mother to a subject with schizophrenia.

Another, broader and more complicated part of the question is: What would be the best strategy for continued study of the genetic causes of schizophrenia? There shouldn't be only one way to proceed. Testing samples that are 10 times larger seems likely to lead to the identification of more genes, but with much smaller effect size. Testing the association of common variants with schizophrenia is unlikely to lead to the development of genetic diagnostic tools in the near future. If we want to understand the biology of the disease, it might be easier to concentrate our efforts on the identification of rare inherited and non-inherited variants with large effect on the phenotype. Such rare variants are easier to model in animals (relative to common variants with very small functional effect) and might even account for a larger proportion of cases.

Comment by: Alan Brown, Paul Patterson
Submitted 17 July 2009
Posted 17 July 2009

The three companion papers in this week's issue of Nature, in our view, support the case for investigating interaction between susceptibility genes and infectious exposures in schizophrenia. We and others have argued previously that genetic studies conducted in isolation from environmental factors, and studies of environmental influences in the absence of genetic data, are necessarily limited. Maternal influenza, rubella, toxoplasmosis, herpes simplex virus, and other infections have each been associated with an increased risk of schizophrenia, with effect sizes ranging from twofold to over fivefold. While these epidemiologic findings clearly require replication in independent cohorts, two new developments provide further support for the hypothesis. First, a growing number of animal studies of maternal immune activation have documented behavioral and brain phenotypes in offspring that are analogous to findings from clinical research in schizophrenia, and these findings are mediated in large part by specific cytokines (Meyer et al., 2009; Patterson, 2008). Second, recent evidence indicates that maternal infection is also related to deficits in executive and other cognitive functions and neuropathology thought to arise from disruptions in brain development (Brown et al., 2009a; Brown et al., 2009b).

While the MHC region contains genes not involved in the immune system, in light of the epidemiologic findings on maternal infection, it is intriguing to see that this region is once more implicated in genetic studies of schizophrenia as the importance of this region in the response to infectious insults cannot be ignored. Although it is heartening to see that the potential implications of these findings for infectious etiologies were raised in the article from the SGENE plus group, an analysis of the frequency of SNPs by season of birth falls well short of the type of research that will yield definitive findings on the relationships between susceptibility genes and infectious insults. Hence, we advocate a strategy aimed at large scale genetic analyses of schizophrenia cases using birth cohorts with infectious exposures documented from prospectively collected biological samples from the prenatal period. If the schizophrenia-related pathogenic mechanisms by which MHC-related genetic variants operate involve interactions with prenatal infection, we would expect that studies of gene-infection interaction will yield larger effect sizes than those found in these new papers. The evidence from these papers and the epidemiologic literature should also facilitate narrowing of the number of candidate genes to be tested for interactions with infectious insults, thereby ameliorating the potential for type I error due to multiple comparisons.

References:

Meyer U, Feldon J, Fatemi SH. In-vivo rodent models for the experimental investigation of prenatal immune activation effects in neurodevelopmental brain disorders. Neurosci Biobehav Rev. 2009 Jul 1;33(7):1061-79. Abstract

Patterson PH. Immune involvement in schizophrenia and autism: Etiology, pathology and animal models. Behav Brain Res. 2008 Dec 24; Abstract

Brown AS, Vinogradov S, Kremen WS, Poole JH, Deicken RF, Penner JD, McKeague IW, Kochetkova A, Kern D, Schaefer CA. Prenatal exposure to maternal infection and executive dysfunction in adult schizophrenia. Am J Psychiatry. 2009a Jun 1;166(6):683-90. Abstract

Brown AS, Deicken RF, Vinogradov S, Kremen WS, Poole JH, Penner JD, Kochetkova A, Kern D, Schaefer CA. Prenatal infection and cavum septum pellucidum in adult schizophrenia. Schizophr Res. 2009b Mar 1;108(1-3):285-7. Abstract

View all comments by Alan Brown
View all comments by Paul Patterson

Comment by:  Javier Costas
Submitted 17 July 2009
Posted 17 July 2009
  I recommend the Primary Papers

Two hundred years after Darwin's birth and 150 years after the publication of On the Origin of Species, these three papers in Nature show the important role of natural selection in shaping the genetic architecture of schizophrenia susceptibility. If we compare the GWAS results for schizophrenia with those obtained for other diseases, it seems that there are fewer common risk alleles and/or lower effect sizes in schizophrenia than in many other complex diseases (see, for instance, the online catalog of published GWAS at NHGRI). This fact strongly suggests that negative selection limits the spread of susceptibility alleles, as expected due to the decreased fertility of schizophrenic patients.

Interestingly, the MHC region may be an exception. This region represents a classical example of balancing selection, i.e., the presence of several variants at a locus maintained in a population by positive natural selection (Hughes and Nei, 1988). In the case of the MHC, this balancing selection seems to be related to pathogen resistance or MHC-dependent mating choice. Therefore, the presence of common schizophrenia susceptibility alleles at this locus might be explained by antagonistic pleiotropic effects of alleles maintained by natural selection.

If negative selection limits the spread of schizophrenia risk alleles, most of the genetic susceptibility to schizophrenia is likely due to rare variants. Resequencing technologies will allow the identification of many of these variants in the near future. In the meantime, it would be interesting to focus our attention on non-synonymous SNPs at low frequency. Based on human-chimpanzee comparisons and human sequencing data, Kryukov et al. (2007) have shown that a large fraction of de novo missense mutations are mildly deleterious (i.e., they are subject to weak negative selection) and therefore they can still reach detectable frequencies. Assuming that most of these mildly deleterious alleles may be detrimental (i.e., they confer risk for disease), the authors conclude that numerous rare functional SNPs may be major contributors to susceptibility to common diseases (Kryukov et al., 2007). Similar conclusions were obtained by the analysis of the relative frequency distribution of non-synonymous SNPs depending on their probability to alter protein function (Barreiro et al., 2008; Gorlov et al., 2008). As shown by Evans et al. (2008), genomewide scans of non-synonymous SNPs might complement GWAS, being able to identify rare non-synonymous variants of intermediate penetrance not detectable by current GWAS panels.

References:

Barreiro LB, Laval G, Quach H, Patin E, Quintana-Murci L (2008) Natural selection has driven population differentiation in modern humans. Nat Genet 40: 340-5. Abstract

Evans DM, Barrett JC, Cardon LR (2008) To what extent do scans of non-synonymous SNPs complement denser genome-wide association studies? Eur J Hum Genet 16: 718-23. Abstract

Gorlov IP, Gorlova OY, Sunyaev SR, Spitz MR, Amos CI (2008) Shifting paradigm of association studies: value of rare single-nucleotide polymorphisms. Am J Hum Genet 82: 100-12. Abstract

Hughes AL, Nei M (1988) Pattern of nucleotide substitution at major histocompatibility complex class I loci reveals overdominant selection. Nature 335: 167-70. Abstract

Kryukov GV, Pennacchio LA, Sunyaev SR (2007) Most rare missense alleles are deleterious in humans: implications for complex disease and association studies. Am J Hum Genet 80: 727-39. Abstract

View all comments by Javier Costas

Comments on Related News


Related News: Schizophrenia, Autoimmune Diseases Linked in Danish Population

Comment by:  Keith Parker
Submitted 22 March 2006
Posted 22 March 2006
  I recommend the Primary Papers

Related News: Schizophrenia, Autoimmune Diseases Linked in Danish Population

Comment by:  Patricia Estani
Submitted 26 March 2006
Posted 26 March 2006
  I recommend the Primary Papers

Related News: A Possible Protective Role for Type 1 Diabetes in Schizophrenia

Comment by:  Jürgen Zielasek
Submitted 20 August 2007
Posted 20 August 2007
  I recommend the Primary Papers

This is an interesting epidemiological finding. A reproduction of this association in other populations would be needed. Of note, one of the autoantigens of type I diabetes (GAD, glutamic acid decarboxylase) is expressed in the nervous system and autoantibodies against GAD lead to a rare neurological disorder, Stiff-Person-Syndrome. Immunologically, type I diabetes is very complex, involving cellular and humoral immune effector mechanisms. How this may protect against the development of schizophrenia is not easily explained.

View all comments by Jürgen Zielasek

Related News: WCPG 2007—Schizophrenia, Bipolar GWA Results Prompt Calls for Bigger Samples

Comment by:  William Carpenter, SRF Advisor (Disclosure)
Submitted 7 November 2007
Posted 8 November 2007

Terrific update and summary for those of us not attending the meeting.

View all comments by William Carpenter

Related News: Copy Number Variations in Schizophrenia: Rare But Powerful?

Comment by:  Daniel Weinberger, SRF Advisor
Submitted 27 March 2008
Posted 27 March 2008

The paper by Walsh et al. is an important addition to the expanding literature on copy number variations in the human genome and their potential role in causing neuropsychiatric disorders. It is clear that copy number variations are important aspects of human genetic variation and that deletions and duplications in diverse genes throughout the genome are likely to affect the function of these genes and possibly the development and function of the human brain. So-called private variations, such as those described in this paper, i.e., changes in the genome found in only a single individual, as all of these variations are, are difficult to establish as pathogenic factors, because it is hard to know how much they contribute to the complex problem of human behavioral variation in a single individual. If the change is private, i.e., only in one case and not enriched in cases as a group, as are common genetic polymorphisms such as SNPs, how much they account for case status is very difficult to prove.

An assumption implicit in this paper is that these private variations may be major factors in the case status of the individuals who have them. The data of this paper suggest, however, this is actually not the case, at least for the childhood onset cases. Here's why: mentioned in the paper is a statement that only two of the CNVs in the childhood cases are de novo, i.e., spontaneous and not inherited (and one of these is on the Y chromosome, making its functional implications obscure). If most of the CNVs are inherited, they can't be causing illness per se as major effect players because they are coming from well parents.

Also, if you add up all CNVs in transmitted and non-transmitted chromosomes of the parents, it's something like 31 gene-based CNVs in 154 parents (i.e., 20 percent of the parents have a gene-based deletion or duplication in the very illness-related pathways that are highlighted in the cases), which is at least as high a frequency as in the adult-onset schizophrenia sample in this study, and three times the frequency found in the adult controls. This is not to say that such variants might not represent susceptibility genetic factors, or show variable penetrance between individuals, like other polymorphisms, and contribute to the complex genetic risk architecture, like other genetic variations that have been more consistently associated with schizophrenia. However, the CNV literature has tended to seek a more major effect connotation to the findings.

View all comments by Daniel Weinberger

Related News: Copy Number Variations in Schizophrenia: Rare But Powerful?

Comment by:  William Honer
Submitted 28 March 2008
Posted 28 March 2008
  I recommend the Primary Papers

As new technologies are applied to understanding the etiology and pathophysiology of schizophrenia, considering the clinical features of the cases studied and the implications of the findings is of value. The conclusion of the Walsh et al. paper, "these results suggest that schizophrenia can be caused by rare mutations," is worth considering carefully.

What evidence is needed to link an observation in the laboratory or clinic to cause? Recent recommendations for the content of papers in epidemiology (von Elm et al., 2008) remind us of the suggestions of A.V. Hill (Hill, 1965). To discern the implications of a finding, or association, for causality, Hill suggests assessment of the following:

1. Strength of the association: this is not the observed p-value, but a measure of the magnitude of the association. In the Walsh et al. study, the primary outcome measure, structural variants duplicating or deleting genes, was observed in 15 percent of cases and 5 percent of controls. But what is the association with? The diagnostic entity of schizophrenia, or some risk factor for the illness? Of interest, and noted in the Supporting Online Material, these variants were present in 7/15 (47 percent) of the cases with presumed IQ <80, but only 15/135 (11 percent) of the cases with IQ >80. Are the structural variants more strongly associated with mental retardation (within schizophrenia 47 percent vs. 11 percent) than with diagnosis (11 percent vs. 5 percent of controls, assuming normal IQ)? This is of particular interest in the context of the speculation in the paper concerning the importance of genes putatively involved with brain development in the etiology of schizophrenia.

2. Consistency of results in the literature across studies and research groups: there are now several papers examining copy number variation in schizophrenia, including a report from our group (Wilson et al., 2006). The authors of the present paper state that each variant observed was unique, and so consistency of the specific findings could be argued to be irrelevant, if the model is of novel mutations (more on models below). Undoubtedly, future meta-analyses and accumulating databases will help determine if there is anything consistent in the findings, other than a higher frequency of any abnormalities in cases rather than controls.

3. Specificity of the findings to the illness in question: this was not addressed experimentally in the paper. However, the findings of more abnormalities in the putative low IQ cases, and the similarity of the findings to reports in autism and mental retardation, suggest that this criterion for supporting causality is unlikely to be met.

4. Temporality: the abnormalities should precede the illness. If DNA from terminally differentiated neurons harbors the same variants as DNA from constantly renewed populations of lymphocytes, then clearly this condition is met. While it seems highly likely that this is the case, it is worthwhile considering the possibility that DNA structure may vary between tissue types, or between cell populations. Even within human brain there is some evidence for chromosomal heterogeneity (Rehen et al., 2005).

5. Biological gradient: presence of a dose-response curve strengthens the likelihood of a causal relationship. This condition is not met within cases: only 1/115 appeared to have more than one variant. However, in the presumably more severe childhood onset form of schizophrenia, four individuals carried multiple variants, and a higher prevalence of variants was observed overall. Still, the question of what the observations of CNV are associated with is relevant, since one of the inclusion/exclusion criteria for COS allowed IQ 65-80, and it is uncertain how many of these cases had some degree of intellectual deficit.

6. Plausibility: biological likelihood—quite difficult to satisfy as a criterion, in the context of the limits of knowledge concerning the mechanisms of illness of schizophrenia.

7. Coherence of the observation with known facts about the illness: the genetic basis of schizophrenia is quite well studied, and there is no dearth of theories concerning genetic architecture. However, a coherent model remains lacking. As examples, the suggestion is made that the observations concerning inherited CNVs in the COS cases are linked with a severe family history in this type of illness. This appears inconsistent with a high penetrance model for CNVs as suggested in the opening of the paper (presuming the parents in COS families are unaffected, as would seem likely). Elsewhere, CNVs are proposed by the authors to be related to de novo events, and an interaction with an environmental modifier, folate (and exposure to famine), is posited (McClellan et al., 2006). A model of the effects of CNVs, which generates falsifiable hypotheses is needed.

8. Experiment: the ability to intervene clinically to modify the effects of CNVs disrupting genes seems many years away.

9. Analogy: the novelty of the CNV findings is intriguing, but it limits our ability to judge the likelihood of causal relationships.
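
Hill's first criterion, the magnitude of an association rather than its p-value, can be made concrete with the counts quoted in point 1 above. What follows is only an illustrative sketch in Python: the function names are hypothetical, and the calculation simply restates the quoted counts (7/15 and 15/135) as proportions, a risk ratio, and an odds ratio. It is not an analysis taken from the Walsh et al. paper.

```python
def proportion(carriers, total):
    """Fraction of a group that carries at least one structural variant."""
    return carriers / total


def risk_ratio(a, n1, b, n2):
    """Ratio of carrier proportions: group 1 (a of n1) over group 2 (b of n2)."""
    return proportion(a, n1) / proportion(b, n2)


def odds_ratio(a, n1, b, n2):
    """Ratio of carrier odds: group 1 (a of n1) over group 2 (b of n2)."""
    return (a / (n1 - a)) / (b / (n2 - b))


# Counts quoted in the comment: (carriers, group size) within the cases.
low_iq = (7, 15)       # cases with presumed IQ < 80
higher_iq = (15, 135)  # cases with IQ > 80

print(f"carriers, IQ < 80: {proportion(*low_iq):.0%}")
print(f"carriers, IQ > 80: {proportion(*higher_iq):.0%}")
print(f"risk ratio: {risk_ratio(*low_iq, *higher_iq):.1f}")
print(f"odds ratio: {odds_ratio(*low_iq, *higher_iq):.1f}")
```

With groups this small, any such ratio carries wide uncertainty, so it is best read as a description of the quoted counts and not as an estimate of a population effect.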

The intersection of clinical realities and novel laboratory technologies will fuel the need for better translational research in schizophrenia for many, many more years.

References:

von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. J Clin Epidemiol. 2008 Apr 1;61(4):344-349. Abstract

HILL AB. THE ENVIRONMENT AND DISEASE: ASSOCIATION OR CAUSATION? Proc R Soc Med. 1965 May 1;58():295-300. Abstract

Wilson GM, Flibotte S, Chopra V, Melnyk BL, Honer WG, Holt RA. DNA copy-number analysis in bipolar disorder and schizophrenia reveals aberrations in genes involved in glutamate signaling. Hum Mol Genet. 2006 Mar 1;15(5):743-9. Abstract

Rehen SK, Yung YC, McCreight MP, Kaushal D, Yang AH, Almeida BSV, Kingsbury MA, Cabral KMS, McConnell MJ, Anliker B, Fontanoz M, Chun J: Constitutional aneuploidy in the normal human brain. J Neurosci 2005; 25:2176-2180. Abstract

McClellan JM, Susser E, King M-C: Maternal famine, de novo mutations, and schizophrenia. JAMA 2006; 296:582-584. Abstract

View all comments by William Honer

Related News: Copy Number Variations in Schizophrenia: Rare But Powerful?

Comment by:  Todd Lencz, Anil Malhotra (SRF Advisor)
Submitted 30 March 2008
Posted 30 March 2008

The new study by Walsh et al. (2008), as well as recent data from other groups working in schizophrenia, autism, and mental retardation, make a strong case for including copy number variants as an important source of risk for neurodevelopmental phenotypes. These findings raise several intriguing new questions for future research, including: the degree of causality/penetrance that can be attributed to individual CNVs; diagnostic specificity; and recency of their origins. While these questions are difficult to address in the context of private mutations, one potential source of additional information is the examination of common, recurrent CNVs, which have not yet been systematically studied as potential risk factors for schizophrenia.

Still, the association of rare CNVs with schizophrenia provides additional evidence that genetic transmission patterns may be a complex hybrid of common, low-penetrant alleles and rare, highly penetrant variants. In diseases ranging from Parkinson's to colon cancer, the literature demonstrates that rare penetrant loci are frequently embedded within an otherwise complex disease. Perhaps the most well-known example involves mutations in amyloid precursor protein and the presenilins in Alzheimer's disease (AD). Although extremely rare, accounting for <1 percent of all cases of AD, identification of these autosomal dominant subtypes greatly enhanced understanding of pathophysiology. Similarly, a study of consanguineous families in Iran has very recently identified a rare autosomal recessive form of mental retardation (MR) caused by glutamate receptor (GRIK2) mutations, thereby opening new avenues of research (Motazacker et al., 2007). In schizophrenia, we have recently employed a novel, case-control approach to homozygosity mapping (Lencz et al., 2007), resulting in several candidate loci that may harbor highly penetrant recessive variants. Taken together, these results suggest that a diversity of methodological approaches will be needed to parse genetic heterogeneity in schizophrenia.

References:

Motazacker MM, Rost BR, Hucho T, Garshasbi M, Kahrizi K, Ullmann R, Abedini SS, Nieh SE, Amini SH, Goswami C, Tzschach A, Jensen LR, Schmitz D, Ropers HH, Najmabadi H, Kuss AW. (2007) A defect in the ionotropic glutamate receptor 6 gene (GRIK2) is associated with autosomal recessive mental retardation. Am J Hum Genet. 81(4):792-8. Abstract

Lencz T, Lambert C, DeRosse P, Burdick KE, Morgan TV, Kane JM, Kucherlapati R, Malhotra AK (2007). Runs of homozygosity reveal highly penetrant recessive loci in schizophrenia. Proc Natl Acad Sci U S A. 104(50):19942-7. Abstract

View all comments by Todd Lencz
View all comments by Anil Malhotra

Related News: Copy Number Variations in Schizophrenia: Rare But Powerful?

Comment by:  Ben Pickard
Submitted 31 March 2008
Posted 31 March 2008

In my mind, the study of CNVs in autism (and likely soon in schizophrenia/bipolar disorder, which are a little behind) is likely to put biological meat on the bones of illness etiology and finally lay to rest the annoyingly persistent taunts that genetics hasn't delivered on its promises for psychiatric illness.

I don't think it's necessary at the moment to wring our hands at any inconsistencies between the Walsh et al. and previous studies of CNV in schizophrenia (e.g., Kirov et al., 2008). There are a number of factors which I think are going to influence the frequency, type, and identity of CNVs found in any given study.

1. CNVs are going to be found at the rare/penetrant/familial end of the disease allele spectrum—in direct contrast to the common risk variants which are the targets of recent GWAS studies. In the short term, we are likely to see a large number of different CNVs identified. The nature of this spectrum, however, is that there will be more common pathological CNVs which should be replicated sooner—NRXN1, APBA2 (Kirov et al., 2008), CNTNAP2 (Friedman et al., 2008)—and may be among some of these low hanging fruit. For the rarer CNVs, proving a pathological role is going to be a real headache. Large studies or meta-analyses are never going to yield significant p-values for rare CNVs which, nevertheless, may be the chief causes of illness for those few individuals who carry them. Showing clear segregation with illness in families is likely to be the only means to judge their role. However, we must not expect a pure cause-and-effect role for all CNVs: even in the Scottish t(1;11) family disrupting the DISC1 gene, there are several instances of healthy carriers.

2. Sample selection is also likely to be critical. In the Kirov paper, samples were chosen to represent sporadic and family history-positive cases equally. In the Walsh paper, samples were taken either from hospital patients (the majority) or a cohort of childhood onset schizophrenia. Detailed evidence for family history on a case-by-case basis was not given but appeared far stronger in the childhood onset cases. CNVs appeared to be more prevalent, and as expected, more familial, in the latter cohort. A greater frequency was also observed in the Kirov study familial subset.

3. Inclusion criteria are likely to be important—particularly in the more sporadic cases without family history. This is because CNVs found in this group may be commoner and less penetrant—they will be more frequent in cases than in controls but not exclusively found in cases. Any strategy, such as that used in the Kirov paper, which discounts a CNV based on its presence—even singly—in the control group is likely to bias against this class.

4. Technical issues. Certainly, the coverage/sensitivity of the method of choice for the event discovery stage is going to influence the minimum size of CNV detectable. However, a more detailed coverage often comes with a greater false-positive rate. Technique choice may also have more general issues. In both of the papers, the primary detection method is based on hybridization of case and pooled control genomes prior to detection on a chip. Thus, a more continuously distributed output may result—and the extra round of hybridization might bias against certain sequences. More direct primary approaches such as Illumina arrays or a second-hand analysis of SNP genotyping arrays may provide a more discrete copy number output, but these, too, can suffer from interpretational issues.

The other major implication of these and other CNV studies is the observation that certain genes ignore traditional disease boundaries. For example, NRXN1 CNVs have now been identified in autism and schizophrenia, and CNTNAP2 translocations/CNVs have been described in autism, Gilles de la Tourette syndrome, and schizophrenia/epilepsy. This mirrors the observation of common haplotypes altering risk across the schizophrenia-bipolar divide in numerous association studies. It might be the case that these more promiscuous genes are likely to be involved in more fundamental CNS processes or developmental stages—with the precise phenotypic outcome being defined by other variants or environment. The presence of mental retardation comorbid with psychiatric diagnoses in a number of CNV studies suggests that this might be the case. I look forward to the Venn diagrams of the future which show us the shared neuropsychiatric and disease-specific genes/gene alleles. It will also be interesting to see if the large deletions/duplications involving numerous genes give rise to more severe, familial, and diagnostically more defined syndromes or, alternatively, a more diffuse phenotype. Certainly, it has not been easy to dissect out individual gene contributions to phenotype in VCFS or the minimal region in Down syndrome.

References:

Friedman JI, Vrijenhoek T, Markx S, Janssen IM, van der Vliet WA, Faas BH, Knoers NV, Cahn W, Kahn RS, Edelmann L, Davis KL, Silverman JM, Brunner HG, van Kessel AG, Wijmenga C, Ophoff RA, Veltman JA. CNTNAP2 gene dosage variation is associated with schizophrenia and epilepsy. Mol Psychiatry. 2008 Mar 1;13(3):261-6. Abstract

View all comments by Ben Pickard

Related News: Copy Number Variations in Schizophrenia: Rare But Powerful?

Comment by:  Christopher Ross, Russell L. Margolis
Submitted 3 April 2008
Posted 3 April 2008

We agree with the comments of Weinberger, Lencz and Malhotra, and Pickard, and the question raised by Honer about the extent to which the association may be more to mental retardation than schizophrenia. These new studies of copy number variation represent important advances, but need to be interpreted carefully.

We are now getting two different kinds of data on schizophrenia, which can be seen as two opposite poles. The first is from association studies with common variants, in which large numbers of people are required to see significance, and the strengths of the associations are quite modest. These kinds of vulnerability factors would presumably contribute a very modest increase in risk, and many taken together would cause the disease. By contrast, the private mutations, as identified by the Sebat study, could potentially be completely causative, but because they are present in only single individuals or very small numbers of individuals, it is difficult to be certain of causality. Furthermore, since some of them in the early-onset schizophrenia patients were present in unaffected parents, one might have to assume the contribution of a common variant vulnerability (from the other parent) as well.

If a substantial number of the private structural mutations are causal, then one might expect to have seen multiple small Mendelian families segregating a structural variant. The situation would then be reminiscent of the autosomal dominant spinocerebellar ataxias, in which mutations (currently about 30 identified loci) in multiple different genes result in similar clinical syndromes. The existence of many small Mendelian families would be less likely if either 1) structural variants that cause schizophrenia nearly always abolish fertility, or 2) some of the SVs detected by Walsh et al. are risk factors, but are usually not sufficient to cause disease. The latter seems more likely.

We think these two poles highlight the continued importance of segregation studies, as have been used for the DISC1 translocation. In order to validate these very rare private copy number variations, we believe that it would be important to look for sequence variations in the same genes in large numbers of schizophrenia and control subjects, and ideally to do so in family studies.

One very exciting result of the new copy number studies is the implication of whole pathways rather than just single genes. This highlights the importance of a better understanding of pathogenesis. The study of candidate pathways should help facilitate better pathogenic understanding, which should result in better biomarkers and potentially improve classification and treatment. In genetic studies, development of pathway analysis will be fruitful. Convergent evidence can come from studies of pathogenesis in cell and animal models, but this will need to be interpreted with caution, as it is possible to make a plausible story for so many different pathways (Ross et al., 2006). The genetic evidence will remain critical.

References:

Ross CA, Margolis RL, Reading SA, Pletnikov M, Coyle JT. Neurobiology of schizophrenia. Neuron. 2006 Oct 5;52(1):139-53. Abstract

View all comments by Christopher Ross
View all comments by Russell L. Margolis

Related News: Copy Number Variations in Schizophrenia: Rare But Powerful?

Comment by:  Michael Owen (SRF Advisor), Michael O'Donovan (SRF Advisor), George Kirov
Submitted 15 April 2008
Posted 15 April 2008

The idea that a proportion of schizophrenia is associated with rare chromosomal abnormalities has been around for some time, but it has been difficult to be sure whether such events are pathogenic given that most are rare. Two instances where a pathogenic role seems likely are first, the balanced ch1:11 translocation that breaks DISC1, where pathogenesis seems likely due to co-segregation with disease in a large family, and second, deletion of chromosome 22q11, which is sufficiently common for rates of psychosis to be compared with that in the general population. This association came to light because of the recognizable physical phenotype associated with deletion of 22q11, and the field has been waiting for the availability of genome-wide detection methods that would allow the identification of other sub-microscopic chromosomal abnormalities that might be involved, but whose presence is not predicted by non-psychiatric syndromal features. This technology is now upon us in the form of various microarray-based methods, and we can expect a slew of studies addressing this hypothesis in the coming months.

Structural chromosomal abnormalities can take a variety of forms, in particular, deletions, duplications, inversions, and translocations. Generally speaking, these can disrupt gene function by, in the case of deletions, insertions, and unbalanced translocations, altering the copy number of individual genes. These are sometimes called copy number variations (CNVs). Structural chromosomal abnormalities can also disrupt a gene sequence, and such disruptions include premature truncation, internal deletion, gene fusion, or disruption of regulatory or promoter elements.

It is, however, worth pointing out that structural chromosomal variation in the genome is common—it has been estimated that any two individuals on average differ in copy number by a total of around 6 Mb, and that the frequency of individual duplications or deletions can range from common through rare to unique, much in the same way as other DNA variation. Also similar to other DNA variation, many structural variants, indeed almost certainly most, may have no phenotypic effects (and this includes those that span genes), while others may be disastrous for fetal viability. Walsh and colleagues have focused upon rare structural variants, and by rare they mean events that might be specific to single cases or families. For this reason, they specifically targeted CNVs that had not previously been described in the published literature or in the Database of Genomic Variants. The reasonable assumption was made that this would enrich for CNVs that are highly penetrant for the disorder. Indeed, Walsh et al. favor the hypothesis that genetic susceptibility to schizophrenia is conferred not by relatively common disease alleles but by a large number of individually rare alleles of high penetrance, including structural variants. As we have argued elsewhere (Craddock et al., 2007), it seems entirely plausible that schizophrenia reflects a spectrum of alleles of varying effect sizes including common alleles of small effect and rare alleles of larger effect, but data from genetic epidemiology do not support the hypothesis that the majority of the disorder reflects rare alleles of large effect.

Walsh et al. found that individuals with schizophrenia were >threefold more likely than controls to harbor rare CNVs that impacted on genes, but in contrast, found no significant difference in the proportions of cases and controls carrying rare mutations that did not impact upon genes. They also found a similar excess of rare structural variants that deleted or duplicated one or more genes in an independent series of cases and controls, using a cohort with childhood onset schizophrenia (COS).

The results of the Walsh study are important, and clearly suggest a role for structural variation in the etiology of schizophrenia. There are, however, a number of caveats and issues to consider. First, it would be unwise on the basis of that study to speculate on the likely contribution of rare variants to schizophrenia as a whole. It is likely correct that, due to selection pressures, highly penetrant alleles for disorders (like schizophrenia) that impair reproductive fitness are more likely to be of low frequency than they are to be common, but this does not imply that the converse is true. That is, one cannot assume that the penetrance of low frequency alleles is more likely to be high than low. Thus, and as pointed out by Walsh et al., it is not possible to know which or how many of the unique events observed in their study are individually pathogenic. Whether individual loci contribute to pathogenesis (and their penetrances) is, as we have seen, hard to establish. Estimating penetrance by association will require accurate measurement of frequencies in case and control populations, which for rare alleles, will have to be very large. Alternatively, more biased estimates of penetrance can be estimated from the degree of co-segregation with disease in highly multiplex pedigrees, but these are themselves fairly rare in schizophrenia, and pedigrees segregating any given rare CNV obviously even more so.
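
The point that frequencies of rare alleles can only be measured accurately in very large samples can be illustrated with a rough calculation. The sketch below is a simple binomial approximation, not a method used in any of the studies discussed; the function names, the carrier frequencies, and the 25 percent precision target are all hypothetical choices made for illustration.

```python
import math


def relative_standard_error(carrier_freq, n_individuals):
    """Binomial standard error of an estimated carrier frequency,
    expressed as a fraction of the frequency itself."""
    se = math.sqrt(carrier_freq * (1 - carrier_freq) / n_individuals)
    return se / carrier_freq


def individuals_needed(carrier_freq, target_rse):
    """Approximate sample size at which the relative standard error
    falls to target_rse, from solving sqrt((1 - p) / (p * n)) = target_rse."""
    return math.ceil((1 - carrier_freq) / (carrier_freq * target_rse ** 2))


# Hypothetical carrier frequencies, from common to very rare.
for freq in (0.05, 0.01, 0.001, 0.0001):
    n = individuals_needed(freq, target_rse=0.25)
    print(f"carrier frequency {freq:g}: about {n} individuals per group")
```

The required sample grows roughly in inverse proportion to the carrier frequency, which is why association-based penetrance estimates become impractical for variants seen in only one or a handful of individuals.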

Second, as Weinberger notes, the case for high penetrance (at the level of being sufficient to cause the disorder) is also undermined by their data from COS, where the majority of variants were inherited from unaffected parents. This accords well with the observation that 22q11DS, whilst conferring a high risk of schizophrenia, is still only associated with psychosis in ~30 percent of cases. It also accords well with the relative rarity of pedigrees segregating schizophrenia in a clearly Mendelian fashion, though the association of CNVs with severe illness of early onset might be expected to reduce the probability of transmission.

Third, there are questions about the generality of the findings. Cases in the case control series were ascertained in a way that enriched for severity and chronicity. Perhaps more importantly, the CNVs were greatly overrepresented in people with low IQ. Thus, one-third of all the potentially pathogenic CNVs in the case control series were seen in the tenth of the sample with IQ less than 80. The association between structural variants and low IQ is well known, as is the association between low IQ and psychotic symptoms, and it seems plausible to assume that forms of schizophrenia accompanied by mental retardation (MR) are likely to be enriched for this type of pathogenesis. The question that arises is whether the CNVs in such cases act simply by influencing IQ, which in turn has a non-specific effect on increasing risk of schizophrenia, or whether there are specific CNVs for MR plus schizophrenia, and some which may indeed increase risk of schizophrenia independent of IQ. In the case of 22q11 deletion, risk of schizophrenia does not seem to be dependent on risk of MR, but more work is needed to establish that this applies more generally.

Another reason for caution about the generality of the effect is that Walsh et al. found that cases with onset of psychotic symptoms at age 18 or younger were particularly enriched for CNVs, being greater than fourfold more likely than controls to harbor such variants. There did remain an excess of CNVs in cases with adult onset, supporting a more general contribution, although it should be noted that even in this group with severe disorder, this excess was not statistically significant (Fisher's exact test, p = 0.17, 2-tailed, our calculation). The issue of age of onset clearly bears on assessing the overall contribution CNVs may make to psychosis, since onset before 18, while not rare, is also not typical. A particular contribution of CNVs to early onset also appears supported by the second series studied, which had COS. However, this is a particularly unusual form of schizophrenia which is already known to have high rates of chromosomal abnormalities. Future studies of more typical samples will doubtless bear upon these issues.

Even allowing for the fact that many more CNVs may be detected as resolution of the methodology improves, the above considerations suggest it is premature to conclude a substantial proportion of cases of schizophrenia can be attributed to rare, highly penetrant CNVs. Nevertheless, even if it turns out that only a small fraction of the disorder is attributable to CNVs, as seen for other rare contributors to the disorder (e.g., DISC1 translocation), such uncommon events offer enormous opportunities for advancing our knowledge of schizophrenia pathogenesis.

References:

Craddock N, O'Donovan MC, Owen MJ. Phenotypic and genetic complexity of psychosis. Invited commentary on ... Schizophrenia: a common disease caused by multiple rare alleles. Br J Psychiatry. 2007;190:200-3.

Related News: Copy Number Variations in Schizophrenia: Rare But Powerful?

Comment by: Ridha Joober, Patricia Boksa
Submitted 2 May 2008
Posted 4 May 2008

Walsh et al. claim that rare and severe chromosomal structural variants (SVs) (i.e., not described in the literature or in the specialized databases as of November 2007) are highly penetrant events each explaining a few, if not singular, cases of schizophrenia.

However, their definition of rareness is questionable. Indeed, it is unclear why SVs that are rare (<1 percent) but previously described should be omitted from their analysis. In addition, contrary to their own definition of rareness, the authors included in the COS sample several SVs that have been previously mentioned in the literature (e.g., a 115 kb deletion on chromosome 2p16.3 disrupting NRXN1). Furthermore, some of these SVs (entire Y chromosome duplication) are certainly not rare (by the authors' definition), nor highly penetrant with regard to psychosis (Price et al., 1967). Finally, as their definition of rareness depends on a specific date, the results of this study will change over time.

As to the assessment of severity, it can equally be concluded from table 2, using their statistical approach, that "patients with schizophrenia are significantly more likely to harbor rare structural variants (6/150) that do not disrupt any gene compared to controls (2/268) (p = 0.03)," thus contradicting their claim. In fact, had they used criteria from the literature (Lee et al., 2007; Brewer et al., 1999), i.e., that deletion SVs are more likely than duplications to be pathogenic, and appropriate statistical contrasts, they would have found that deletions are significantly (p = 0.02) less frequent in patients (5/23) than in controls (9/13) who have SVs. In addition, the assumption of high penetrance is questionable given the high level (13 percent) of non-transmitted SVs in parents of COS patients. Is the rate of psychosis proportionately high in the parents? From the data presented, we know that only 2/27 SVs in COS patients are de novo and that some SVs are transmitted. Adding this undetermined number of transmitted SVs to the reported non-transmitted SVs will lead to an even larger proportion of parents carrying SVs. Disclosing the inheritance status of SVs in COS patients, along with information on diagnoses in parents from this rigorously characterised cohort, represents a major criterion for assessing the risk associated with these SVs.
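The first contrast quoted above (6 of 150 cases versus 2 of 268 controls carrying rare variants that disrupt no gene) can be checked with a short sketch. It assumes a two-sided Fisher's exact test, which the comment does not explicitly name, and uses a hand-written helper (`fisher_exact_two_sided`, a hypothetical name) built only on the standard library rather than any particular statistics package.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1 = a + b            # e.g. number of cases
    col1 = a + c            # e.g. total carriers
    n = a + b + c + d       # everyone in the table
    denom = comb(n, row1)

    def prob(x):
        # Probability that x of the col1 carriers fall in row 1.
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9    # guard against floating-point ties
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )
    return min(1.0, total)


# Counts quoted in the comment: carriers / non-carriers in cases, then controls.
p_value = fisher_exact_two_sided(6, 150 - 6, 2, 268 - 2)
print(f"two-sided p = {p_value:.2f}")
```

Under these assumptions the result lands close to the p = 0.03 quoted in the comment. The second contrast (5/23 versus 9/13) is not reproduced here because the test behind its p-value is not stated.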

Consequently, it appears that the argument of rareness is rather idiosyncratic and contains inconsistencies, and the one of severity is very open to interpretation. Most importantly, it should be emphasized that amalgamated gene effects at the population level do not allow one to conclude that any single SV actually contributes to schizophrenia in an individual. Thus it is unclear how this study of grouped events differs from the thousands of controversial and underpowered association studies of single genes.

References:

Price WH, Whatmore PB. Behaviour Disorders and Pattern of Crime among XYY males Identified at a Maximum Security Hospital. Brit Med J 1967;1:533-6.

Lee C, Iafrate AJ, Brothman AR. Copy number variations and clinical cytogenetic diagnosis of constitutional disorders. Nat Genet 2007 July;39(7 Suppl):S48-S54.

Brewer C, Holloway S, Zawalnyski P, Schinzel A, FitzPatrick D. A chromosomal duplication map of malformations: regions of suspected haplo- and triplolethality--and tolerance of segmental aneuploidy--in humans. Am J Hum Genet 1999 June;64(6):1702-8.

Related News: More Evidence for CNVs in Schizophrenia Etiology—Jury Still Out on Practical Implications

Comment by: Christopher Ross, Russell L. Margolis
Submitted 1 August 2008
Posted 1 August 2008

The two recent papers in Nature, from the Icelandic group (Stefansson et al., 2008), and the International Schizophrenia Consortium (2008) led by Pamela Sklar, represent a landmark in psychiatric genetics. For the first time two large studies have yielded highly significant consistent results using multiple population samples. Furthermore, they arrived at these results using quite different methods. The Icelandic group used transmission screening and focused on de novo events, using the Illumina platform in both a discovery population and a replication population. By contrast, the ISC study was a large population-based case-control study using the Affymetrix platform, which did not specifically search for de novo events.

Both identified the same two regions on chromosome 1 and chromosome 15, as well as replicating the previously well studied VCFS region on chromosome 22. Thus, we now have three copy number variants which are replicated and consistent across studies. This provides data on rare highly penetrant variants complementary to the family based study of DISC1 (Porteous et al., 2006), in which the chromosomal translocation clearly segregates with disease, but in only one family. In addition, they are in general congruent with three other studies (Walsh et al., 2008; Kirov et al., 2008; Xu et al., 2008) which also demonstrate a role for copy number variation in schizophrenia. These studies together should put to rest many of the arguments about the value of genetics in psychiatry, so that future studies can now begin from a firmer base.

However, these studies also raise at least as many questions as they answer. One is the role of copy number variation in schizophrenia in the general population. The number of cases accounted for by the deletions on chromosome 1 and 15 in the ISC and Icelandic studies is extremely small, on the order of 1% or less. The extent to which copy number variation, including very rare or even private de novo variants, will account for the genetic risk for schizophrenia in the general population is still unknown. The ISC study indicated that there is a higher overall load of copy number variations in schizophrenia, broadly consistent with Walsh et al. and Xu et al. but backed up by a much larger sample size, allowing the results to achieve high statistical significance. The implications of these findings are still undeveloped.

Another issue is the relationship to the phenotype of schizophrenia in the general population. Many more genotype-phenotype studies will need to be done. It will be important to determine whether there is a higher rate of mental retardation among the schizophrenia cases in these studies than in other populations.

Another question is the relationship between these copy number variations (and other rare events) and the more common variants accounting for smaller increases in risk, as in the recent O'Donovan et al. (2008) association study in Nature Genetics. It is far too early to know, but there may well be some combination of rare mutations plus risk alleles that account for cases in the general population. This would then be highly reminiscent of Alzheimer's disease, Parkinson's disease, and other diseases which have been studied for a longer period of time.

For instance, in Alzheimer's disease there are rare mutations in APP and presenilin, as well as copy number variation in APP, with duplications causing the accelerated Alzheimer's disease seen in Down syndrome. These appear to interact with the risk allele in APOE, and possibly other risk alleles, and are part of a pathogenic pathway (Tanzi and Bertram, 2005). Similarly in Parkinson's disease, rare mutations in α-synuclein, LRRK2 and other genes can be causative of PD, though notably the G2019S mutation in LRRK2 has incomplete penetrance. In addition, duplications or triplications of α-synuclein can cause familial PD, and altered expression due to promoter variants may contribute to risk. By contrast, deletions in Parkin cause an early onset Parkinsonian syndrome (Hardy et al., 2006). Finally, much of PD may be due to genetic risk factors or environmental causes that have not yet been identified. Further studies will likely lead to the elucidation of pathogenic pathways. These diseases can provide a paradigm for the study of schizophrenia and other psychiatric diseases. One difference is that the copy number variations in the neurodegenerative diseases are often increases in copies (as in APP and α-synuclein), consistent with gain of function mechanisms, while the schizophrenia associations were predominantly with deletions, suggesting loss of function mechanisms. The hope is that as genes are identified, they can be linked together in pathways, leading to understanding of the neurobiology of schizophrenia (Ross et al., 2006).

The key unanswered questions, of course, are what genes or other functional domains are deleted at the chromosome 1, 15, and 22 loci, whether the deletions at these loci are sufficient in themselves to cause schizophrenia, and, if sufficient, the extent to which the deletions are penetrant. Both of the current studies identified deletions large enough to include several genes. The hope is that at least a subset of copy number variations (unlike SNP associations identified in schizophrenia to date) may be causative, making the identification of the relevant genes or other functional domains—at least in principle—more feasible.

Another tantalizing observation is that the copy number variations associated with schizophrenia were defined by flanking repeat regions. This raises the question of the extent to which undetected smaller insertions, deletions or other copy number variations related to other repetitive motifs, such as long tandem repeats, may also be associated with schizophrenia. Identification and testing of these loci may prove a fruitful approach to finding additional genetic risk factors for schizophrenia.

References:

Hardy J, Cai H, Cookson MR, Gwinn-Hardy K, Singleton A. Genetics of Parkinson's disease and parkinsonism. Ann Neurol. 2006 Oct;60(4):389-98.

Kirov G, Gumus D, Chen W, Norton N, Georgieva L, Sari M, O'Donovan MC, Erdogan F, Owen MJ, Ropers HH, Ullmann R. Comparative genome hybridization suggests a role for NRXN1 and APBA2 in schizophrenia. Hum Mol Genet. 2008 Feb 1;17(3):458-65.

Porteous DJ, Thomson P, Brandon NJ, Millar JK. The genetics and biology of DISC1--an emerging role in psychosis and cognition. Biol Psychiatry. 2006 Jul 15;60(2):123-31.

Ross CA, Margolis RL, Reading SA, Pletnikov M, Coyle JT. Neurobiology of schizophrenia. Neuron. 2006 Oct 5;52(1):139-53.

Singleton A, Myers A, Hardy J. The law of mass action applied to neurodegenerative disease: a hypothesis concerning the etiology and pathogenesis of complex diseases. Hum Mol Genet. 2004 Apr 1;13 Spec No 1:R123-6.

Tanzi RE, Bertram L. Twenty years of the Alzheimer's disease amyloid hypothesis: a genetic perspective. Cell. 2005 Feb 25;120(4):545-55.

Walsh T, McClellan JM, McCarthy SE, Addington AM, Pierce SB, Cooper GM, Nord AS, Kusenda M, Malhotra D, Bhandari A, Stray SM, Rippey CF, Roccanova P, Makarov V, Lakshmi B, Findling RL, Sikich L, Stromberg T, Merriman B, Gogtay N, Butler P, Eckstrand K, Noory L, Gochman P, Long R, Chen Z, Davis S, Baker C, Eichler EE, Meltzer PS, Nelson SF, Singleton AB, Lee MK, Rapoport JL, King MC, Sebat J. Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science. 2008 Apr 25;320(5875):539-43.

Xu B, Roos JL, Levy S, van Rensburg EJ, Gogos JA, Karayiorgou M. Strong association of de novo copy number mutations with sporadic schizophrenia. Nat Genet. 2008 Jul;40(7):880-5.

Related News: More Evidence for CNVs in Schizophrenia Etiology—Jury Still Out on Practical Implications

Comment by:  Daniel Weinberger, SRF Advisor
Submitted 3 August 2008
Posted 3 August 2008

Several recent reports have suggested that rare CNVs may be highly penetrant genetic factors in the pathogenesis of schizophrenia, perhaps even singular etiologic events in those cases of schizophrenia who have them. This is potentially of enormous importance, as the definitive identification of such a causative factor may be a major step in unraveling the biologic mystery of the condition. I would stress several issues that need to be considered in putting these recent findings into a broader perspective.

It is very difficult to attribute illness to a private CNV, i.e., one found only in a single individual. This point has been potently illustrated by a study of clinically discordant MZ twins who share CNVs (Bruder et al., AJHG, 2008). Inherited CNVs, such as those that made up almost all of the CNVs described in the childhood onset cases of the study by Walsh et al. (Science, 2008), are by definition not highly penetrant (since they are inherited from unaffected parents). The finding by Xu et al. (Nat Gen, 2008) that de novo (i.e., non-inherited) CNVs are much more likely to be associated with cases lacking a family history is provocative but difficult to interpret as no data are given about the size of the families having a family history and those not having such a history. Unless these family samples are of comparable size and obtained by a comparable ascertainment strategy, it is hard to know how conclusive the finding is. Indeed, in the study of Walsh et al., rare CNVs were just as likely to be found in patients with a positive family history. Finally, in contrast to private CNVs, recurrent (but still rare) CNVs, such as those identified on 1q and 15q in the studies of the International Schizophrenia Consortium (Nature, 2008) and Stefansson et al. (Nature, 2008), are strongly implicated as being associated with the diagnosis of schizophrenia and therefore likely involved in the causation of the illnesses in the cases having these CNVs. In all, these new CNV regions, combined with the VCFS region on 22q, suggest that approximately five to 10 patients out of 1,000 who carry the diagnosis of schizophrenia may have a well-defined genetic lesion (i.e., a substantial deletion or duplication).

The overarching question now is how relevant these findings are to the other 99 percent of individuals with this diagnosis who do not have these recurrent CNVs. Before we had the capability to perform high-density DNA hybridization and SNP array analyses, chromosomal anomalies associated with the diagnosis of schizophrenia were identified using cytogenetic techniques. Indeed, VCFS, XXX, XXY (Klinefelter's syndrome), and XO (Turner syndrome) have been found with similarly increased frequency in cases with this diagnosis in a number of studies. Now that we have greater resolution to identify smaller structural anomalies, the list of congenital syndromes that increase the possibility that people will manifest symptoms that earn them this diagnosis appears to be growing rapidly. Are we finding causes for the form of schizophrenia that most psychiatrists see in their offices, or are we instead carving out a new set of rare congenital syndromes that share some clinical characteristics, as syphilis was carved out from the diagnosis of schizophrenia at the turn of the twentieth century? Is schizophrenia a primary expression of these anomalies or a secondary manifestation? VCFS is associated with schizophrenia-like phenomena but even more often with mild mental retardation, autism spectrum, and other psychiatric manifestations. The same is true of the aneuploidies that increase the probability of manifesting schizophrenia symptoms. The two new papers in Nature allude to the possibility that epilepsy and intellectual limitations may also be associated with these CNVs. The diagnostic potential of any of these new findings cannot be determined until the full spectrum of their clinical manifestations is clarified.

One of the important insights that might emerge from identification of these new CNV syndromes is the identification of candidate genes that may show association with schizophrenia based on SNPs in these regions. VCFS has been an important source of promising candidate genes with broader clinical relevance (e.g., PRODH, COMT). Stefansson et al. report, however, that none of the 319 SNPs in the CNV regions showed significant association with schizophrenia in quite a large sample of individuals not having deletions in these regions. The Consortium report also presumably has the results of SNP association testing in these regions in their large sample but did not report them. It is very important to explore in greater genetic detail these regions of the genome showing association with the diagnosis of schizophrenia in samples lacking these lesions and to fully characterize the clinical picture of individuals who have them. It is hoped that insights into the pathogenesis of symptoms related to this diagnosis will emerge from these additional studies.

Anyone who has worked in a public state hospital or chronic schizophrenia care facility (where I spent over 20 years) is not surprised to find an occasional patient with a rare congenital or acquired syndrome who expresses symptoms similar to those of individuals also diagnosed with schizophrenia who do not have such rare syndromes. Our diagnostic procedures are not precise, and the symptoms that earn someone this diagnosis are not specific. Schizophrenia is not something someone has; it is a diagnosis someone is given. In an earlier comment for SRF on structural variations in the genome related to autism, I suggested that, "From a genetic point of view, autism is a syndrome that can be reached from many directions, along many paths. It is not likely that autism is any more of a discrete disease entity than say, blindness or mental retardation." These new CNV syndromes manifesting schizophrenia phenomena are probably a reminder that the same is true of what we call schizophrenia.

Related News: Copy-number Variants, Interacting Alleles, or Both?

Comment by:  David J. Porteous, SRF Advisor
Submitted 11 February 2009
Posted 12 February 2009

The answer is unequivocally, yes
In co-highlighting the papers from Need et al., 2009, and Tomppo et al., 2009, you pose the question "CNVs, interacting loci or both?" to which my immediate answer is an unequivocal yes, but it actually goes further than that. These two studies, interesting in their own right, add just two more pieces to the evidence now accumulated from case-only, case-control, and family-based linkage studies on the genetic architecture of schizophrenia. Thus, we can reject with confidence a single evolutionary and genetic origin for schizophrenia. If it were so, it would have been found already by the plethora of genomewide studies now completed, studies specifically designed to detect causal variants, should they exist, which are both common to most if not all subjects and ancient in origin: the Common Disease, Common Variant (CDCV) hypothesis.

Moreover, for DISC1, NRG1, NRXN1, and a few others, the criteria for causality are met in some subjects, but none of these is the sole cause of schizophrenia. Their net contributions to individual and population risk remain uncertain and await large scale resequencing as well as SNP and CNV studies to capture the totality of genetic variation and how that contributes to the incidence of major mental illness. Meanwhile, nosological and epidemiological evidence has also forced a re-evaluation of the categorical distinction between schizophrenia and bipolar disorder, let alone schizoaffective disorder (Lichtenstein et al., 2009).

In this regard, DISC1 serves again as an instructive paradigm, with good evidence for genetic association to schizophrenia, BP, schizoaffective disorder, and beyond (Chubb et al., 2008). The study by Hennah et al. (2008) added a further nuance to the DISC1 story by demonstrating intra-allelic interaction. Tomppo et al. (2009) now build upon their earlier evidence to show that DISC1 variants affect subcomponents of the psychiatric phenotype, treated now as a quantitative rather than a dichotomous trait. In much the same way and just as would be predicted, DISC1 variation also contributes to normal variation in human brain development and behavior (e.g., Callicott et al., 2005). Self-evidently, different classes of genetic variants (SNP or CNV, regulatory or coding) will have different biological and therefore psychiatric consequences (Porteous, 2008).

That Need et al. (2009) failed to replicate previous genomewide association studies (or find support for DISC1, NRG1, and the rest) is just further proof, if any were needed, that there is extensive genetic heterogeneity and that common variants of ancient origin are not major determinants of individual or population risk (Porteous, 2008). Variable penetrance, expressivity, and gene-gene interaction (epistasis) all need to be considered, but these intrinsic aspects of genetic influence are best addressed by family studies (currently lacking for CNV studies) and poorly addressed by large-scale case-control genomewide association studies. Power to test the CDCV hypothesis may increase with increasing numbers of subjects, but so does the inherent heterogeneity, both genetic and diagnostic.

That said, genetics is without doubt the most incisive tool we have to dissect the etiology of major mental illness. The criteria for success (and certainly for causality, rather than mere correlation) must be less about the number of noughts after the p and much more about the connection between candidate gene, gene variant, and the biological consequences for brain development and function. In this regard, both studies have something to say and offer.

References:

Lichtenstein P, Yip BH, Björk C, Pawitan Y, Cannon TD, Sullivan PF, Hultman CM. Common genetic determinants of schizophrenia and bipolar disorder in Swedish families: a population-based study. Lancet. 2009;373:234-9.

Chubb JE, Bradshaw NJ, Soares DC, Porteous DJ, Millar JK. The DISC locus in psychiatric illness. Mol Psychiatry. 2008 Jan;13(1):36-64. Epub 2007 Oct 2.

Callicott JH, Straub RE, Pezawas L, Egan MF, Mattay VS, Hariri AR, Verchinski BA, Meyer-Lindenberg A, Balkissoon R, Kolachana B, Goldberg TE, Weinberger DR. Variation in DISC1 affects hippocampal structure and function and increases risk for schizophrenia. Proc Natl Acad Sci U S A. 2005;102:8627-32.

Porteous D. Genetic causality in schizophrenia and bipolar disorder: out with the old and in with the new. Curr Opin Genet Dev. 2008;18:229-34.

Related News: Copy-number Variants, Interacting Alleles, or Both?

Comment by: Pamela DeRosse, Anil Malhotra (SRF Advisor)
Submitted 19 February 2009
Posted 22 February 2009

The results reported by Tomppo et al. and Need et al. collectively instantiate the complexities of the genetic architecture underlying risk for psychiatric illness. Paradoxically, however, while the results of Need et al. suggest that the answer to the complex question of risk genes for schizophrenia (SZ) may be found by searching a very select population for rare changes in genetic sequence, the results of Tomppo et al. suggest that the answer may be found by searching for common variants in large heterogeneous populations. So which is it? Is SZ the result of rare, novel genetic mutations or an accumulation of common ones? Such a conundrum is not a novel predicament in the process of scientific inquiry and such conundrums are often resolved by the reconciliation of both opposing views. Thus, if we allow history to serve as our guide it seems reasonable that the answer to the current question of what genetic mechanisms are responsible for SZ, is that SZ is caused by both rare and common variants.

Although considerable efforts, by our lab and others, are currently being directed towards seeking the type of rare variants that Need et al. suggest may be responsible for risk for SZ, a less concerted effort is being directed towards parsing the effects of more specific, common genetic variations. To date, there are limited data seeking to elucidate the effects of previously identified risk variants for SZ on phenotypic variation within the diagnostic group. The data that are available, however, suggest that risk variants do influence phenotypic variation. Our work with DISC1, for example, has produced relatively robust, and replicated findings linking variation in the gene to cognitive dysfunction (Burdick et al., 2005) as well as an increased risk for persecutory delusions in SZ (DeRosse et al., 2007). Similarly, our work with DTNBP1 indicates a strong association between variants in the gene and both cognitive dysfunction (Burdick et al., 2006) and negative symptoms in SZ (DeRosse et al., 2006). Moreover, the risk for cognitive dysfunction associated with the DTNBP1 risk genotype was also observed in a sample of healthy individuals. Thus, it seems conceivable that genetic variation associated with phenotypic variation within a diagnostic group may also be associated with similar, sub-syndromal phenotypes in non-clinical samples.

The data reported by Tomppo et al. provide support for the utility of parsing the specific effects of genetic variants on phenotypic variation and extend this approach to populations with sub-syndromal psychiatric symptoms. Such an approach is attractive in that it allows us to study the effects of genotype on phenotype without the confound imposed by psychotropic medications. Although the current data linking genes to specific dimensions of psychiatric illness are provocative, the study groups utilized are comprised of patients undergoing varying degrees of pharmacological intervention. Thus, in these analyses quantitative assessment of psychosis is to some extent confounded by treatment history and response. By measuring lifetime history of symptoms, which for most patients includes substantial periods without effective medication, many studies (including our own) may partially overcome this limitation. Still, assessment of the relation between genetic variation and dimensions of psychosis in study groups not undergoing treatment with pharmacological agents would be a compelling source of confirmation for these preliminary findings.

Perhaps most importantly, the data reported by Tomppo et al. suggest that previously identified risk genes should not be marginalized but rather should be studied in non-clinical samples to identify phenotypic variation that may be related to the signs and symptoms of psychiatric illness.

References:

Burdick KE, Hodgkinson CA, Szeszko PR, Lencz T, Ekholm JM, Kane JM, Goldman D, Malhotra AK. DISC1 and neurocognitive function in schizophrenia. Neuroreport. 2005; 16(12):1399-402.

Burdick KE, Lencz T, Funke B, Finn CT, Szeszko PR, Kane JM, Kucherlapati R, Malhotra AK. Genetic variation in DTNBP1 influences general cognitive ability. Hum Mol Genet. 2006; 15(10):1563-8.

DeRosse P, Hodgkinson CA, Lencz T, Burdick KE, Kane JM, Goldman D, Malhotra AK. Disrupted in schizophrenia 1 genotype and positive symptoms in schizophrenia. Biol Psychiatry. 2007; 61(10):1208-10.

DeRosse P, Funke B, Burdick KE, Lencz T, Ekholm JM, Kane JM, Kucherlapati R, Malhotra AK. Dysbindin genotype and negative symptoms in schizophrenia. Am J Psychiatry. 2006; 163(3):532-4.

Related News: Copy-number Variants, Interacting Alleles, or Both?

Comment by:  James L. Kennedy, SRF Advisor (Disclosure)
Submitted 25 February 2009
Posted 25 February 2009

Has anyone considered the possibility that the CNVs found to be elevated in schizophrenia versus controls could be a peripheral effect and perhaps not present in brain tissue? For example, the diet of the typical schizophrenia patient is poor, and it is conceivable that chronic folate deficiency could predispose to problems in DNA structure or repair in lymphocytes. Thus, the CNVs could be an effect of the illness, and not a cause. Someone needs to do the experiment that compares CNVs in blood to those in the brain of the same individual. And then we need studies of the stability of CNVs over the lifetime of an individual.


Related News: Copy-number Variants, Interacting Alleles, or Both?

Comment by:  Kevin J. Mitchell
Submitted 2 March 2009
Posted 2 March 2009

The papers by Need et al. and Tomppo et al. seem to present conflicting evidence for the involvement of common or rare variants in the etiology of schizophrenia.

On the one hand, Need et al., in a very large and well-powered sample, find no evidence for involvement of any common SNPs or CNVs. Importantly, they show that while any one SNP with a small effect and modest allelic frequency might be missed by their analysis, the likelihood that all such putative SNPs would be missed is vanishingly small. They come to the reasonable conclusion that common variants are unlikely to play a major role in the etiology of schizophrenia, except under a highly specific and implausible genetic model. Does this sound the death knell for the common variants, polygenic model of schizophrenia? Yes and no. These and other empirical data are consistent with theoretical analyses which show that the currently popular purely polygenic model, without some gene(s) of large effect, cannot explain familial risk patterns (Hemminki et al., 2007; Hemminki et al., 2008; Bodmer and Bonilla, 2008). It has been suggested that epistatic interactions may generate discontinuous risk from a continuous distribution of common alleles; however, while comparisons of risk in monozygotic and dizygotic twins are consistent with some contribution from epistasis, they are not consistent with the massive levels that would be required to rescue a purely polygenic mechanism, whether through a multiplicative or (biologically unrealistic) threshold model.
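The point that a single small-effect SNP may be missed, while missing all of them is very unlikely, can be illustrated with a short calculation. This is a minimal sketch, not the analysis performed by Need et al.: it assumes the putative risk SNPs are independent and share the same per-SNP detection power, and the numbers used are hypothetical.

```python
def prob_all_missed(power_per_snp: float, n_snps: int) -> float:
    """Probability that none of n independent true risk SNPs is detected.

    Assumes (for illustration only) independent SNPs with identical power.
    """
    return (1.0 - power_per_snp) ** n_snps


# Hypothetical power values and SNP counts, chosen only to show the trend.
for power, n in [(0.05, 50), (0.05, 100), (0.10, 100)]:
    missed = prob_all_missed(power, n)
    print(f"power per SNP = {power:.2f}, true SNPs = {n}: "
          f"P(all missed) = {missed:.2e}")
```

Even with very low power for any one SNP, the chance of detecting none falls off geometrically as the number of true risk SNPs grows, which is the shape of the argument summarized above.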

Thus, it seems most parsimonious to conclude that most cases of schizophrenia will involve a variant of large effect. As such variants are likely to be rapidly selected against, they are also likely to be quite rare. The findings of specific, gene-disrupting CNVs or mutations in individual genes in schizophrenia cases by Need et al. and numerous other groups support this idea. Excitingly, they also have highlighted specific molecules and biological pathways that provide molecular entry points to elucidate pathogenic mechanisms. The possible convergence on genes interacting with DISC1, including PCM1 and NDE1 in the current study, provides further support for the importance of this pathway, though, clearly, there may be many other ways to disrupt neural development or function that could lead to schizophrenia. (Conversely, it is becoming clearer that many of the putative causative mutations identified so far predispose to multiple psychiatric or neurological conditions.)

Despite the likely involvement of rare variants in most cases of schizophrenia, it remains possible that common alleles could have a modifying influence on risk—indeed, one early paper commonly cited as supporting a polygenic model for schizophrenia actually provided strong support for a model of a single gene of large effect and two to three modifiers (Risch, 1990). A rare variants/common modifiers model would be consistent with the body of literature on modifying genes in model organisms, where effects of genetic background on the phenotypic expression of particular mutations are quite common and can sometimes be large (Nadeau, 2001). Whether such genetic background effects would be mediated by common or rare variants is another question—there is certainly good reason to think that rare or even private mutations may make a larger contribution to phenotypic variance than previously suspected (Ng et al., 2008; Ji et al., 2008).

Nevertheless, common variants are also likely to be involved, and these effects might be detectable in large association studies, though they would be expected to be diluted across genotypes. This might explain inconsistent findings of association of common variants with disease state for various genes, including COMT, BDNF, and DISC1, for example. This issue has led some to look for association of variants in these genes with endophenotypes of schizophrenia in the general population—psychological or physiological traits that are heritable and affected by the symptoms of the disease, such as working memory, executive function, or, in the study by Tomppo et al., social interaction.

These approaches have tended to lead to statistically stronger and more consistent associations and are undoubtedly revealing genes and mechanisms contributing to normal variation in many psychological traits. How this relates to their potential involvement in disease etiology is far from clear, however. The implication of the endophenotype model is that the disorder itself emerges due to the combination of minor effects on multiple symptom parameters (Gottesman and Gould, 2003; Meyer-Lindenberg and Weinberger, 2006). An alternative interpretation is that these common variants may modify the phenotypic expression of some other rare variant, either due to their demonstrated effect on the psychological trait in question or through a more fundamental biochemical interaction, but that in the absence of such a variant of large effect, no combination of common alleles would lead to disease.

References:

Hemminki K, Försti A, Bermejo JL. The 'common disease-common variant' hypothesis and familial risks. PLoS ONE. 2008 Jun 18;3(6):e2504.

Hemminki K, Bermejo JL. Constraints for genetic association studies imposed by attributable fraction and familial risk. Carcinogenesis. 2007 Mar;28(3):648-56.

Bodmer W, Bonilla C. Common and rare variants in multifactorial susceptibility to common diseases. Nat Genet. 2008 Jun;40(6):695-701.

Risch N. Linkage strategies for genetically complex traits. I. Multilocus models. Am J Hum Genet. 1990 Feb;46(2):222-8.

Nadeau JH. Modifier genes in mice and humans. Nat Rev Genet. 2001 Mar;2(3):165-74.

Ng PC, Levy S, Huang J, Stockwell TB, Walenz BP, Li K, Axelrod N, Busam DA, Strausberg RL, Venter JC. Genetic variation in an individual human exome. PLoS Genet. 2008 Aug 15;4(8):e1000160.

Ji W, Foo JN, O'Roak BJ, Zhao H, Larson MG, Simon DB, Newton-Cheh C, State MW, Levy D, Lifton RP. Rare independent mutations in renal salt handling genes contribute to blood pressure variation. Nat Genet. 2008 May;40(5):592-9.

Gottesman II, Gould TD. The endophenotype concept in psychiatry: etymology and strategic intentions. Am J Psychiatry. 2003 Apr;160(4):636-45.

Meyer-Lindenberg A, Weinberger DR. Intermediate phenotypes and genetic mechanisms of psychiatric disorders. Nat Rev Neurosci. 2006 Oct;7(10):818-27.


Related News: Large Family Study Links Genetics of Schizophrenia, Bipolar Disorder

Comment by:  Alastair Cardno
Submitted 7 April 2009
Posted 7 April 2009
  I recommend the Primary Papers

The results of the family/adoption study by Lichtenstein et al. (2009) and our twin study (Cardno et al., 2002) are remarkably similar. Using a non-hierarchical diagnostic approach, the genetic correlation between schizophrenia and bipolar/mania was 0.60 in the family/adoption study and 0.68 in the twin study. The heritability estimates were somewhat lower in the family/adoption study (~60 percent) than in the twin study (~80 percent), but can still be said to be substantial and similar for both disorders.

When we adopted a hierarchical approach, with schizophrenia above mania, we found no monozygotic twin pairs where one twin had schizophrenia and the other had bipolar/mania, but with their considerably larger sample, Lichtenstein et al. (2009) were able to confirm a significantly elevated risk for bipolar disorder in siblings of probands with schizophrenia (RR = 2.7), even when individuals with co-occurrence of both disorders were excluded.

I think there is a potentially interesting link between the family/adoption and twin studies focusing mainly on non-hierarchical diagnoses, Owen and Craddock's (2009) commentary on the family/adoption study, where they advocate a dimensional approach, and Will Carpenter's SRF comment regarding the value of domains of psychopathology. The non-hierarchical approach (where individuals can have a diagnosis of both schizophrenia and bipolar disorder during their lifetime) could be viewed as a form of dimensional/domains of psychopathology approach, with schizophrenia and bipolar disorder each having a dimension of liability, and there is now evidence from family, twin, and adoption analyses that these dimensions are correlated, i.e., that there is some overlap in etiological influences.

If schizophrenia and bipolar disorder share some causal factors in common, what might be the implications for the unresolved status of schizoaffective disorder? Our twin study suggested that the genetic (but not environmental) liability to schizoaffective disorder is entirely shared with schizophrenia and mania, defined non-hierarchically (Cardno et al., 2002). If so, and if schizophrenia and bipolar disorder share some genetic susceptibility loci in common, while other loci are not shared, then risk of schizoaffective disorder (or perhaps the bipolar subtype) could be elevated either by the coincidental co-occurrence of non-shared susceptibility loci, or by the occurrence of loci that are common to both disorders.

In this case, any loci that influence risk of schizoaffective disorder (bipolar subtype?) should also increase risk of schizophrenia and/or bipolar disorder, and this model would be refuted if any relatively specific susceptibility loci for schizoaffective disorder were confidently identified.

Some further outstanding issues:



References:

Cardno AG, Rijsdijk FV, Sham PC, Murray RM, McGuffin P. A twin study of genetic relationships between psychotic symptoms. American Journal of Psychiatry 2002;159:539-545.

Lichtenstein P, Yip BH, Björk C, Pawitan Y, Cannon TD, Sullivan PF, Hultman CM. Common genetic determinants of schizophrenia and bipolar disorder in Swedish families: a population-based study. Lancet 2009;373:234-9.

Owen MJ, Craddock N. Diagnosis of functional psychoses: time to face the future. Lancet 2009;373:190-191.


Related News: Schizophrenia Genetics 2: The Rise of GWAS

Comment by:  Chris Carter
Submitted 7 April 2010
Posted 8 April 2010

I wonder whether the relative lack of success in schizophrenia GWAS may be because the origin of schizophrenia may lie not so much in the genetic make-up of people with schizophrenia themselves, but in their prenatal experience, and possibly with the genes of the mother rather than with those of the offspring. Famine, rubella, influenza, herpes (HSV1 and HSV2), and poliovirus infection as well as high fever during pregnancy have all been listed as risk factors for the offspring developing schizophrenia in later life, as have maternal preeclampsia and obstetric complications. (See page at Polygenic Pathways for the many references.)

Maternal resistance to these effects is likely to be gene-dependent. Is it worth considering GWAS in the mothers rather than in the offspring?


Related News: GWAS Goes Bigger: Large Sample Sizes Uncover New Risk Loci, Additional Overlap in Schizophrenia and Bipolar Disorder

Comment by:  David J. Porteous, SRF Advisor
Submitted 21 September 2011
Posted 21 September 2011

Consorting with GWAS for schizophrenia and bipolar disorder: same message, (some) different genes
On 18 September 2011, Nature Genetics published the results from the Psychiatric GWAS Consortium (PGC) of two separate, large-scale GWAS analyses, for schizophrenia (Ripke et al., 2011) and for bipolar disorder (Sklar et al., 2011), and a joint analysis of both. By combining forces across several consortia that have previously published separately, we should now have some clarity and definitive answers.

For schizophrenia, the Stage 1 GWAS discovery data came from 9,394 cases and 12,462 controls from 17 studies, imputing 1,252,901 SNPs. The Stage 2 replication sample comprised 8,442 cases and 21,397 controls. Of the 136 SNPs which reached genomewide significance in Stage 1, 129 (95 percent) mapped to the MHC locus, long known to be associated with risk of schizophrenia. Of the remaining seven SNPs, five mapped to previously identified loci. In total, just 10 loci met or exceeded the criteria of genomewide significance of p < 5 x 10^-8 at Stage 1 and/or Stage 2. The 10 "best" SNPs identified eight loci: MIR137, TRIM26, CSMD1, CNNM2, NT5C2, and TCF4 were tagged by intragenic SNPs, while the remaining two were at some distance from a known gene (343 kb from PCGEM1 and 126 kb from CCDC68). More important than the absolute significance levels, the overall odds ratios (with 95 percent confidence intervals) ranged from 1.08 (0.96-1.20) to 1.40 (1.28-1.52). These fractional increases contrast with the ~10-fold increase in risk to the first-degree relative of someone with schizophrenia (Gottesman et al., 2010).
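For readers less familiar with the arithmetic behind these figures, here is a brief sketch. It assumes the conventional rationale for the genomewide threshold, namely a Bonferroni correction of a 0.05 error rate over roughly one million independent common-variant tests; that rationale is an assumption brought in for illustration and is not stated in the papers discussed here. The SNP counts are those quoted above.

```python
# Conventional rationale for the genomewide threshold (an assumption here):
# Bonferroni correction of 0.05 across ~1 million independent tests.
family_wise_alpha = 0.05
independent_tests = 1_000_000
genomewide_threshold = family_wise_alpha / independent_tests

# Counts quoted in the comment: Stage 1 significant SNPs and those in the MHC.
significant_snps = 136
mhc_snps = 129
mhc_share = mhc_snps / significant_snps

print(f"genomewide threshold: {genomewide_threshold:.1e}")
print(f"share of significant SNPs in the MHC: {mhc_share:.1%}")
print(f"significant SNPs outside the MHC: {significant_snps - mhc_snps}")
```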

Six of these eight loci have been reported previously, but ZNF804A, a past favorite, was noticeably absent from the "top 10" list. The main attention now will surely be on MIR137, a newly discovered locus which encodes a microRNA, mir137, known to regulate neuronal development. The authors remark that 17 predicted MIR137 targets had a SNP with a p < 10^-4, more than twice as many as for the control gene set (p < 0.01), though this relaxed significance cutoff seems somewhat arbitrary and warrants further examination. The result for MIR137 immediately raises the questions, Does the "risk" SNP affect MIR137 function directly or indirectly, and if so, does it affect the expression of any of the putative targets identified here? These are fairly straightforward questions: positive answers are vital to the biological validation of these statistical associations. As has been the case for follow-up studies of ZNF804A, however (reviewed by Donohoe et al., 2010), unequivocal answers from GWAS "hits" can be hard to come by, not least because of the very modest relative risks that they confer. Let us hope that this is not the case for MIR137, but it is of passing note that for two of the eight replication cohorts, the effect for MIR137 was in the opposite direction from the Stage 1 finding. Taken together with the odds ratios reported in the range of 1.11-1.22, the effect size for the end phenotype of schizophrenia may be challenging to validate functionally. Perhaps a relevant intermediate phenotype more proximal to the gene will prove tractable.

For bipolar disorder, Stage 1 comprised 7,481 cases versus 9,250 controls, and identified 34 promising SNPs. These were replicated in Stage 2 in an independent set of 4,496 cases and a whopping 42,422 controls: 18 of the 34 SNPs survived at p <0.05. Taking Stage 1 and 2 together confirmed the previous "hot" finding for CACNA1C (Odds ratio = 1.14) and introduced a new candidate in ODZ4 (Odds ratio = 0.88, i.e., the minor allele is presumably "protective" or under some form of selection). Previous candidates ANK3 and SYNE1 looked promising at Stage 1, but did not replicate at Stage 2.

Finally, in a combined analysis of schizophrenia plus bipolar disorder versus controls, three of the respective "top 10" loci, CACNA1C, ANK3, and the ITIH3-ITIH4 region, came out as significant overall. This is consistent with the earlier evidence from the ISC for an overlap between the polygenic index for schizophrenia and bipolar disorder (Purcell et al., 2009). It is also consistent with the epidemiological evidence for shared genetic risk between schizophrenia and bipolar disorder (Lichtenstein et al., 2009; Gottesman et al., 2010).

What can we take from these studies? The authorship lists alone speak to the size of the collaborative effort involved and the sheer organizational task. Depending on your point of view, the fact that most of the positive findings were reported on previously could be seen as valuable "replication," or as unnecessary duplication of cost and effort. Whichever way you look at it, though, just two new loci for schizophrenia and one for bipolar looks like a modest return for such a gargantuan investment. It begs the question as to whether the GWAS approach is gaining the hoped-for traction on major mental illness. Indeed, the evidence suggests that the technology tide is rapidly turning away from allelic association methods and towards rare mutation detection by copy number variation, exome, and/or whole-genome sequencing (Vacic et al., 2011; Xu et al., 2011).

Family studies are, as ever and always, of critical importance in genetics, not least to distinguish between inherited and de novo mutations. While the emphasis of GWAS has been on the impact of common, ancient allelic variation, it has become ever more obvious from both past linkage studies and from contemporary GWAS and CNV studies just how heterogeneous these conditions are, and how little note individual cases and families take of conventional DSM diagnostic boundaries. Improved genetic and other tools through which to stratify risk, define phenotypes, and predict outcomes are clearly needed. Whether such tools can be derived from GWAS data remains to be seen. It is important to remind ourselves of two things. First, case-control association studies tell us something about the average impact (odds ratio, with confidence interval) of a given allele in the population studied. In these very large GWAS, this measure of impact will approximate the European population average. The odds ratios tell us that the impact per allele is modest. Second, and more importantly in some ways, the allele frequencies also tell us that the vast majority of allele carriers are not affected. Likewise, a high proportion of cases are not carriers. In the main, these alleles are subtle risk modifiers rather than causal variants. That said, follow-up studies may define rare, functional genetic variants in MIR137 or CACNA1C or ANK3 that are tagged by the risk allele and that have sufficiently strong effects in a subset of cases for a causal link to be made. With these new GWAS data in hand, these sorts of questions can now be addressed.
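The statement that most carriers are unaffected and many cases are not carriers follows directly from a modest odds ratio combined with a low population risk. The sketch below works this through under stated assumptions: a simple carrier/non-carrier model, a population lifetime risk of 1 percent, a carrier frequency of 20 percent, and an odds ratio of 1.2. All three numbers are hypothetical values chosen for illustration, not figures taken from the PGC papers.

```python
def carrier_risks(pop_risk: float, carrier_freq: float, odds_ratio: float):
    """Return (risk in non-carriers, risk in carriers).

    Finds the non-carrier risk p0 such that the frequency-weighted average
    of carrier and non-carrier risk equals the population risk, given the
    odds ratio between the two groups. Solved by bisection.
    """
    def risk_in_carriers(p0: float) -> float:
        carrier_odds = odds_ratio * p0 / (1.0 - p0)
        return carrier_odds / (1.0 + carrier_odds)

    lo, hi = 0.0, 1.0
    for _ in range(200):
        p0 = (lo + hi) / 2.0
        average = carrier_freq * risk_in_carriers(p0) + (1.0 - carrier_freq) * p0
        if average > pop_risk:
            hi = p0
        else:
            lo = p0
    p0 = (lo + hi) / 2.0
    return p0, risk_in_carriers(p0)


# Hypothetical inputs (assumptions for illustration only).
POP_RISK, CARRIER_FREQ, ODDS_RATIO = 0.01, 0.20, 1.2

risk_noncarrier, risk_carrier = carrier_risks(POP_RISK, CARRIER_FREQ, ODDS_RATIO)
carriers_among_cases = CARRIER_FREQ * risk_carrier / POP_RISK

print(f"risk in non-carriers:        {risk_noncarrier:.4f}")
print(f"risk in carriers:            {risk_carrier:.4f}")
print(f"carriers who are unaffected: {1.0 - risk_carrier:.1%}")
print(f"cases who are not carriers:  {1.0 - carriers_among_cases:.1%}")
```

Under these assumed inputs, carrier risk stays close to the population risk, so nearly all carriers remain unaffected, and because the allele is only slightly enriched among cases, most cases do not carry it.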

It should also be said that there is clearly a wealth of potentially valuable information lying below the surface of the most statistically significant findings, but how to sort the true from the false associations? Should the MIR137 finding, and the targets of MIR137, be substantiated by biological analysis, then that would certainly be something well worth knowing and following up on. Network analysis by gene ontology and protein-protein interaction may yield more, but these approaches need to be treated with caution when not securely anchored to a biologically validated starting point. Epistasis and pleiotropy are most likely playing a role, but even in these large sample sets, obtaining the power to establish statistical (as opposed to biological) evidence is challenging. All told, one is left thinking that more incisive findings have come, and will in the future come, from family-based approaches, through structural studies (CNVs and chromosome translocations), and, in the near future, whole-genome sequencing of cases and relatives.

References:

Ripke S, Sanders AR, Kendler KS, Levinson DF, Sklar P, Holmans PA, Lin DY, Duan J, Ophoff RA, Andreassen OA, Scolnick E, Cichon S, St Clair D, Corvin A, Gurling H, Werge T, Rujescu D, Blackwood DH, Pato CN, Malhotra AK, Purcell S, Dudbridge F, Neale BM, Rossin L, Visscher PM, Posthuma D, Ruderfer DM, Fanous A, Stefansson H, Steinberg S, Mowry BJ, Golimbet V, de Hert M, Jönsson EG, Bitter I, Pietiläinen OP, Collier DA, Tosato S, Agartz I, Albus M, Alexander M, Amdur RL, Amin F, Bass N, Bergen SE, Black DW, Børglum AD, Brown MA, Bruggeman R, Buccola NG, Byerley WF, Cahn W, Cantor RM, Carr VJ, Catts SV, Choudhury K, Cloninger CR, Cormican P, Craddock N, Danoy PA, Datta S, de Haan L, Demontis D, Dikeos D, Djurovic S, Donnelly P, Donohoe G, Duong L, Dwyer S, Fink-Jensen A, Freedman R, Freimer NB, Friedl M, Georgieva L, Giegling I, Gill M, Glenthøj B, Godard S, Hamshere M, Hansen M, Hansen T, Hartmann AM, Henskens FA, Hougaard DM, Hultman CM, Ingason A, Jablensky AV, Jakobsen KD, Jay M, Jürgens G, Kahn RS, Keller MC, Kenis G, Kenny E, Kim Y, Kirov GK, Konnerth H, Konte B, Krabbendam L, Krasucki R, Lasseter VK, Laurent C, Lawrence J, Lencz T, Lerer FB, Liang KY, Lichtenstein P, Lieberman JA, Linszen DH, Lönnqvist J, Loughland CM, Maclean AW, Maher BS, Maier W, Mallet J, Malloy P, Mattheisen M, Mattingsdal M, McGhee KA, McGrath JJ, McIntosh A, McLean DE, McQuillin A, Melle I, Michie PT, Milanova V, Morris DW, Mors O, Mortensen PB, Moskvina V, Muglia P, Myin-Germeys I, Nertney DA, Nestadt G, Nielsen J, Nikolov I, Nordentoft M, Norton N, Nöthen MM, O'Dushlaine CT, Olincy A, Olsen L, O'Neill FA, Orntoft TF, Owen MJ, Pantelis C, Papadimitriou G, Pato MT, Peltonen L, Petursson H, Pickard B, Pimm J, Pulver AE, Puri V, Quested D, Quinn EM, Rasmussen HB, Réthelyi JM, Ribble R, Rietschel M, Riley BP, Ruggeri M, Schall U, Schulze TG, Schwab SG, Scott RJ, Shi J, Sigurdsson E, Silverman JM, Spencer CC, Stefansson K, Strange A, Strengman E, Stroup TS, Suvisaari J, Terenius L, Thirumalai S, Thygesen JH, Timm S, Toncheva D, van den Oord E, van Os J, van Winkel R, Veldink J, Walsh D, Wang AG, Wiersma D, Wildenauer DB, Williams HJ, Williams NM, Wormley B, Zammit S, Sullivan PF, O'Donovan MC, Daly MJ, Gejman PV. Genome-wide association study identifies five new schizophrenia loci. Nat Genet. 2011 Sep 18.

Psychiatric GWAS Consortium Bipolar Disorder Working Group, Sklar P, Ripke S, Scott LJ, Andreassen OA, Cichon S, Craddock N, Edenberg HJ, Nurnberger JI Jr, Rietschel M, Blackwood D, Corvin A, Flickinger M, Guan W, Mattingsdal M, McQuillin A, Kwan P, Wienker TF, Daly M, Dudbridge F, Holmans PA, Lin D, Burmeister M, Greenwood TA, Hamshere ML, Muglia P, Smith EN, Zandi PP, Nievergelt CM, McKinney R, Shilling PD, Schork NJ, Bloss CS, Foroud T, Koller DL, Gershon ES, Liu C, Badner JA, Scheftner WA, Lawson WB, Nwulia EA, Hipolito M, Coryell W, Rice J, Byerley W, McMahon FJ, Schulze TG, Berrettini W, Lohoff FW, Potash JB, Mahon PB, McInnis MG, Zöllner S, Zhang P, Craig DW, Szelinger S, Barrett TB, Breuer R, Meier S, Strohmaier J, Witt SH, Tozzi F, Farmer A, McGuffin P, Strauss J, Xu W, Kennedy JL, Vincent JB, Matthews K, Day R, Ferreira MA, O'Dushlaine C, Perlis R, Raychaudhuri S, Ruderfer D, Hyoun PL, Smoller JW, Li J, Absher D, Thompson RC, Meng FG, Schatzberg AF, Bunney WE, Barchas JD, Jones EG, Watson SJ, Myers RM, Akil H, Boehnke M, Chambert K, Moran J, Scolnick E, Djurovic S, Melle I, Morken G, Gill M, Morris D, Quinn E, Mühleisen TW, Degenhardt FA, Mattheisen M, Schumacher J, Maier W, Steffens M, Propping P, Nöthen MM, Anjorin A, Bass N, Gurling H, Kandaswamy R, Lawrence J, McGhee K, McIntosh A, McLean AW, Muir WJ, Pickard BS, Breen G, St Clair D, Caesar S, Gordon-Smith K, Jones L, Fraser C, Green EK, Grozeva D, Jones IR, Kirov G, Moskvina V, Nikolov I, O'Donovan MC, Owen MJ, Collier DA, Elkin A, Williamson R, Young AH, Ferrier IN, Stefansson K, Stefansson H, Thorgeirsson T, Steinberg S, Gustafsson O, Bergen SE, Nimgaonkar V, Hultman C, Landén M, Lichtenstein P, Sullivan P, Schalling M, Osby U, Backlund L, Frisén L, Langstrom N, Jamain S, Leboyer M, Etain B, Bellivier F, Petursson H, Sigurdsson E, Müller-Myhsok B, Lucae S, Schwarz M, Schofield PR, Martin N, Montgomery GW, Lathrop M, Oskarsson H, Bauer M, Wright A, Mitchell PB, Hautzinger M, Reif A, Kelsoe JR, Purcell SM. Large-scale genome-wide association analysis of bipolar disorder reveals a new susceptibility locus near ODZ4. Nat Genet. 2011 Sep 18.

Lichtenstein P, Yip BH, Björk C, Pawitan Y, Cannon TD, Sullivan PF, Hultman CM. Common genetic determinants of schizophrenia and bipolar disorder in Swedish families: a population-based study. Lancet. 2009 Jan 17; 373(9659):234-9.

Gottesman II, Laursen TM, Bertelsen A, Mortensen PB. Severe mental disorders in offspring with 2 psychiatrically ill parents. Arch Gen Psychiatry. 2010 Mar 1; 67(3):252-7.

Donohoe G, Morris DW, Corvin A. The psychosis susceptibility gene ZNF804A: associations, functions, and phenotypes. Schizophr Bull. 2010 Sep 1; 36(5):904-9.

Purcell SM, Wray NR, Stone JL, Visscher PM, O'Donovan MC, Sullivan PF, Sklar P. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. 2009 Aug 6; 460(7256):748-52.

Vacic V, McCarthy S, Malhotra D, Murray F, Chou HH, Peoples A, Makarov V, Yoon S, Bhandari A, Corominas R, Iakoucheva LM, Krastoshevsky O, Krause V, Larach-Walters V, Welsh DK, Craig D, Kelsoe JR, Gershon ES, Leal SM, Dell'Aquila M, Morris DW, Gill M, Corvin A, Insel PA, McClellan J, King MC, Karayiorgou M, Levy DL, DeLisi LE, Sebat J. Duplications of the neuropeptide receptor gene VIPR2 confer significant risk for schizophrenia. Nature. 2011 Mar 24; 471(7339):499-503.

Xu B, Roos JL, Dexheimer P, Boone B, Plummer B, Levy S, Gogos JA, Karayiorgou M. Exome sequencing supports a de novo mutational paradigm for schizophrenia. Nat Genet. 2011 Jan 1; 43(9):864-8.


Related News: GWAS Goes Bigger: Large Sample Sizes Uncover New Risk Loci, Additional Overlap in Schizophrenia and Bipolar Disorder

Comment by:  Patrick Sullivan, SRF Advisor
Submitted 26 September 2011
Posted 26 September 2011
  I recommend the Primary Papers

The two papers appearing online in Nature Genetics last Sunday are truly important additions to our increasing knowledge base for these disorders. The core analyses have been presented multiple times at international meetings in the past two years.

Since then, the available sample sizes for both schizophrenia and bipolar disorder have grown considerably. If the recently published data are any guide, the next round of analyses should be particularly revealing.

The PGC results and almost all of the data that were used in these reports are available by application to the controlled-access repository.

Please see the references for views of this area that contrast with those of Professor Porteous.

References:

Sullivan P. Don't give up on GWAS. Molecular Psychiatry. 2011 Aug 9.

Kim Y, Zerwas S, Trace SE, Sullivan PF. Schizophrenia genetics: where next? Schizophr Bull. 2011;37:456-63.


Related News: GWAS Goes Bigger: Large Sample Sizes Uncover New Risk Loci, Additional Overlap in Schizophrenia and Bipolar Disorder

Comment by:  Edward Scolnick
Submitted 28 September 2011
Posted 29 September 2011
  I recommend the Primary Papers

It is clear in human genetics that common variants and rare variants have frequently been detected in the same genes. Numerous examples exist in many diseases. The bashing of GWAS in schizophrenia and bipolar illness indicates, on the part of those who make such comments, a lack of understanding of human genetics and where the field is. When these studies were initiated five years ago, next-generation sequencing was not available. Large samples of populations or trios or quartets did not exist. The international consortia have worked to collect such samples, which are available for GWAS now, as well as for detailed sequencing studies. Before these studies began there was virtually nothing known about the etiology of schizophrenia and bipolar illness. The DISC1 gene translocation in the famous family was an important observation in that family. But almost a decade later there are still no convincing data that variants in DISC1 or many of its interacting proteins are involved in the pathogenesis of human schizophrenia or major mental illness.

Sequencing studies touted to be the Occam's razor for the field are beginning, and already, as in the past in this field, preemptive papers are appearing inadequately powered to draw any conclusions with certainty. Samples collected by the consortia will be critical to clarify the role of rare variants. This will take time and care so as not to set the field back into the morass it used to be. GWAS are basically modern public health epidemiology providing important clues to disease etiology. Much work is clearly needed once hits are found, just as it has been in traditional epidemiology. But in many fields, GWAS has already led to important biological insights, and it is certain it will do so in this field as well because the underlying principles of human genetics apply to this field, also. The primary problem in the field is totally inadequate funding by government organizations that consistently look for shortcuts to gain insights and new treatments, and forget how genetics has transformed cancer, immunology, autoimmune and inflammatory diseases, and led to better diagnostics and treatments. The field will never understand the pathogenesis of these illnesses until the genetic architecture is deciphered. The first enzyme discovered in E. coli DNA biochemistry was a repair enzyme—not the enzyme that replicated DNA—and this was discovered through genetics. The progress in this field has been dramatic in the past five years. All doing this work realize that this is only a beginning and that there is a long hard road to full understanding. But to denigrate the beginning, which is clearly solid, makes no sense and indicates a provincialism unbecoming to a true scientist.


Related News: GWAS Goes Bigger: Large Sample Sizes Uncover New Risk Loci, Additional Overlap in Schizophrenia and Bipolar Disorder

Comment by:  Nick Craddock, Michael O'Donovan (SRF Advisor)
Submitted 11 October 2011
Posted 11 October 2011

At the start of the millennium, only two molecular genetic findings could be said with a fair amount of confidence to be etiologically relevant to schizophrenia and bipolar disorder. The first of these was that deletions of chromosome 22q11 that are known to cause velo-cardio-facial syndrome also confer a substantial increase in risk of psychosis. The second was that a balanced translocation involving chromosomes 1 and 11, discovered by David St Clair, Douglas Blackwood, and colleagues (St Clair et al., 1990) and co-segregating with a range of psychiatric phenotypes in a single large family, was clearly relevant to the etiology of illness in that family (Blackwood et al., 2001). The latter finding has led to the conjecture, based upon a translocation breakpoint analysis reported by Kirsty Millar, David Porteous, and colleagues (Millar et al., 2000), that elevated risk in that family is conferred by altered function of a gene eponymously named DISC1. Just over a decade later, what can we now say with similar degrees of confidence? The relevance of deletions of 22q11 has stood the test of time (indeed, has strengthened) through further investigation (Levinson et al., 2011, being only one example), while the relevance of DISC1 remains conjecture. That the evidence implicating this gene is no stronger than it was all those years ago provides a clear illustration of the difficulties inherent in drawing etiological inferences from extremely rare mutations regardless of their effect size.

However, with the publication of several GWAS and CNV papers, culminating in the two mega-analyses reported by the PGC that are the subject of this commentary, one on schizophrenia, one on bipolar disorder, together reporting a total of six novel loci, very strong evidence has accumulated for approximately 20 new loci in psychosis. The majority of these are defined by SNPs, the remainder by copy number variants, and virtually all (including the rare, relatively high-penetrance CNVs) have emerged through the application of GWAS technology to large case-control samples, not through the study of linkage or families. Have GWAS approaches proven their worth? Clearly, the genetic findings represent the tip of a very deeply submerged iceberg, and it is possible that not all will stand the test of time and additional data, although the current levels of statistical support suggest the majority will do so. Nevertheless, the findings of SNP and CNV associations (including 22q11 deletions) seem to us to provide the first real signs of progress in uncovering strongly supported findings of primary etiological relevance to these disorders. Although SNP effects are small, the experience from other complex phenotypes is that statistically robust genetic associations, even those of very small effect, can highlight biological pathways of etiological (height; Lango Allen et al., 2010) and of possible therapeutic relevance (Alzheimer's disease; Jones et al., 2010). Moreover, it would seem intuitively likely that even if capturing the total heritable component of a disorder is presently a distant goal, the greater the number of associations captured, the better will be the snapshot of the sorts of processes that contribute to a disorder, and that might therefore be manipulated in its treatment. 
Thus, there is evidence that building even a very incomplete picture of the sort of genes that influence risk is an excellent method of informing understanding of pathogenesis of a highly complex disorder (or set of disorders).

As in previous GWAS and CNV endeavors, the PGC studies have required a significant degree of altruism from the hundreds of investigators and clinicians who have shared their data with little hope of significant academic credit. Moreover, where ethical approval permitted, the datasets have been made virtually open source for other investigators who are not part of the study. Sadly, this generosity of spirit is not matched in the rather curmudgeonly commentary provided by David Porteous. Rather than challenging the science or conduct of the study, it appears to us that the commentary takes the easier route of damnation by faint praise, distortion, and even innuendo.

The strongest finding, that being of association to the extended MHC region, is dismissed as "long known to be associated with risk of schizophrenia." How that knowledge was acquired a long time ago is unclear, but it cannot have been based upon data. It is true that weak and inconsistent associations at the MHC locus have been reported, even predating the molecular genetic era (McGuffin et al., 1978), but not until the landmark studies of the International Schizophrenia Consortium (2009), the Molecular Genetics of Schizophrenia Consortium (Shi et al., 2009), and the SGENE+ Consortium (Stefansson et al., 2009) have the findings been strong enough to be described as knowledge. Porteous' dismissive tone continues with the phrase "just 10 loci met.," the word "just" being a qualifier that seems designed to denigrate rather than challenge the results. Given the paucity of etiological clues, others might consider this a good yield. The observation in which the effect sizes at the detected loci are contrasted "with the ~10-fold increase in risk to the first-degree relative of someone with schizophrenia" is so fatuous it is difficult to believe its function is anything other than to insinuate in the mind of the reader the impression of failure. Yet no one remotely aware of the expectations behind GWAS would expect that the effect sizes of any common risk allele would bear any resemblance to that of family history, the latter reflecting the combined effects of many risk alleles.

Among the most important findings of the PGC schizophrenia group were those of strong evidence for association between a variant in the vicinity of a gene encoding regulatory RNA MIR137, and the subsequent finding that schizophrenia association signals were significantly enriched (P <0.01) among predicted targets of this regulatory RNA. Of course, like the other findings, there is room for the already very strong data to be further strengthened, but that finding alone opens up a whole new window in potential pathogenic mechanisms. Yet Porteous casually throws four handfuls of mud, dismissing the enrichment P <0.01 as a "relaxed significance cutoff," which "seems somewhat arbitrary," and that "warrants further examination," and commenting that "it is of passing note that for two of the eight replication cohorts, the direction of effect for MIR137 was in the opposite direction from the Stage 1 finding." If Porteous feels he has the expertise to pronounce on this analysis, it would behoove him to choose his words more carefully. Since when is a P value of <0.01 "relaxed" when applied to a test of a single hypothesis? Can he really be unaware of the longstanding convention of regarding P <0.05 as significant in specific hypothesis testing? If he is not unaware of this, why is it generally applicable but "somewhat arbitrary" in the context of the PGC study? As for "further examination being warranted," this is true of any scientific finding, but what does he specifically mean in the context of his commentary? And why is it of "passing note" that not all samples show trends in the same direction? In the context of the well-known issues in GWAS concerning individual small samples and power, what is surprising about that? There may be simple answers to these questions, but we find it difficult to draw any conclusion other than that the choice of language is another attempt to sow seeds of doubt through innuendo rather than analysis.

The remark that "ZNF804A, a past favourite, was noticeably absent" falls well short of the standard one might expect of serious discourse. The choice of language suggests a desire to denigrate rather than analyse, and to insinuate without specific evidence that any interest in this gene should now be over. In fact, the largest study of this gene to date is that of Williams et al. (2010), which actually includes at least two-thirds of the PGC discovery dataset and is based on over 57,000 subjects, a sample almost three times as large as the mega-analysis sample of the PGC.

Porteous' overall conclusion from the two studies is "whichever way you look at it, though, just two new loci for schizophrenia and one for bipolar looks like a modest return for such a gargantuan investment." This appraisal is misleading. The PGC studies were actually relatively small investments, being based on a synthesis of pre-existing data. Since the studies use existing data, there is naturally an expectation that some of the loci identified will have been previously reported as either significant or have otherwise been flagged up as of interest, while some will be new. Overall, the return on the GWAS investment is not just the six novel loci (rather than three); it is the totality of the findings, which, as noted above, currently number about 20 loci. The schizophrenia research community should also be made aware, if they are not already, that the return on these investments is not "one off"; it is cumulative. In the coming years, the component datasets will continue to generate a return in new gene discoveries (including CNVs yet to be reported by the PGC) as they are added (at essentially no cost) to other emerging GWAS datasets being generated largely through charitable support. With the returns in the bank already, one could (and we do) argue that the investment is negligible, particularly given the cost in human and economic terms of continued ignorance about these illnesses that blight so many lives.

It is true that with so little being known compared with what is yet to be known, the biological insights that can be made from the existing data are limited. This is equally true of the common and rare variants identified so far, and we are not aware of any of the "incisive findings" that Porteous claims have already come from alternative approaches, although the emergence of strong evidence for deletions at NRXN1 as a susceptibility variant for schizophrenia through meta-analysis of case-control GWAS data (one of the extra returns on the GWAS data we referred to above) deserves that description (Kirov et al., 2009). But this is not a cause for despair; in contrast to the future promises made on behalf of other as yet unproven designs, for eyes and minds that are open enough to see, the recent papers provide unambiguous evidence for a straightforward route to identifying more genes and pathways involved in the disorder. Even Porteous has partial sight of this, since he notes that "there is clearly a wealth of potentially valuable information lying below the surface of the most statistically significant findings." What he appears unable to see is "how to sort the true from the false associations?" The answer for a large number of loci is simple. Better-powered studies based upon larger sample sizes.

We would like to add a note of caution for those who too readily denigrate case-control approaches in favor of hyping other approaches, none of which are yet so well proven routes to success. We are not against those approaches; indeed, we are actively involved in them. But we are concerned that the hype surrounding sequencing, and the generation of what we think are unrealistic expectations, will make those designs vulnerable to attack from those who seem only too keen to make premature and inaccurate pronouncements of failure, who seem desperate to derive straw from nuggets of gold. If, as we believe is likely, it turns out to be quite a few years more before sequencing studies become sufficiently powered to provide large numbers of robust findings, as for GWAS, the consequence could be withdrawal of substantial government funding before those designs have had a chance to live up to their potential. That such an outcome has already largely been achieved for GWAS in some countries might be a source of rejoicing in some quarters, but it should also send out a warning to all who broadly hold the view that understanding the genetics of these disorders is central to understanding their origins, and to improving their future management.

The recent PGC papers represent an impressive, international collaboration based upon methodologies that have a proven track record in delivering important biological insights into other complex disorders, and now in psychiatry. Given the complexity of psychiatric phenotypes, we believe it is likely that a variety of approaches, paradigms, and ideas will be essential for success, including the approaches espoused by those who believe the evidence is compatible with essentially Mendelian inheritance. Inevitably, there will be sincerely held differences of opinion concerning the best way forward, and, of course, in any area of science, reasoned arguments based upon a fair assessment of the evidence are essential. Nevertheless, given there are sufficient uncertainties about what can be realistically delivered in the short term by the newer technologies, we suggest that the cause of bringing benefit to patients will most likely be better served by humility, realism, and a constructive discussion in which there is no place for belittling real achievements, for arrogance, or for dogmatic posturing.

References

Blackwood DH, Fordyce A, Walker MT, St Clair DM, Porteous DJ, Muir WJ. Schizophrenia and affective disorders--cosegregation with a translocation at chromosome 1q42 that directly disrupts brain-expressed genes: clinical and P300 findings in a family. Am J Hum Genet. 2001 Aug;69(2):428-33.

International Schizophrenia Consortium. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. 2009 Aug 6;460(7256):748-52.

Jones L, Holmans PA, Hamshere ML, Harold D, Moskvina V, Ivanov D, et al. Genetic evidence implicates the immune system and cholesterol metabolism in the etiology of Alzheimer's disease. PLoS One. 2010 Nov 15;5(11):e13950. Erratum in: PLoS One. 2011;6(2).

Kirov G, Rujescu D, Ingason A, Collier DA, O'Donovan MC, Owen MJ. Neurexin 1 (NRXN1) deletions in schizophrenia. Schizophr Bull. 2009 Sep;35(5):851-4. Epub 2009 Aug 12. Review.

Lango Allen H, Estrada K, Lettre G, Berndt SI, Weedon MN, Rivadeneira F, et al. Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature. 2010 Oct 14;467(7317):832-8.

Levinson DF, Duan J, Oh S, Wang K, Sanders AR, Shi J, et al. Copy number variants in schizophrenia: confirmation of five previous findings and new evidence for 3q29 microdeletions and VIPR2 duplications. Am J Psychiatry. 2011 Mar;168(3):302-16.

McGuffin P, Farmer AE, Rajah SM. Histocompatability antigens and schizophrenia. Br J Psychiatry. 1978 Feb;132:149-51.

Millar JK, Wilson-Annan JC, Anderson S, Christie S, Taylor MS, Semple CA, et al. Disruption of two novel genes by a translocation co-segregating with schizophrenia. Hum Mol Genet. 2000 May 22;9(9):1415-23.

Shi J, Levinson DF, Duan J, Sanders AR, Zheng Y, Pe'er I, et al. Common variants on chromosome 6p22.1 are associated with schizophrenia. Nature. 2009 Aug 6;460(7256):753-7.

St Clair D, Blackwood D, Muir W, Carothers A, Walker M, Spowart G, et al. Association within a family of a balanced autosomal translocation with major mental illness. Lancet. 1990 Jul 7;336(8706):13-6.

Stefansson H, Ophoff RA, Steinberg S, Andreassen OA, Cichon S, Rujescu D, et al. Common variants conferring risk of schizophrenia. Nature. 2009 Aug 6;460(7256):744-7.

The Schizophrenia Psychiatric Genome-Wide Association Study (GWAS) Consortium. Genome-wide association study identifies five new schizophrenia loci. Nat Genet. 2011 Sep 18;43(10):969-976.

Williams HJ, Norton N, Dwyer S, Moskvina V, Nikolov I, Carroll L, et al. Fine mapping of ZNF804A and genome-wide significant evidence for its involvement in schizophrenia and bipolar disorder. Mol Psychiatry. 2011 Apr;16(4):429-41.

View all comments by Nick Craddock
View all comments by Michael O'Donovan

Related News: GWAS Goes Bigger: Large Sample Sizes Uncover New Risk Loci, Additional Overlap in Schizophrenia and Bipolar Disorder

Comment by:  Todd Lencz, Anil Malhotra (SRF Advisor)
Submitted 11 October 2011
Posted 11 October 2011

It is worth re-emphasizing that efforts such as the Psychiatric GWAS Consortium do not rule out potentially important discoveries from alternative strategies such as endophenotypic approaches or examination of rare variants. Indeed, such strategies will be necessary to understand the functional mechanisms implicated by GWAS hits.

Moreover, we note that the two recently published PGC papers were not designed to exclude a role for previously identified candidate loci such as DISC1 (Hodgkinson et al., 2004), or prior GWAS findings such as rs1344706 at ZNF804A (Williams et al., 2011). For both these loci, and many others that have been proposed, meta-analysis of available samples suggests very small effect sizes (OR ~1.1), as might be expected for common variants. As noted in Supplementary Table S12 of the schizophrenia PGC paper (Ripke et al., 2011), the currently available sample size (~9,000 cases/~12,000 controls) of the discovery cohort was still underpowered to detect variants with odds ratios of 1.1, especially if they have a minor allele frequency of 20 percent or below.
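To make the power argument concrete, here is a rough sketch of an approximate power calculation for a two-sided allelic association test. It is not the calculation behind Supplementary Table S12; the normal approximation, the use of the minor allele frequency as the control allele frequency, and the function name `allelic_test_power` are all assumptions made for illustration.

```python
from math import log, sqrt
from statistics import NormalDist


def allelic_test_power(odds_ratio, maf, n_cases, n_controls, alpha=5e-8):
    """Approximate power of a two-sided allelic test (normal approximation)."""
    # Assumption: the risk-allele frequency in controls equals the MAF.
    p_ctrl = maf
    odds_case = odds_ratio * p_ctrl / (1 - p_ctrl)
    p_case = odds_case / (1 + odds_case)
    # Each subject contributes two alleles to the 2x2 allele table.
    a_case, a_ctrl = 2 * n_cases, 2 * n_controls
    # Standard error of the log odds ratio from expected allele counts.
    se = sqrt(
        1 / (a_case * p_case) + 1 / (a_case * (1 - p_case))
        + 1 / (a_ctrl * p_ctrl) + 1 / (a_ctrl * (1 - p_ctrl))
    )
    norm = NormalDist()
    z_effect = log(odds_ratio) / se
    z_crit = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf(z_effect - z_crit) + norm.cdf(-z_effect - z_crit)


# Figures quoted in the comment: OR 1.1, MAF 20 percent, ~9,000 cases, ~12,000 controls.
power = allelic_test_power(1.1, 0.20, 9000, 12000)
print(f"approximate power at genomewide significance: {power:.3f}")
```

Under these assumptions the approximation puts power well under one half at the genomewide threshold, and rerunning the function with larger sample sizes shows how quickly power recovers.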

An instructive example arises from the field of diabetes genetics. An association of a missense variant (rs1801282, Pro12Ala) in PPARG to type 2 diabetes was first reported in a sample of n = 91 Japanese-American patients (Deeb et al., 1998). Many subsequent studies failed to replicate the effect, and the initial large GWAS meta-analysis (involving >14,000 cases and ~18,000 controls; Zeggini et al., 2007) only detected the association at a p-value that would be considered non-significant by today's standard (p = 1.7 × 10^-6). Interestingly, the authors deemed the association to be confirmed, and the result was widely accepted within that field. Subsequent meta-analysis, involving twice as many subjects (total n = 67,000), finally obtained conventional genomewide levels of significance (p < 5 × 10^-8; Gouda et al., 2010).

References:

Deeb SS, Fajas L, Nemoto M, Pihlajamäki J, Mykkänen L, Kuusisto J, Laakso M, Fujimoto W, Auwerx J. A Pro12Ala substitution in PPARgamma2 associated with decreased receptor activity, lower body mass index and improved insulin sensitivity. Nat Genet. 1998 Nov;20(3):284-7.

Gouda HN, Sagoo GS, Harding AH, Yates J, Sandhu MS, Higgins JP. The association between the peroxisome proliferator-activated receptor-gamma2 (PPARG2) Pro12Ala gene variant and type 2 diabetes mellitus: a HuGE review and meta-analysis. Am J Epidemiol. 2010 Mar 15;171(6):645-55.

Hodgkinson CA, Goldman D, Jaeger J, Persaud S, Kane JM, Lipsky RH, Malhotra AK. Disrupted in schizophrenia 1 (DISC1): association with schizophrenia, schizoaffective disorder, and bipolar disorder. Am J Hum Genet. 2004 Nov;75(5):862-72.

Williams HJ, Norton N, Dwyer S, Moskvina V, Nikolov I, Carroll L, Georgieva L, Williams NM, Morris DW, Quinn EM, Giegling I, Ikeda M, Wood J, Lencz T, Hultman C, Lichtenstein P, Thiselton D, Maher BS; Molecular Genetics of Schizophrenia Collaboration (MGS) International Schizophrenia Consortium (ISC), SGENE-plus, GROUP, Malhotra AK, Riley B, Kendler KS, Gill M, Sullivan P, Sklar P, Purcell S, Nimgaonkar VL, Kirov G, Holmans P, Corvin A, Rujescu D, Craddock N, Owen MJ, O'Donovan MC. Fine mapping of ZNF804A and genome-wide significant evidence for its involvement in schizophrenia and bipolar disorder. Mol Psychiatry. 2011 Apr;16(4):429-41.

Zeggini E, Weedon MN, Lindgren CM, Frayling TM, Elliott KS, Lango H, Timpson NJ, Perry JR, Rayner NW, Freathy RM, Barrett JC, Shields B, Morris AP, Ellard S, Groves CJ, Harries LW, Marchini JL, Owen KR, Knight B, Cardon LR, Walker M, Hitman GA, Morris AD, Doney AS; Wellcome Trust Case Control Consortium (WTCCC), McCarthy MI, Hattersley AT. Replication of genome-wide association signals in UK samples reveals risk loci for type 2 diabetes. Science. 2007 Jun 1;316(5829):1336-41.

View all comments by Todd Lencz
View all comments by Anil Malhotra

Related News: Study Claiming Eight Types of Schizophrenia Called Into Question

Comment by:  Michael O'Donovan, SRF Advisor
Submitted 3 October 2014
Posted 3 October 2014

Comment by Michael O'Donovan, Gerome Breen, Brendan Bulik-Sullivan, Mark Daly, Sarah Medland, Benjamin Neale, Stephan Ripke, Patrick Sullivan, Peter Visscher, Naomi Wray

[Editor's note: Reprinted from PubMed Commons, without changes, under the Creative Commons attribution 3.0 license.]

In this study published on September 15, Arnedo et al. asserted that schizophrenia is a heterogeneous group of disorders underpinned by different genetic networks mapping to differing sets of clinical symptoms. As a result of their analyses, Arnedo et al. have made remarkable and perhaps unprecedented claims regarding their capacity to subtype schizophrenia. This paper has received considerable media attention. One claim features in many media reports, that schizophrenia can be delineated into "8 types". If these claims are replicable and consistent, then the work reported in this paper would constitute an important advance in our knowledge of the etiology of schizophrenia.

Unfortunately, these extraordinary claims are not justified by the data and analyses presented. Their claims are based upon complex (and we believe flawed) analyses that are said to reveal links between clusters of clinical data points and patterns of data generated by looking at millions of genetic data points. Instead of the complexities favored by Arnedo et al., there are far simpler alternative explanations for the patterns they observed. We believe that the authors have not excluded important alternative explanations – if we are correct, then the major conclusions of this paper are invalidated.

Analyses such as these rely on independence in many ways: among variables used in prediction, absence of artifactual relationships between genotypes and clinical variables, and between the methods of assessing significance and replication. Below we identify five specific areas of concern that are not adequately addressed in the manuscript, each of which calls into question the conclusions of this study.

A. Ancestry/population stratification.
Two of the three samples the authors studied (MGS and CATIE) have substantial proportions of subjects of European and African ancestry. The third sample is from southern Europe. Ancestry is an extremely well known confounder in genetic studies with a great capacity to yield false associations. Correct inference from genomic data in samples like these requires exceptional care. In the analyses they present, there is almost no mention of how this known bias was addressed or evaluation of its impact on their results. In the samples they used, their references to sets of SNPs that track together is essentially the definition of uncorrected population structure/stratification. Indeed, a central component of their statistical methodology – nonnegative matrix factorization – has been previously employed as a method for ancestry inference in the population genetics literature.

We were unsuccessful in attempts to obtain the full list of SNPs that Arnedo et al. analyzed. Instead, we evaluated the SNPs listed in Table S3 (448 SNP entries, 245 unique SNPs as SNPs could be present more than once, and 237 SNPs with valid allele frequencies in HapMap3). We computed the absolute value of the difference in allele frequencies between the CEU (northwest European) and YRI (Yorubas from Nigeria) groups for all HapMap3 SNPs passing basic quality control (688K SNPs genotyped using Affymetrix 6.0 arrays to match the MGS sample). We then contrasted the SNPs used by Arnedo et al. with all other affy6 SNPs. The Table S3 SNPs had markedly larger differences between a European and an African group. The mean for the absolute difference in allele frequency was 0.27 for the Table S3 SNPs used by Arnedo et al. versus 0.19 for all other SNPs. These highly significant differences underscore our concerns about population stratification bias.
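The comparison described above reduces to a mean absolute allele-frequency difference computed over two sets of SNPs. A minimal sketch follows; the SNP identifiers and frequencies are hypothetical toy values (not HapMap3 data), and the helper name `mean_abs_freq_diff` is an assumption for illustration.

```python
from statistics import mean


def mean_abs_freq_diff(freq_pop1, freq_pop2, snp_ids):
    """Mean |f1 - f2| over snp_ids; both dicts map SNP id -> allele frequency."""
    return mean(abs(freq_pop1[s] - freq_pop2[s]) for s in snp_ids)


# Hypothetical toy allele frequencies standing in for CEU and YRI.
ceu = {"snp1": 0.10, "snp2": 0.50, "snp3": 0.30, "snp4": 0.80, "snp5": 0.25}
yri = {"snp1": 0.60, "snp2": 0.45, "snp3": 0.35, "snp4": 0.30, "snp5": 0.20}

selected = {"snp1", "snp4"}      # stand-in for the Table S3 SNPs
others = set(ceu) - selected     # stand-in for all other array SNPs

sel_diff = mean_abs_freq_diff(ceu, yri, selected)
other_diff = mean_abs_freq_diff(ceu, yri, others)
print(f"selected: {sel_diff:.2f}, others: {other_diff:.2f}")
```

A markedly larger value for the selected set than for the background set is the pattern the commenters report (0.27 versus 0.19), which is what one would expect if the selected SNPs partly track ancestry.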

B. X chromosome (chrX).
We noted that 15 of 237 of the SNPs in Table S3 were on chrX (again, Table S3 contains a fraction of the SNPs used in the modeling). Inclusion of chrX SNPs will partly reflect the sex of participants. Arnedo et al. say in their supplement that they include sex as a covariate in their regressions, but they do not describe how they account for sex in their matrix factorization. For example, since males have only one copy of chrX, genotypes for males will be either 0 or 1 whereas chrX genotypes for females will be either 0, 1 or 2. This difference will be salient to clustering algorithms such as those employed by the authors, so it seems likely that some component of the clusters of individuals identified by Arnedo et al. simply reflect genotype differences between sexes rather than clinical features of schizophrenia. It is well-known in statistical genetics that the sex chromosomes require special handling, but this issue is not addressed by Arnedo et al.
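The coding issue can be illustrated with expected genotype values at a single chrX SNP. This is a sketch under simple assumptions (same allele frequency in both sexes, a SNP outside the pseudoautosomal regions); the function name `expected_dosage` and the alternative "0/2" male coding are included for illustration only and say nothing about what Arnedo et al. actually did.

```python
def expected_dosage(allele_freq, sex, male_coding="0/1"):
    """Expected coded genotype at a chrX SNP for a given sex."""
    if sex == "F":
        return 2 * allele_freq       # two copies, coded 0, 1 or 2
    if male_coding == "0/1":
        return allele_freq           # one copy, coded 0 or 1
    if male_coding == "0/2":
        return 2 * allele_freq       # one copy, coded 0 or 2
    raise ValueError(f"unknown male coding: {male_coding}")


p = 0.3  # hypothetical allele frequency, identical in both sexes
gap_01 = expected_dosage(p, "F") - expected_dosage(p, "M", "0/1")
gap_02 = expected_dosage(p, "F") - expected_dosage(p, "M", "0/2")
print(f"female minus male mean, males coded 0/1: {gap_01:.2f}")
print(f"female minus male mean, males coded 0/2: {gap_02:.2f}")
```

With males coded 0/1, the mean coded genotype differs between the sexes even though the allele frequency is identical, which is the kind of signal a clustering algorithm can pick up. Coding males 0/2 removes the difference in means but not in variances, so it is at best a partial remedy.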

C. Linkage disequilibrium (LD).
Pairs of SNPs that are physically close in the genome are often correlated due to LD. Furthermore, in samples containing individuals with different ancestry, SNPs on different chromosomes whose allele frequencies differ between populations will appear to be correlated. These are both well-known phenomena from population genetics.
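The second phenomenon is easy to reproduce in a small simulation. The sketch below draws two unlinked SNPs independently within each of two hypothetical populations and then pools them; the allele frequencies, sample sizes, and seed are arbitrary assumptions chosen only to make the effect visible.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate(rng, n, freq_snp1, freq_snp2):
    """Genotypes (0/1/2) at two independent SNPs for n individuals."""
    snp1 = [(rng.random() < freq_snp1) + (rng.random() < freq_snp1) for _ in range(n)]
    snp2 = [(rng.random() < freq_snp2) + (rng.random() < freq_snp2) for _ in range(n)]
    return snp1, snp2


rng = random.Random(0)
# Hypothetical populations: both SNPs are rare in A and common in B.
a1, a2 = simulate(rng, 2000, 0.1, 0.1)
b1, b2 = simulate(rng, 2000, 0.7, 0.7)

r_a = pearson(a1, a2)                   # within population A
r_b = pearson(b1, b2)                   # within population B
r_pooled = pearson(a1 + b1, a2 + b2)    # mixed sample, ancestry ignored
print(f"within A: {r_a:.2f}, within B: {r_b:.2f}, pooled: {r_pooled:.2f}")
```

Within each population the two SNPs are uncorrelated by construction, yet the pooled sample shows a clear correlation that reflects ancestry alone.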

The typical size of blocks defined by high LD is on the order of 20,000 bases, but LD is far from uniform across the genome. Using a large European sample genotyped with Affymetrix 6.0 arrays, we had previously computed the locations of particularly large blocks of LD (defined using SNPs with r² > 0.5). The first step in the statistical methodology described by Arnedo et al. is to identify so-called "SNP sets" – sets of SNPs that travel together – which the authors believe contain some information about clinical subtypes of schizophrenia: "we first identified sets of interacting … SNPs that cluster within subgroups of individuals … regardless of clinical status" (no LD limitations were imposed). Of the 237 SNPs in Table S3 from Arnedo et al., 153 (65%) mapped to exceptionally large LD blocks larger than 100,000 bases (median 275 kb, interquartile range 165-653 kb, maximum 1.2 Mb).

Arnedo et al. claim repeatedly that sets of SNPs that travel together are informative about clinical subtypes of schizophrenia. A more parsimonious interpretation of the SNP clusters identified by Arnedo et al. is that these SNPs represent a combination of (1) SNPs in large LD blocks and (2) SNPs whose allele frequencies differ substantially between European and African sample subsets. Indeed, matrix factorization algorithms similar to the methods employed by Arnedo et al. have been used to identify regions with long-range LD.

D. SNP selection.
Arnedo et al. conducted genetic clustering analyses on 2,891 SNPs selected on the basis of in-sample P-values from analysis of association with case-control status and selected from a total of ~700,000 SNPs. It is therefore expected that linear or non-linear combinations of these SNPs will be associated with case-control status in the same sample (their risk statistic); this is true even if the selected SNPs are not truly associated. A permutation test is used to assess the significance of the observed phenotype/genotype clustering. In this permutation test, subjects are randomly allocated to "SNP sets" but, since the SNPs were selected because they differ in allele frequency between cases and controls, this procedure does not generate a valid null distribution. As a result, the reported P-values are incorrect.

The strategy used by Arnedo et al. is an example of estimation and selection of effects in a dataset and then testing (or re-estimating) them in the same data, a common pitfall of prediction analyses. To construct a valid permutation test, the authors should have randomized case-control status in the association analysis step, selected a new set of ~3,000 SNPs and generated a distribution of their coincident test index under a truly null distribution.
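The difference between an invalid and a valid permutation scheme can be sketched on purely null data. In the example below, the summed case-control difference over the selected SNPs is a simplified stand-in statistic, not the "coincident test index" of Arnedo et al., and the "naive" scheme is only an analogue of the problem described above; all sizes, the seed, and the function names are assumptions for illustration.

```python
import random


def mean_diffs(genotypes, is_case):
    """Per-SNP difference in mean genotype, cases minus controls."""
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    diffs = []
    for snp in genotypes:
        case_sum = sum(g for g, c in zip(snp, is_case) if c)
        diffs.append(case_sum / n_case - (sum(snp) - case_sum) / n_ctrl)
    return diffs


def select_top(diffs, k):
    """(index, sign) for the k SNPs with the largest absolute difference."""
    order = sorted(range(len(diffs)), key=lambda j: abs(diffs[j]), reverse=True)
    return [(j, 1 if diffs[j] >= 0 else -1) for j in order[:k]]


def score_stat(diffs, chosen):
    """Signed sum of differences over the chosen SNPs."""
    return sum(sign * diffs[j] for j, sign in chosen)


rng = random.Random(1)
n, m, k, n_perm = 100, 300, 20, 50
# Null data: genotypes are generated with no relation to case status.
genotypes = [[(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(n)]
             for _ in range(m)]
is_case = [True] * (n // 2) + [False] * (n - n // 2)

obs_diffs = mean_diffs(genotypes, is_case)
chosen = select_top(obs_diffs, k)          # selection on the real labels
observed = score_stat(obs_diffs, chosen)

naive_hits = valid_hits = 0
for _ in range(n_perm):
    perm = is_case[:]
    rng.shuffle(perm)
    perm_diffs = mean_diffs(genotypes, perm)
    # Naive: reuse the SNPs and signs already selected on the real labels.
    naive_hits += score_stat(perm_diffs, chosen) >= observed
    # Valid: repeat the selection step inside every permutation.
    valid_hits += score_stat(perm_diffs, select_top(perm_diffs, k)) >= observed

naive_p = (naive_hits + 1) / (n_perm + 1)
valid_p = (valid_hits + 1) / (n_perm + 1)
print(f"naive p = {naive_p:.3f}, valid p = {valid_p:.3f}")
```

Because the SNPs were chosen for their in-sample differences, the naive scheme compares the observed statistic against a null distribution that never went through selection, so it reports a small p-value even though the data contain no signal. Repeating the selection inside each permutation puts the observed and permuted statistics on the same footing.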

E. Replication.
Replication of results is a well-acknowledged strategy for generating confidence in reported findings. Arnedo et al. state that they replicated their findings in two samples but, upon closer examination, it is unclear precisely what replicated, exactly how this was done, and whether the degree of "replication" deviated from that expected by chance. It was also unclear whether the replication control samples were or were not independent from the discovery sample. Such non-independence is another common pitfall in prediction or validation analysis.

Conclusions.
Given the remarkable claims made by Arnedo et al., it is essential that alternative explanations be excluded. Unfortunately, the authors do not provide the necessary evidence. As presented, their methodology is opaque (even to experts), meaning that their results cannot be independently validated. Arnedo et al. do not consider alternative explanations for the phenomena that they observe, such as confounding from ancestry and LD, even though these are well-known issues for the statistical methods that they employ and have been studied extensively in the statistical and population genetics literature. In addition, their multistep analysis approach is subject to multiple issues as noted above.

We believe that it is highly likely that the results of Arnedo et al. are not relevant for schizophrenia. We urge great caution in the interpretation of the results of this study.

References:

Press release from Washington University (St Louis)

Media coverage via Google News

Nonnegative matrix factorization and ancestry inference

Pitfalls of predicting complex traits from SNPs

Using principal components analysis to identify regions with long-range LD

View all comments by Michael O'Donovan

Related News: Study Claiming Eight Types of Schizophrenia Called Into Question

Comment by:  Alexander B. Niculescu
Submitted 6 October 2014
Posted 6 October 2014

Schizophrenia Subtypes: (Some) Right Ideas, (Some) Fuzzy Execution
The recent paper by Arnedo et al. (Arnedo et al., 2014) on "uncovering the hidden risk architecture of the schizophrenias" has three main ideas: 1) empirical discovery of groups of SNPs clustering with groups of schizophrenia subjects; 2) empirical discovery of groups of clinical features (what I have called in the past "phenes"; see Niculescu et al., 2006) clustering with groups of schizophrenia subjects; and 3) trying to put it all together (similar to the PhenoChipping approach put forward by myself and others in the past; see Niculescu et al., 2006).

The fact that groups of SNPs working together in networks can account for the missing heritability is not a new idea. It has been proposed before, as epistasis (Pezawas et al., 2008; Nicodemus et al., 2010) or as more complex combinatoric models integrating the environment (Patel et al., 2010; Ayalew et al., 2012). To people working in the gene expression field, which is closer to biology, it has been a given for many years, from the operon of Jacob and Monod (Jacob et al., 2005) to co-acting gene expression groups (CAGE; Niculescu et al., 2000) or co-expression networks (Zhang and Horvath, 2005; de Jong et al., 2012).

The devil is in the details of the execution, made difficult to judge by some lack of transparency about methodology and how independent the testing cohorts were. From more minor but more obvious caveats, such as SNPs being potentially in LD or potential population stratification, to more major but less obvious caveats, such as that this type of clustering will give you a fit-to-cohort effect that is dependent on the subjects used and the quality of the clinical information available on the subjects (often cursory in the large cohorts used for GWAS), things start to become fuzzy. All in all, it is too early to draw conclusions about how many subtypes of schizophrenia there are.

There are ways to mend this. First, it would be good to see converging lines of evidence scoring such as convergent functional genomics (CFG) used to prioritize SNPs and their associated genes for fit-to-disease first, prior to the clustering, as a way of preventing a fit-to-cohort effect (Niculescu and Le-Niculescu, 2010). Second, the reproducibility in completely independent, non-overlapping cohorts, of the locked panels of markers or "pheno-geno" subtypes, needs to be demonstrated unambiguously, such as was done by others in the past (Ayalew et al., 2012). Third, it is likely that schizophrenia is just one dimension of pathology, albeit a main one, in schizophrenia subjects. Combining also the dimensions of mood and anxiety will provide a better description of the clinical mental landscape (co-morbidities) (Niculescu et al., 2010) present, in fact, in these subjects and may account for some of the "missing reproducibility."

References:

Arnedo J, Svrakic DM, Del Val C, Romero-Zaliz R, Hernandez-Cuervo H,, Fanous AH, Pato MT, Pato CN, de Erausquin GA, Cloninger CR, Zwir I. Uncovering the Hidden Risk Architecture of the Schizophrenias: Confirmation in Three Independent Genome-Wide Association Studies. Am J Psychiatry. 2014 Sep 15. Abstract

Niculescu AB, Lulow LL, Ogden CA, Le-Niculescu H, Salomon DR, Schork NJ, Caligiuri MP, Lohr JB. PhenoChipping of psychotic disorders: a novel approach for deconstructing and quantitating psychiatric phenotypes. Am J Med Genet B Neuropsychiatr Genet. 2006 Sep 5; 141B(6):653-62. Abstract

Pezawas L, Meyer-Lindenberg A, Goldman AL, Verchinski BA, Chen G, Kolachana BS, Egan MF, Mattay VS, Hariri AR, Weinberger DR. Evidence of biologic epistasis between BDNF and SLC6A4 and implications for depression. Mol Psychiatry. 2008 Jul; 13(7):709-16. Abstract

Nicodemus KK, Callicott JH, Higier RG, Luna A, Nixon DC, Lipska BK, Vakkalanka R, Giegling I, Rujescu D, St Clair D, Muglia P, Shugart YY, Weinberger DR. Evidence of statistical epistasis between DISC1, CIT and NDEL1 impacting risk for schizophrenia: biological validation with functional neuroimaging. Hum Genet. 2010 Apr; 127(4):441-52. Abstract

Patel SD, Le-Niculescu H, Koller DL, Green SD, Lahiri DK, McMahon FJ, Nurnberger JI, Niculescu AB. Coming to grips with complex disorders: genetic risk prediction in bipolar disorder using panels of genes identified through convergent functional genomics. Am J Med Genet B Neuropsychiatr Genet. 2010 Jun 5; 153B(4):850-77. Abstract

Ayalew M, Le-Niculescu H, Levey DF, Jain N, Changala B, Patel SD, Winiger E, Breier A, Shekhar A, Amdur R, Koller D, Nurnberger JI, Corvin A, Geyer M, Tsuang MT, Salomon D, Schork NJ, Fanous AH, O'Donovan MC, Niculescu AB. Convergent functional genomics of schizophrenia: from comprehensive understanding to genetic risk prediction. Mol Psychiatry. 2012 Sep; 17(9):887-905. Abstract

Jacob F, Perrin D, Sánchez C, Monod J, Edelstein S. [The operon: a group of genes with expression coordinated by an operator. C.R.Acad. Sci. Paris 250 (1960) 1727-1729]. C R Biol. 2005 Jun; 328(6):514-20. Abstract

Niculescu AB, Segal DS, Kuczenski R, Barrett T, Hauger RL, Kelsoe JR. Identifying a series of candidate genes for mania and psychosis: a convergent functional genomics approach. Physiol Genomics. 2000 Nov 9; 4(1):83-91. Abstract

Zhang B, Horvath S. A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol. 2005; 4():Article17. Abstract

de Jong S, Boks MP, Fuller TF, Strengman E, Janson E, de Kovel CG, Ori AP, Vi N, Mulder F, Blom JD, Glenthøj B, Schubart CD, Cahn W, Kahn RS, Horvath S, Ophoff RA. A gene co-expression network in whole blood of schizophrenia patients is independent of antipsychotic-use and enriched for brain-expressed genes. PLoS One. 2012; 7(6):e39498. Abstract

Niculescu AB, Le-Niculescu H. The P-value illusion: how to improve (psychiatric) genetic studies. Am J Med Genet B Neuropsychiatr Genet. 2010 Jun 5; 153B(4):847-9. Abstract

Related News: Study Claiming Eight Types of Schizophrenia Called Into Question

Comment by:  Hakon Heimer
Submitted 8 October 2014
Posted 8 October 2014

[Editor's note: The discussion on this paper continues apace as of October 7, with replies to the critics from several of the authors of the original report by Arnedo et al. at PubMed Commons. They have submitted their original reply to SRF as well (below), and for the remaining replies and any future comments, we direct you to the discussion at PubMed.]

Related News: Study Claiming Eight Types of Schizophrenia Called Into Question

Comment by:  Gabriel de Erausquin
Submitted 9 October 2014
Posted 9 October 2014
  I recommend the Primary Papers

On behalf of: C. Robert Cloninger, MD, PhD (Departments of Psychiatry and Genetics, Washington University School of Medicine, St. Louis, MO, USA); Igor Zwir, PhD (Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA; Department of Computer Science and Artificial Intelligence, University of Granada, Spain); Gabriel A. de Erausquin, MD, PhD (Roskamp Laboratory of Brain Development, Modulation and Repair, Department of Psychiatry and Behavioral Neurosciences, University of South Florida, Tampa, FL, USA); Dragan M. Svrakic, MD, PhD (Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA); Coral del Val, PhD (Department of Computer Science and Artificial Intelligence, University of Granada, Spain); Javier Arnedo, M.S. (Department of Computer Science and Artificial Intelligence, University of Granada, Spain); Rocio Romero-Zaliz, PhD (Department of Computer Science and Artificial Intelligence, University of Granada, Spain); Helena Hernandez-Cuervo, MD, BSc (Roskamp Laboratory of Brain Development, Modulation and Repair, Department of Psychiatry and Behavioral Neurosciences, University of South Florida, Tampa, FL, USA)

Two Distinct Perspectives and Methodological Approaches to GWAS
We expected our paper uncovering the hidden risk architecture of the schizophrenias to be controversial because it takes a fundamentally new approach to solve problems that have plagued the field of medical genetics for more than a decade without resolution (Arnedo et al., 2014). We went through rigorous peer review regarding the method with experts in bioinformatics and genetics (Arnedo et al., 2013) and then again regarding the application of our new approach to the schizophrenias (Arnedo et al., 2014). The critical comments of Breen and other colleagues of Sullivan highlight the fact that we have a fundamentally new approach with a distinct perspective and properties from the traditional method they have used for several years. It is important to understand how our novel approach differs from the traditional one in order to appreciate the opportunities it provides for the advancement of science.

First, in our novel approach, common disorders are recognized to have a complex etiology in which multiple genetic and environmental variables interact in complex ways to influence the risk of disease in an individual person. Breen and commentators are experienced in approaches to genome-wide association studies (GWAS) that allow detection of only the average (additive) effects of individual genes in groups of people. We regard the traditional group-wise approach to GWAS as overly restrictive because it is well established that genes typically function in concert with one another, resulting in substantial epistasis in schizophrenia and many other common disorders (Risch, 1990). Fitness, health, and behavior are properties of persons, not genes. Nevertheless, the traditional approach can be useful when its a priori assumptions are satisfied. The traditional and novel approaches to GWAS should be viewed as being complementary perspectives and procedures.

Second, our novel approach allows for the possibility of complex relationships between multilocus genotypes and multifaceted phenotypes. In other words, different sets of genetic polymorphisms can be associated with the same phenotype ("equi-finality" or genetic heterogeneity), and the same set of genetic polymorphisms can be associated with multiple distinct phenotypes ("multifinality" or pleiotropy). By contrast, traditional GWAS approaches focus only on heterogeneous groups of cases, neglecting phenotypic variability among cases. It is important to note that our novel approach does not make any a priori assumption that complexity is present, but we do allow it to emerge from the data when present, as occurred clearly in our analysis of multiple independent samples of people with the schizophrenias.

Third, we carry out person-centered analyses that specify genotypic-phenotypic relationships within each individual by using clustering methods in which subjects are one matrix dimension and the other matrix dimension is either genotypic or phenotypic information. In other words, our analyses are informative about each individual, thereby providing a basis for identifying specific causes of illness in each person as a basis for tailoring treatment in a personalized way. In contrast, the traditional GWAS considers only average effects in groups of people, making it unjustified to say anything with confidence about a specific individual. Such traditional methods have failed to produce any reliable genetic test for the diagnosis of any psychiatric disorder in an individual person. In addition, phenomena such as population stratification and linkage disequilibrium may confound interpretation of the group-wise statistics of traditional GWAS, whereas these phenomena are easily evaluated in our person-centered approach.

Fourth, our approach is entirely data-driven using machine learning and data mining procedures that are unbiased and unsupervised (i.e., no a priori assumptions are made). Such data-driven methods have been used successfully in many fields of science but not for GWAS prior to our work. In contrast, the a priori assumptions made in the traditional GWAS approach, as used by Breen and commentators, have produced only weak associations between the average effects of individual genes and the diagnosis of schizophrenia. In fact, Sullivan and others proposed the formation of the Psychiatric Genomics Consortium (PGC) to address the problem of weak and inconsistent associations. Unfortunately, even large samples still produce only weak associations (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014). In addition, even large collections of subjects have encountered what is called the "missing heritability problem" in medical genetics: most of the variability in risk for the schizophrenias has remained unexplained. For example, the resemblance of monozygotic co-twins of people with schizophrenia is much greater than can be explained by the average effects of individual genes, indicating that multiple genes act in concert to influence risk (Risch, 1990). Whereas the heritability of schizophrenia is estimated to be 81 percent from twin studies, only about 25 percent of the variability has been explained by traditional GWAS.

In contrast, by applying our novel approach to GWAS we observed that sets of single nucleotide polymorphisms (SNPs) allow identification of individuals at very high risk (70 percent or more), and we replicated the findings consistently in three independent samples. Our results can explain much more about disease risk than traditional group-wise approaches, so we are not surprised that such strong findings may come as a shock to those who have become accustomed to the weak associations identified by traditional GWAS. Again, we expected that this would be very controversial, but we're optimistic that this fundamentally new approach will open up many new opportunities for people interested in medical genetics. Our results were so unexpected that peer-reviewers demanded replication in independent samples before acceptance. Even we were delighted with the strong replication: 81 percent of 42 SNP sets associated with 70 to 100 percent risk of schizophrenia replicated almost exactly across three independent samples, and different SNP sets were associated with distinct phenotypic syndromes in which the gene products suggest possible pathways by which the functions and expression of the genes in the brain may explain the different clinical features of individual patients.

In summary, traditional approaches to GWAS focus on the average effects of individual genes in groups of people, whereas our novel approach focuses on the interactive effects of groups of genes in an individual person. Consequently the differences between these two approaches have profound consequences for the way they view and handle phenomena like linkage disequilibrium, population stratification, and X linkage. Unfortunately, Breen and colleagues have not adequately appreciated the profound differences between the traditional methods of GWAS with which they are familiar and the novel approach we have developed. As a result, the criticisms they made reflect their concerns about problems that regularly occur when a traditional group-wise approach is implemented, but these concerns may have minimal pertinence to our person-centered approach and findings.

The Facts About Ancestry and Population Stratification
Breen and commentators expressed concern that our findings may be an artifact. As scientists we are committed to trying to disconfirm findings we have made, no matter how strong the existing evidence may be. Findings about association need to be resolved experimentally at the molecular level not only for our findings but also for the findings of other published GWAS, which we plan to do in the near future. We have previously considered the variables that concern Breen and commentators carefully, but did not report these observations in detail for two reasons. First, their impact was empirically negligible for our approach, as we will describe, and we prioritized space to the variables that were most significant. Second, although the phenomena that concern Breen and commentators are often a serious problem for traditional group-wise approaches to GWAS, they are not problematic for our novel approach because it directly tests the association between genotypic variability and phenotypic variability within individuals after deconstruction of the observed or hidden structure of the population. We will describe the facts about each of these phenomena and explain why these criticisms fall short of explaining our findings. Breen and commentators expressed concern that we did not take the necessary steps to correct for population stratification bias in the way they would have done using their traditional group-wise statistical approach. They suggest that the clusters we identified simply reflect "SNPs whose allele frequencies differ substantially between European and African sample subsets," and go so far as to claim that the SNP sets we uncovered may not be relevant to schizophrenia at all. However, for that claim to be valid, ethnicity must have a strong influence on the risk and symptoms of schizophrenia, but that requirement is unlikely to be satisfied based on previous observations and was not found in our data. 
We did consider both sex and ancestry as covariates in the pre-selection of SNPs with at least loose association with schizophrenia. This pre-selection was performed to reduce the large search space using the logistic association function included in the PLINK software suite (Purcell et al., 2007). Our analysis was performed in this way to be compatible with the supplementary tables reported by Shi et al. (2009) for African Americans (AA), European-Americans (EA), and individuals of mixed African and European ancestry (AA-EA). The most important fact about ethnic stratification is that there were multiple examples of SNP sets containing varying mixes of subjects from different subpopulations for each disease subtype in each of our three independent samples of subjects. For example, in the Molecular Genetics of Schizophrenia (MGS) sample, the SNP set 22_11 was represented by 48 percent AA and 52 percent EA, SNP set 21_8 by 55 percent AA and 45 percent EA, SNP set 31_22 by 53 percent AA and 47 percent EA, SNP set 54_51 by 79 percent AA and 21 percent EA, and SNP set 71_55 by 52 percent AA and 48 percent EA.

In fact, all the SNP sets that appeared to be ethnically stratified (i.e., contained mostly AA or EA subjects in MGS sample, such as 56_30 or 42_37) replicated their association with specific phenotypic indicators of different classes of schizophrenia in subjects of another ethnicity in CATIE or in the Portuguese sample. Although concerns about ethnic stratification may be valid elsewhere, ethnic stratification had little impact on our results and cannot explain the robust association of specific SNP sets with specific phenotypic sets regardless of ethnicity or sample. These observations show the great utility of detailed consideration of phenotypic variability in individual people in our approach, compared to the sensitivity to confounding by population stratification in traditional GWAS when heterogeneous phenotypes are lumped together indiscriminately as cases. The concerns of Breen and commentators about ethnic stratification point out a limitation of traditional group-wise GWAS that is averted by our novel approach. We thank them for drawing attention to another strength of our approach, one that we did not have enough space to report previously. We will discuss population stratification in more detail together with our reply about linkage disequilibrium (see posting 5), and discuss significance testing following our comments on replication in later sections of our reply to Part 2 of their comments (see posting 6).

The Facts About Gender and the X Chromosome
Breen and commentators express concern about gender effects in our results. Traditional GWAS focuses on average effects in heterogeneous groups, but our novel approach focuses on uncovering genotypic-phenotypic relationships in individuals regardless of their gender. Breen and commentators were concerned about the possible bias of results from 15 SNPs on the X chromosome among 245 SNPs in high-risk SNP sets. However, a simple test based on the number of chromosomes shows that 15 SNPs cannot substantially confound the results. It is true that three of our 42 high-risk SNP sets have some SNPs on the X chromosome, but when this is considered in context along with the remaining 39 SNP sets, the influence of gender was insignificant by a Kolmogorov test. All SNP sets have consistent associations with distinct phenotypic sets regardless of gender. The effects of gender and location of genetic variants on the X chromosome have a negligible influence on our findings. In fact, the small number of SNPs on the X chromosome in SNP sets at high risk for schizophrenia shows that our person-centered method does not select SNPs that are in LD indiscriminately; the X chromosome has many highly conserved sets of epistatic genes in LD that influence gender and brain function (Graves, 2010), but these are not overrepresented in our SNP sets at high risk for schizophrenia. We thank Breen and commentators for calling attention to another finding that demonstrates that our method for identifying SNP sets is highly selective for particular phenotypes.

The Challenge of Understanding and Accepting a Change in Perspective
Is our concurrent genotypic-phenotypic analysis of possible complex relationships a fruitful new approach without the limiting assumptions of standard GWAS, or are our observations really artifactual in ways that have been overlooked by us and by multiple sets of peer reviewers with relevant expertise about these novel methods? The American Journal of Psychiatry gave us generous space for the published article, including clinical vignettes with associated genotypic information to help people see what we have done, even if the technical details of the statistical procedures may seem obscure when you first start looking at complex genotype-phenotype relationships through the illuminating lens of sophisticated machine-learning and data-mining procedures. We expected that there would be widespread interest in and scrutiny of this new data-driven approach with less restrictive assumptions, so we prepared an extensive online supplement specifying procedures, all components of the sets of SNPs and clinical variables, as well as a detailed analysis of the associated gene products, their functions, and disease associations.

The full list of SNPs used in our analysis is being made available for others to continue to test. We believe in transparency and collegiality as key ingredients in the advance of science because they are essential for the spirit of empiricism that our data-driven method emphasizes. The precise procedures for reproducing the list of SNPs were already detailed in our supplemental information and should be reproducible by experienced investigators. We will continue to consider reasonable requests for assistance from qualified investigators.

Breen and commentators expressed their concern that the "methodology is opaque (even to experts), meaning that their results cannot be independently validated." First of all, the complexity of a method does not invalidate the approach: complex methods may be necessary to deconstruct and understand complex processes. We cannot continue to look for hidden relationships with methods that do not shine light where it is needed. That said, the manuscript was exhaustively evaluated under a strict peer review process, which included a separate report from an independent statistician. Because of their many insightful comments, there is no doubt that the referees understood the method and provided recommendations that we conscientiously addressed in the resubmission process. Moreover, the PGMRA method utilized in this work was also evaluated by expert reviewers in bioinformatics and genetics for the journal Nucleic Acids Research (Arnedo et al., 2013). The method is well-described but does require relevant expertise beyond what is required for traditional GWAS. Fortunately, we have made a web-server application of the method publicly available as a service to the field. PGMRA is applicable to a wide variety of analyses besides GWAS, including brain imaging and related methods for uncovering order in complex hidden relationships, which may help to further characterize the pathway from genotype to phenotype more objectively than can be done by categorical diagnoses or symptom inventories in samples so large that costs become prohibitive for thorough assessment. We know that many are increasingly criticizing overspecialization in the fields of science, but the neglect of strictly data-driven techniques from machine-learning and data-mining that do not require restrictive a priori assumptions may well be precisely what has prevented us from understanding the complex genotypic-phenotypic architecture of common disorders like the schizophrenias.

We understand that our new approach is challenging long-held assumptions and that there may be a desire by some to put the genie back in the bottle, but we feel that looking at the complexity of the schizophrenias is a necessary evolution for the field; it is an evolution whose time has come and is currently transforming other fields of science and genetics. There is overwhelming evidence across multiple disciplines that living systems and psychosocial behavior are simply too complex and interactive to ignore the real underlying complexity. Nonetheless, we were a bit surprised to see this discussion in a public forum that is not peer reviewed. We would rather have thoughtful constructive consideration of the scientific merits of alternative approaches, including their fundamental philosophical differences in perspective and goals, as well as scientific differences in assumptions and procedures. One of the major obstacles to evaluating GWAS is that it can be difficult or impossible for scientists in many fields to evaluate complex technical procedures with which they are unfamiliar. The challenge of changing one's perspective can be great and feel counterintuitive, as physicists experienced more than a century ago when quantum mechanics called into question our more natural inclination to a Newtonian perspective. That is why we feel it would have been more constructive to have neutral review by people with relevant expertise in many aspects of methods that span bioinformatics, statistics, genomics, and phenomics, all of which are needed to adequately judge the strengths and weaknesses of a novel approach like ours. Even people with extensive experience in traditional approaches to GWAS may not be sufficiently knowledgeable about these well-tested, but relatively new, machine-learning and data-mining techniques that have allowed us to develop a new, and, we hope, more generative approach to GWAS.

Nevertheless, as scientists we are dedicated to identifying and learning how to move our fields of inquiry forward in order to better understand the underlying mechanisms of disease and to identify effective personalized treatments for complex disorders. We have found it is crucial to pay balanced attention to both phenotypic variability and genotypic variability if we are ever to describe the complex development of common and complex medical disorders like the schizophrenias. We do not feel that this public forum is the best place to have this discussion with Breen and colleagues, but here too we may have a philosophical difference. That said, because they have chosen this forum to voice their criticism, we feel it is important that we take the time to address the facts and give people a broader context so that they can understand the arguments and our responses to their concerns. We feel that this is more of a misunderstanding and a miscommunication due to the lack of a common scientific and philosophical approach, and we hope that with time we will find more common ground. Ultimately, the data will settle any dispute.

The Facts About Linkage Disequilibrium
Breen and commentators also expressed concern that our SNP sets may be merely artifacts of blocks of markers in linkage disequilibrium (LD). LD is strictly defined as the non-random association of alleles of neighboring polymorphisms derived from single ancestral chromosomes, but some broad measures of LD extend the concept to include co-variation of polymorphisms that are not linked, including even associations among genetic variants on separate chromosomes (Reich et al., 2001). Many variables influence co-variation of polymorphisms, including demographic variables (admixture, population size, migration, ancestral population bottlenecks), selection (including epistasis), and variation in recombination rates in different parts of the genome (Slatkin et al., 2008; Wiehe and Slatkin, 1998). Consequently LD is a serious problem for the group-wise statistical approach of traditional GWAS, so care is taken to analyze groups with distinct ancestries separately in order to help disentangle different causes of association. However, in our novel person-centered approach, the identification of subpopulations is an intrinsic aspect of identifying the genotypic-phenotypic architecture. We identify sets of variables that naturally cluster within individual subjects as measured by covariance of polymorphisms or phenotypic traits within particular subgroups of individuals in a population. We identify SNP sets and phenotypic sets independently of one another and then test how these independently identified sets fit together like a lock and a key. In a strictly data-driven manner the hidden structure of the overall population is decomposed into subpopulations of subjects to allow valid tests of genotypic-phenotypic association despite admixture in the total population from which SNP sets and phenotypic sets are extracted (Pritchard and Donnelly, 2001).
We allow the possibility that some constituent SNPs of a particular set may be associated (in LD) as adventitious hitchhikers that are closely linked or may be epistatic sets that are functionally adaptive and maintained by selection pressure even though they are unlinked (Koch et al., 2013). However, being linked (co-localized) or in LD is neither a necessary nor a sufficient condition for being a constituent in a SNP set: set membership depends on co-variation of polymorphisms in particular subpopulations of individuals whether the genetic variants are in LD in the total population or not. LD is actually one way that epistatic sets of genetic variants can be maintained in functionally adaptive blocks if the epistatic selection is strong, but most interactive sets of genetic variants are not in LD. Accordingly, we uncover constituents of SNP sets regardless of their LD status as candidates for functionally adaptive epistatic sets. Then we measure their potential functional interaction by testing for their differential association with phenotypic variability. We also consider the known function of the genes and regulatory sites as part of our analysis of the complex pathway from genotypic networks to distinct clinical syndromes. Thus we jointly utilize genotypic, biological, and phenotypic information as part of an integrated systems analysis that allows for observed or hidden stratification in the total population.

In addition to this fundamental difference in conceptual and procedural approach to LD, the concerns of Breen and commentators about our findings are simply unfounded empirically. In total, approximately 2/3 of the SNPs in high-risk sets map to regions that are so far apart in genomic distance (greater than 100,000 base pairs) that they are highly unlikely to be in LD. We found that nine of 42 high-risk SNP sets have some SNPs located on different chromosomes. These facts indicate that the identified SNP sets are not the result of particular genomic constraints such as LD or being within the region of the same gene. In any case, the presence of LD would not explain or invalidate the association of groups of SNPs within a particular SNP set with a particular phenotypic set. For example, one of our SNP sets maps exclusively to SNPs upstream of the NTRK3 gene, a region also found to be strongly associated with schizophrenia by standard GWAS techniques in work published by the authors of the commentary. In addition, SNPs from another SNP set map inside the same gene. Each of these SNP sets involving different components of the NTRK3 gene is associated with different symptoms. Although LD is viewed as a statistical problem for traditional GWAS, in our person-centered approach it is viewed as the result of adaptive mechanisms that can conserve the functional connectivity of epistatic sets of genetic variants, thereby contributing to the differential development of individuals in subpopulations. The functional adaptation facilitated by gene-gene interactions is fundamentally important for healthy development of individuals and for the evolution of populations, as described in Sewall Wright's classical work on complex adaptive systems and evolution (Wright, 1982).
Our concurrent consideration of the functions of gene products and associations between different genotypic networks with specific phenotypic syndromes precludes any suggestion that the highly replicable effects we observed are artifacts.
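To make the distance argument concrete for readers, the following is a minimal sketch of how one might check whether the SNPs in a set are far apart. It is an illustration only, not the analysis reported in the paper: the SNP names and coordinates are invented, the helper functions are hypothetical, and it works on pairs of SNPs, which is an assumption about how the "far apart" criterion could be applied. Physical distance is also only a rough proxy for LD.

```python
from itertools import combinations

# Hypothetical SNP positions: name -> (chromosome, base-pair coordinate).
# These values are made up for illustration and are not from the study.
snp_set = {
    "snp_a": ("15", 88_100_000),
    "snp_b": ("15", 88_150_000),
    "snp_c": ("15", 88_400_000),
    "snp_d": ("6", 27_000_000),
}

MIN_DISTANCE_BP = 100_000  # distance threshold quoted in the text


def far_apart(pos1, pos2, min_bp=MIN_DISTANCE_BP):
    """True if two SNPs lie on different chromosomes or more than min_bp apart."""
    chrom1, bp1 = pos1
    chrom2, bp2 = pos2
    return chrom1 != chrom2 or abs(bp1 - bp2) > min_bp


def fraction_far_apart(positions):
    """Fraction of SNP pairs in a set that are far apart by the rule above."""
    pairs = list(combinations(positions.values(), 2))
    if not pairs:
        return 0.0
    return sum(far_apart(a, b) for a, b in pairs) / len(pairs)


def spans_chromosomes(positions):
    """True if the set contains SNPs from more than one chromosome."""
    return len({chrom for chrom, _ in positions.values()}) > 1


print(fraction_far_apart(snp_set))   # 5 of the 6 pairs exceed the threshold
print(spans_chromosomes(snp_set))    # True: chromosomes 15 and 6 are both present
```

A check of this kind says nothing about long-range LD or population structure, which would need to be estimated from genotype data directly.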

Breen and commentators have also suggested matrix factorization algorithms similar to the methods employed by us have been used to identify regions with long-range LD. This is certainly true and is not a problem in itself. Long-range LD is an indicator of functional connectivity that is not adequately explained by physical proximity, so it is included in what we want to detect in order to account for gene-gene interactions thoroughly (Wu et al., 2010). Matrix factorization methods like ours have been used for most of the current software applications in data-mining, including a wide variety of biomedical problems (Zwir et al., 2005; Zwir et al., 2005; Romero-Zaliz et al., 2008; Harari et al., 2010), facial recognition (Lee and Seung, 1999), gene expression (Mejia-Roa et al., 2008; Pascual-Montano et al., 2006; Tamayo et al., 2007), and other complex problems (Cichocki, 2009). There is no reason to avoid the use of this powerful method for pattern recognition within fuzzy data sets for uncovering hidden order within the complex association of genotypic and phenotypic variables that characterize complex medical disorders.
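For readers unfamiliar with matrix factorization, the following is a minimal sketch of non-negative matrix factorization with the multiplicative updates of Lee and Seung, applied to a toy subjects-by-SNPs matrix. It is not the PGMRA implementation: the genotype matrix is invented, the function names are hypothetical, and the choice of rank 2 and 200 iterations is an assumption made purely for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (subjects x SNPs) as W times H
    using multiplicative updates. Returns the pair (W, H)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Toy minor-allele counts (0, 1, or 2): rows are subjects, columns are SNPs.
# Subjects 0-2 carry variants at SNPs 0-2; subjects 3-5 at SNPs 3-5.
genotypes = [
    [2, 2, 1, 0, 0, 0],
    [2, 1, 2, 0, 0, 0],
    [1, 2, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 2],
    [0, 0, 0, 1, 2, 2],
    [0, 0, 0, 2, 2, 1],
]

w, h = nmf(genotypes, rank=2)
print(frobenius_error(genotypes, w, h))
```

Each row of W gives a subject's loading on each factor and each row of H gives a factor's loading on each SNP, so large entries in the same factor can be read as a candidate subject-by-SNP cluster. Real analyses would need to choose the rank, handle missing genotypes, and assess stability across initializations, none of which this sketch attempts.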

The Facts About Replication and Significance Testing of SNP Selection
Breen and commentators expressed concern about our replication process. Of course in traditional GWAS, replication has always been a serious problem, which is the basis for the rationale of PGC to carry out meta-analysis of large collections of samples despite their heterogeneity and limited phenotypic description. It was most challenging for us to identify samples with adequate clinical description to apply our novel approach, but the reward was in identifying strong effects that replicated consistently across three samples, including the Portuguese Islands study that used the same diagnostic instrument in a specific ethnic sample. The samples were independently recruited and independently analyzed, as we stated clearly in the published report. SNP sets, phenotypic sets and associations were separately calculated for the three samples to avoid weighted or biased aggregations. Then, we used a well-known co-clustering test based on the hypergeometric distribution to establish the replicability of results from one sample in the other. This test has been used widely in molecular biology (Zwir et al., 2005; Zwir et al., 2005; Tavazoie et al., 1999), and as a general strategy for validating clusters. For example, it has also been implemented into software packages such as TIBCO/Spotfire. Thus we feel that the concerns expressed by Breen and commentators about replication are overstated and empirically unfounded. Again, the strength of this new approach is that it allows us to avoid some of the major problems that plague traditional GWAS approaches.
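As an illustration of the kind of hypergeometric overlap test mentioned above, the following minimal sketch computes the probability of seeing at least a given overlap between two sets by chance. It assumes the test reduces to the upper tail of a hypergeometric distribution over a shared universe of items; the function name and the numbers are hypothetical and are not taken from the published analysis.

```python
from math import comb


def hypergeom_overlap_pvalue(universe, size_a, size_b, overlap):
    """Probability of an overlap of at least `overlap` items when a set of
    size_b is drawn at random, without replacement, from `universe` items
    of which size_a belong to the reference set."""
    total = comb(universe, size_b)
    upper = min(size_a, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 20 SNPs in total, a 5-SNP set found in one sample,
# a 5-SNP set found in another sample, and 4 SNPs shared between them.
p = hypergeom_overlap_pvalue(universe=20, size_a=5, size_b=5, overlap=4)
print(p)  # 76 / 15504, roughly 0.0049
```

A small value indicates that the overlap between the two sets is larger than random draws of the same sizes would typically produce. The choice of universe matters a great deal in practice, since a pre-filtered SNP list changes the null distribution.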

Breen and commentators also expressed their concern about the use of a permutation test, claiming that "because SNP sets differ in allele frequency between cases and controls, this procedure does not generate a valid null distribution." The permutation test was used not to establish the significance of the SNP sets, which was evaluated by the SKAT method (Wu et al., 2010), but rather to test the validity (and approximate probability) of the association between SNP sets and symptom sets. Controls were not used in this test at all, as they have no symptoms of psychosis. Moreover, these symptoms were not even evaluated in the reported inventories. The misunderstanding of Breen and commentators is probably due to a lack of familiarity with this new statistical procedure, which highlights the previously discussed difficulties people have when first trying to understand a novel approach.
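To show what a permutation test of set-to-set association among cases can look like, here is a minimal sketch. It is not the procedure used in the paper: it assumes each case carries a yes/no flag for membership in a SNP set and in a symptom set, uses the count of cases in both as the test statistic, and shuffles the symptom flags to build the null distribution. The function name and the data are hypothetical.

```python
import random


def permutation_pvalue(in_snp_set, in_symptom_set, n_permutations=10_000, seed=0):
    """Estimate how often shuffled symptom labels give at least as many
    cases belonging to both sets as were actually observed."""
    rng = random.Random(seed)
    observed = sum(g and s for g, s in zip(in_snp_set, in_symptom_set))
    shuffled = list(in_symptom_set)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between genotype and symptoms
        stat = sum(g and s for g, s in zip(in_snp_set, shuffled))
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical cases only (no controls): 40 cases, 10 of whom belong to
# both the SNP set and the symptom set.
snp_flags = [True] * 10 + [False] * 30
symptom_flags = [True] * 10 + [False] * 30
print(permutation_pvalue(snp_flags, symptom_flags, n_permutations=2000))
```

Because only cases are shuffled, the null distribution here reflects chance pairing of SNP sets and symptom sets among affected individuals; it does not test whether the SNP sets distinguish cases from controls.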

Conclusion
We appreciate the opportunity to clarify the fundamental differences between the assumptions and goals of traditional GWAS and our novel approach, which addresses the complexity of common disorders with sophisticated and well-validated machine-learning and data-mining methods. We hope that the profound differences between the approaches with which Breen and colleagues are familiar and those developed by us will stimulate greater understanding of the challenges faced by the fields of psychiatric and medical genetics. We recognize that this new approach will cause a period of reexamination of standard methodology in this field, but every major advance in genetics, and in all of science for that matter, has required flexibility and creative thinking. There are always things that we can improve upon in any method, and we recognize that many incremental improvements are essential for the advance of science.

We have put forth a new data-driven method that allows the uncovering of complex genotypic-phenotypic relations when they are present, without imposing this as an a priori assumption. The relationships we uncovered are in fact highly complex, which allowed us to identify individuals at high risk and to associate specific SNP clusters with specific clinical syndromes despite the presence of extensive pleiotropy and heterogeneity. This approach, like all those that have preceded it, is undoubtedly imperfect, will require refinement, and may ultimately give way to yet another approach that will explain more. Such methodological evolution is nothing more than the typical course of advancement in science. We hope that these exciting developments will lead to new ways to push the boundaries of accepted science, and help us to question prior assumptions that restrict our understanding of all the information embedded in data.

If this discussion has shown us nothing else, it is that this process of questioning and reflection has already begun. Ultimately, beyond all of the technical issues, our main goal is to help those in need. With schizophrenia, we know the need is great from the tremendous outpouring of requests for guidance and help that we have received, and we know that there are many people with other diseases who may benefit from our new approach. We can all be comforted knowing that our debate can bring us closer to doing what we are really here to do—that is, helping those suffering from debilitating diseases and finding ways to promote their health and well-being. Whatever path leads us there is worth considering. So let us not permit our philosophical or scientific differences to prevent us from allowing for a sufficient diversity in our tactics, because we never know what path will lead us toward our common goals of improving health and reducing the burden of disease.

References
Arnedo J, Svrakic DM, Del Val C, Romero-Zaliz R, Hernandez-Cuervo H, Fanous AH, Pato MT, Pato CN, de Erausquin GA, Cloninger CR, Zwir I. Uncovering the Hidden Risk Architecture of the Schizophrenias: Confirmation in Three Independent Genome-Wide Association Studies. Am J Psychiatry. 2014 Sep 15. Abstract

Arnedo J, Del Val C, de Erausquin GA, Romero-Zaliz R, Svrakic D, Cloninger CR, Zwir I. PGMRA: a web server for (phenotype x genotype) many-to-many relation analysis in GWAS. Nucleic Acids Res. 2013 Jul; 41(Web Server issue):W142-9. Abstract

Risch N. Linkage strategies for genetically complex traits. I. Multilocus models. Am J Hum Genet. 1990 Feb; 46(2):222-8. Abstract

Schizophrenia Working Group of the Psychiatric Genomics Consortium. Biological insights from 108 schizophrenia-associated genetic loci. Nature. 2014 Jul 24; 511(7510):421-7. Abstract

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, Sham PC. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007 Sep; 81(3):559-75. Abstract

Shi J, Levinson DF, Duan J, Sanders AR, Zheng Y, Pe'er I, Dudbridge F, Holmans PA, Whittemore AS, Mowry BJ, Olincy A, Amin F, Cloninger CR, Silverman JM, Buccola NG, Byerley WF, Black DW, Crowe RR, Oksenberg JR, Mirel DB, Kendler KS, Freedman R, Gejman PV. Common variants on chromosome 6p22.1 are associated with schizophrenia. Nature. 2009 Aug 6; 460(7256):753-7. Abstract

Graves JA. Review: Sex chromosome evolution and the expression of sex-specific genes in the placenta. Placenta. 2010 Mar; 31 Suppl:S27-32. Abstract

Reich DE, Cargill M, Bolk S, Ireland J, Sabeti PC, Richter DJ, Lavery T, Kouyoumjian R, Farhadian SF, Ward R, Lander ES. Linkage disequilibrium in the human genome. Nature. 2001 May 10; 411(6834):199-204. Abstract

Slatkin M. Linkage disequilibrium--understanding the evolutionary past and mapping the medical future. Nat Rev Genet. 2008 Jun; 9(6):477-85. Abstract

Wiehe T, Slatkin M. Epistatic selection in a multi-locus Levene model and implications for linkage disequilibrium. Theor Popul Biol. 1998 Feb; 53(1):75-84. Abstract

Pritchard JK, Donnelly P. Case-control studies of association in structured or admixed populations. Theor Popul Biol. 2001 Nov; 60(3):227-37. Abstract

Koch E, Ristroph M, Kirkpatrick M. Long range linkage disequilibrium across the human genome. PLoS One. 2013; 8(12):e80754. Abstract

Wright S. The shifting balance theory and macroevolution. Annu Rev Genet. 1982; 16:1-19. Abstract

Wu MC, Kraft P, Epstein MP, Taylor DM, Chanock SJ, Hunter DJ, Lin X. Powerful SNP-set analysis for case-control genome-wide association studies. Am J Hum Genet. 2010 Jun 11; 86(6):929-42. Abstract

Zwir I, Shin D, Kato A, Nishino K, Latifi T, Solomon F, Hare JM, Huang H, Groisman EA. Dissecting the PhoP regulatory network of Escherichia coli and Salmonella enterica. Proc Natl Acad Sci U S A. 2005 Feb 22; 102(8):2862-7. Abstract

Zwir I, Huang H, Groisman EA. Analysis of differentially-regulated genes within a regulatory network by GPS genome navigation. Bioinformatics. 2005 Nov 15; 21(22):4073-83. Abstract

Romero-Zaliz R, Del Val C, Cobb JP, Zwir I. Onto-CC: a web server for identifying Gene Ontology conceptual clusters. Nucleic Acids Res. 2008 Jul 1; 36(Web Server issue):W352-7. Abstract

Romero-Zaliz R, C. Rubio R, Cordin O, Cobb P, Herrera F, Zwir I. A multi-objective evolutionary conceptual clustering methodology for gene annotation within structural databases: a case of study on the gene ontology database. IEEE Transactions on Evolutionary Computation. 2008; 12(6):679-701.

Harari O, Park SY, Huang H, Groisman EA, Zwir I. Defining the plasticity of transcription factor binding sites by Deconstructing DNA consensus sequences: the PhoP-binding sites among gamma/enterobacteria. PLoS Comput Biol. 2010; 6(7):e1000862. Abstract

Lee DD, Seung HS. Learning the parts of objects by non-negative matrix factorization. Nature. 1999 Oct 21; 401(6755):788-91. Abstract

Mejia-Roa E, Carmona-Saez P, Nogales R, Vicente C, Vazquez M, Yang XY, Garcia C, Tirado F, Pascual-Montano A. bioNMF: a web-based tool for nonnegative matrix factorization in biology. Nucleic Acids Res. 2008 Jul 1; 36(Web Server issue):W523-8. Abstract

Pascual-Montano A, Carmona-Saez P, Chagoyen M, Tirado F, Carazo JM, Pascual-Marqui RD. bioNMF: a versatile tool for non-negative matrix factorization in biology. BMC Bioinformatics. 2006; 7:366. Abstract

Tamayo P, Scanfeld D, Ebert BL, Gillette MA, Roberts CW, Mesirov JP. Metagene projection for cross-platform, cross-species characterization of global transcriptional states. Proc Natl Acad Sci U S A. 2007 Apr 3; 104(14):5959-64. Abstract

Cichocki A. Nonnegative matrix and tensor factorizations: applications to exploratory multi-way data analysis and blinded separation. Chichester, U.K.: John Wiley; 2009.

Tavazoie S, Hughes JD, Campbell MJ, Cho RJ, Church GM. Systematic determination of genetic network architecture. Nat Genet. 1999 Jul; 22(3):281-5. Abstract

View all comments by Gabriel de Erausquin