Genetics Project Update: Over 1,000 Genomes and Counting
Adapted from a story that originally appeared on the Alzheimer Research Forum.
15 November 2012. Your mama always told you that you were special. Now, an international consortium of researchers has clarified just how unique we all are. Each human being harbors several thousand genetic variants, hundreds of which are rare, according to a progress report from the ongoing 1000 Genomes Project in the November 1 Nature. Despite overshooting the titular quantity—it reports on 1,092 genomes from 14 nations—the sequencing consortium is still collecting and reading DNA in its effort to record 2,500 total genomes from a wider geographical area.
While genomewide association studies (GWAS) help researchers discover common genetic variants, the genome and exome sequencing used by the 1000 Genomes group can pick out uncommon ones. Based on a pilot project, the consortium’s leaders elected to perform both low-coverage whole-genome sequencing, reading each strand a handful of times, and deep sequencing, with up to 100 reads, of exomes only (see 1000 Genomes Project Consortium, 2010). While exomes encode proteins, scientists in another large project, the Encyclopedia of DNA Elements (ENCODE), recently reported that most of the rest of the genome has important functions, too (see ENCODE Project Consortium et al., 2012).
The consortium sequenced DNA from 14 different populations, ranging from the Yoruba of Ibadan, Nigeria, to Han Chinese in Beijing, to African-Americans in the southwestern United States. The 1,092 genomes included 38 million single nucleotide polymorphisms (SNPs) and 1.4 million insertions and deletions. That accounts for most common variants, as well as 98 percent of the rarer SNPs, those found in one in 100 people. While common variants are global, rare ones are often limited to small populations. Rare variants also tend to be deleterious to the proteins they affect, the authors reported.
“This type of massive resource helps us to improve our disease research,” said Gerard Schellenberg of the University of Pennsylvania in Philadelphia, who was not involved in the consortium. He and others have been using the 1000 Genomes data for imputation, a method of deduction geneticists use to fill in gaps in GWAS datasets. The typical gene chips used in GWAS identify a long list of single nucleotide polymorphisms, but do not cover all possible variant sites. Since nearby sequences are co-inherited in chunks, knowing some of the linked variants typically allows researchers to predict the others. Imputation accuracy, according to the Nature paper, is 90-95 percent for common variants. Researchers can use sequencing to confirm imputed variants, wrote Philippe Amouyel of the Institut Pasteur de Lille, France, in an e-mail to Alzforum.
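To make the idea of imputation concrete, here is a toy sketch in Python. It is not the method used by the consortium or by any real imputation software, which relies on probabilistic models rather than the simple nearest-match vote shown here. The function name `impute`, the five-site haplotypes, and the reference panel are all invented for illustration: alleles are coded 0 or 1, and `None` marks a site the gene chip did not measure.

```python
from collections import Counter


def impute(target, panel):
    """Fill untyped (None) sites in `target` from the closest reference haplotypes.

    Toy illustration only: picks the panel haplotypes with the fewest
    mismatches at the typed sites, then takes a majority vote among them
    at each untyped site.
    """
    typed = [i for i, allele in enumerate(target) if allele is not None]

    def mismatches(haplotype):
        # Count disagreements with the target at the sites the chip measured.
        return sum(haplotype[i] != target[i] for i in typed)

    best = min(mismatches(h) for h in panel)
    closest = [h for h in panel if mismatches(h) == best]

    filled = list(target)
    for i, allele in enumerate(target):
        if allele is None:
            votes = Counter(h[i] for h in closest)
            filled[i] = votes.most_common(1)[0][0]
    return filled


# Hypothetical reference panel: each row is one sequenced haplotype.
panel = [
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [1, 1, 0, 1, 1],
]

# A chip that measured only sites 0, 2 and 4.
target = [1, None, 0, None, 1]
result = impute(target, panel)
print(result)
```

Because the typed sites match the last three panel haplotypes exactly, the untyped sites are filled from those three. The rarer an allele is in the panel, the fewer haplotypes there are to vote for it, which gives an intuition for why imputation works best for common variants.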
It is no surprise, Schellenberg said, that the consortium found that rare variants tend to appear only in specific populations. However, the 1000 Genomes findings underscore how careful geneticists will have to be in matching control populations precisely to case populations, he added. Scientists studying a rare variant must take care to show it is truly linked to disease, and not simply to the ethnicity or background of the population they are studying, he said. Large numbers of study samples will also be required to find rare, disease-linked variants, added Amouyel, who was not part of the 1000 Genomes team.
Researchers working on Alzheimer’s and other neurodegenerative diseases are doing just that. While GWAS have uncovered common variants that confer relatively low risk for AD, some researchers believe that rare variants will be found that are of much higher risk. Even if these polymorphisms only affect a few people or a single population, they can provide valuable clues about pathological pathways, Schellenberg said. For example, a newly discovered, protective APP variant in Icelanders helped support the β amyloid hypothesis (see Jonsson et al., 2012).—Amber Dance.
The 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012 Nov 1;491:56-65.