Molecular Medicine Israel

Massive Transcription Catalog Outlines the Influence of Human Genetic Variation

Combined transcription and genome data from multiple tissues in hundreds of human donors reveal links between genotype and gene expression across the body

Genomic data from more than 400 individuals have been combined with measurements of gene activity in multiple tissues from the same donors to produce a detailed catalog of the associations between human genotypes and tissue-specific gene expression. Reports from the Genotype-Tissue Expression (GTEx) Consortium published in Nature today (October 11) describe the procurement and analysis of this extensive dataset, which is now freely available to researchers.

“It’s really wonderful to see these papers,” Jeffrey Barrett, a geneticist at the U.K.’s Wellcome Trust Sanger Institute who was not involved with the research, writes in an email to The Scientist. “They represent the most important resource for gene expression across tissues available right now.”

“The scale of the study is impressive,” adds human geneticist Michelle Ward of the University of Chicago, who also did not participate in the work. “Using 44 tissues from 449 individuals, they’ve looked at gene expression [data] and seen how they associate with 12.5 million DNA sites that are known to vary between individuals.” The result, she says, “is the most comprehensive catalog of the associations between genetic variation and gene expression to date.”

The GTEx project was launched in 2010 and “came on the heels of the era of genome-wide association studies (GWAS),” says Kristin Ardlie, who directs the GTEx Laboratory Data Analysis and Coordination Center at the Broad Institute. By analyzing the genomes of many individuals, GWAS aim to identify sequence variants that associate with a given trait, such as a disease.

The majority of GWAS-identified genetic variations turn out to be in noncoding regions of the genome, says Ardlie. “So the hypothesis was that they were probably influencing gene regulation.” Because control over the activity of genes can vary from tissue to tissue, RNA samples from a variety of cell types and a multitude of people would be required to test the hypothesis, Ardlie explains.

The GTEx Consortium therefore collected more than 7,000 samples, representing 44 different tissues from hundreds of recently deceased donors. RNA was then extracted and sequenced to ascertain the expression levels of genes in each tissue, and the data were compared against each donor’s genotype, that is, the particular genetic variants in and around each gene.
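To make the idea of comparing expression against genotype concrete, here is a minimal sketch of one common way such an association can be estimated: a least-squares fit of a gene's expression level on the number of copies of a variant allele each donor carries. It is an illustration only, not the Consortium's pipeline. The function name `eqtl_slope` and the toy numbers are assumptions made up for this example, and real analyses also adjust for covariates and correct for testing millions of variant-gene pairs.

```python
from statistics import mean


def eqtl_slope(dosages, expression):
    """Least-squares slope of expression on genotype dosage.

    dosages: copies of the alternate allele per donor (0, 1, or 2).
    expression: expression level of one gene in one tissue, per donor.
    Both names are hypothetical; this is a toy illustration.
    """
    if len(dosages) != len(expression):
        raise ValueError("dosages and expression must have the same length")
    g_mean = mean(dosages)
    e_mean = mean(expression)
    # Spread of the genotype values around their mean
    sxx = sum((g - g_mean) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("variant does not vary among these donors")
    # Co-variation of genotype and expression
    sxy = sum((g - g_mean) * (e - e_mean) for g, e in zip(dosages, expression))
    return sxy / sxx


# Made-up data for six donors: expression rises with each extra allele copy.
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope = eqtl_slope(toy_dosages, toy_expression)
print(f"estimated change in expression per allele copy: {slope:.2f}")
```

A slope far from zero, relative to its uncertainty, is what would flag a variant as associated with a gene's expression in a given tissue. Repeating the fit tissue by tissue is one simple way to see whether an effect is tissue-specific.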

From this, the team found that the vast majority of genes, if not all of them, have expression-influencing local genetic variants, says Stephen Montgomery, a geneticist at Stanford University and member of the GTEx Consortium. “What that means essentially is that genetic variation is really . . . contributing to the [expression] differences that we see across the population.”

While this result was expected, Montgomery says, “this is the first time that we’ve actually been able to observe that.”

In addition to these locally acting variants, the project revealed that, for certain genes, variants far away on other chromosomes could also be associated with their expression levels. And importantly, many of the expression-associated genetic variants, wherever they are located, “overlap with genetic loci that have previously been found to be associated with diseases,” says Ward. “So, what this work provides is some insight into genes that may be relevant in disease and the cell types that might be relevant.”

“The information allows us to understand more about the biology of these diseases,” says Montgomery. But there are other uses for the data that perhaps weren’t appreciated at the project’s outset. “There’s all sorts of other functional genomic patterns that we can look to see if variation is influencing,” he says, such as alternative splicing, the use of alternative transcription start sites, RNA editing, and X chromosome inactivation. Indeed, the effects of genetic variants on RNA editing and on X inactivation are the subjects of two of the Consortium’s accompanying papers in Nature.

The GTEx project is far from over. More samples from more individuals are being collected, and novel types of data are being generated from the new and existing tissue samples. In addition to RNA levels, for example, the Consortium is now analyzing protein abundance and chromatin structure. “So instead of just considering the transcriptome, we’re actually getting to a phase where we’re looking at other molecular phenotypes as well,” says Montgomery. An accompanying commentary article in Nature Genetics outlines these future plans.
