London: Only about 1 percent of the human genome contains gene regions that code for proteins.
What the rest of the DNA is doing has been a big question, until now.
A new study from a large international team of researchers has found that 80 percent of the genome is biochemically active and likely involved in regulating the expression of nearby genes.
The consortium, known as ENCODE (which stands for “Encyclopedia of DNA Elements”), includes hundreds of scientists from several dozen labs around the world. Using genetic sequencing data from 140 types of cells, the researchers were able to identify thousands of DNA regions that help fine-tune genes’ activity and influence which genes are expressed in different kinds of cells.
Just as the sequencing of the human genome helped scientists learn how mutations in protein-coding genes can lead to disease, the new map of noncoding regions should provide some answers on how mutations in the regulatory elements lead to diseases such as lupus and diabetes, said Manolis Kellis, an associate professor of computer science at MIT, an associate member of the Broad Institute and an author of a paper describing the findings.
“Humans are 99.9 percent identical to each other, and you only have one difference in every 300 to 1,000 nucleotides. What ENCODE allows you to do is provide an annotation of what each nucleotide of the genome does, so that when it’s mutated, we can make some predictions about the consequences of the mutation,” Kellis explained.
Kellis, who leads MIT’s Computational Biology Group, is one of the principal investigators involved in the latest study.
The ENCODE researchers found that 80 percent of the genome experiences some kind of biochemical event, such as binding to proteins that regulate how often a neighboring gene is utilized. They also discovered that the same regulatory region can play different roles, depending on what type of cell it’s acting in.
The researchers also studied the conservation of nucleotides — the A, T, C and G “letters” of DNA — in the newly identified regulatory regions. Nucleotides are conserved if they remain the same over long evolutionary periods, which can be measured by analyzing the variability between species, or among individuals within a species.
A recent paper by Kellis and colleagues showed that 5 percent of noncoding DNA is conserved across mammals. In one of the ENCODE companion papers appearing online Sept. 5 in Science, Kellis and MIT postdoc Lucas Ward show that an additional 4 percent is conserved within the human lineage, suggesting that those elements control recently evolved traits, some of which are unique to humans.
When the researchers looked at the functions of genes near newly evolved regulatory regions, they found many genes that encode regulators that activate other genes.
“Genes involved in the nerve growth pathway and color vision, both of which have been hypothesized to be recent innovations in the primate lineage, are enriched in human-constrained elements in non-conserved regions,” Ward said.
The researchers found that the most highly conserved nucleotides were also the ones most likely to be associated with disease when mutated. They also showed that variants associated with autoimmune diseases such as lupus and rheumatoid arthritis are located in regions active only in immune cells, while variants linked to metabolic diseases are in regions active only in liver cells.
In their next phase, the ENCODE researchers hope to determine just how those variations lead to human disease.
They published their findings in the Sept. 5 online edition of Nature.