ENCODE: The Encyclopedia of DNA Elements
September 5, 2012 12:31 PM

After five years, the NIH-funded ENCODE Project has unveiled its detailed study of the biochemical context of the human genome. Nature has a special web portal linking together 24 publications in Nature, a special issue of Genome Research, and Genome Biology (all open access). There's also an iPad app to help you navigate through the papers and results. You can look at an enormous poster of results, but it contains only a tiny fraction of the 15 TB of data from the project's >1,600 experiments. Perhaps aerial dance is a better way of portraying what we have learned about genome biology.

Detailed news stories from Nature and Discover magazine describe the project and results. Six independent genomics researchers describe the importance of the project to their own research and the rest of the field. Other coverage comes from The New York Times, NPR's All Things Considered, BBC News, Wired, The Wall Street Journal, The Los Angeles Times, USA Today, The Guardian, and Reuters. A blog post from the leader of the project's analysis describes the conclusion of this phase of the project and discusses its implications for biology and for the way biology is done.
Role: scientist
posted by grouse (4 comments total) 11 users marked this as a favorite
This project was posted to MetaFilter by Westringia F. on September 6, 2012: ENCODE: the Encyclopedia of DNA Elements

Our research group is part of the ENCODE project. I am one of four authors of one of the papers in Nature that are featured in this study, and a contributor to a second paper.

We found several interesting things about the human genome:

• control regions are everywhere, including in areas previously (and poorly) described as "junk DNA"
• control regions are cell-specific, and can even distinguish cancerous from healthy tissues
• these regions, or "footprints", can be discovered reliably with high-throughput sequencing
• we can associate footprints with SNPs, which in turn can be associated with various diseases of genetic provenance
• we can build regulatory networks on a wider scale than previously done

We can find footprints by simply counting the sequencing fragments that show up for a given genomic region. Where there are fewer fragments, we can make inferences as to whether there is protein binding to those specific parts of the human genome being sequenced.
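
To make the counting idea concrete, here is a minimal Python sketch. It is an illustration only, not the project's actual pipeline: the function names, the use of a single region-wide mean as the background, and the `ratio` and `min_width` thresholds are all assumptions chosen for simplicity. It tallies fragment positions per base in a region, then reports runs of bases whose counts fall well below the background.

```python
def count_per_base(fragment_positions, region_start, region_end):
    """Count fragments at each base of the half-open region [region_start, region_end).

    fragment_positions is assumed to hold one genomic coordinate per
    sequenced fragment (hypothetical input format).
    """
    counts = [0] * (region_end - region_start)
    for pos in fragment_positions:
        if region_start <= pos < region_end:
            counts[pos - region_start] += 1
    return counts


def find_depleted_runs(counts, ratio=0.5, min_width=3):
    """Return half-open (start, end) index pairs for runs of low counts.

    A base is "depleted" if its count is below ratio * mean(counts).
    Both the global-mean background and the default thresholds are
    illustrative assumptions, not values from the ENCODE analysis.
    """
    if not counts:
        return []
    threshold = ratio * (sum(counts) / len(counts))
    runs = []
    start = None
    for i, c in enumerate(counts):
        if c < threshold:
            if start is None:
                start = i  # a depleted run begins here
        else:
            if start is not None and i - start >= min_width:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the region.
    if start is not None and len(counts) - start >= min_width:
        runs.append((start, len(counts)))
    return runs


if __name__ == "__main__":
    # Made-up counts: high signal on both sides of a four-base dip.
    toy_counts = [20, 22, 19, 21, 2, 1, 3, 2, 20, 23, 18, 21]
    print(find_depleted_runs(toy_counts))
```

A real analysis would need a local background model and a statistical test rather than a fixed ratio, but the shape of the computation is the same: count, then look for dips.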

Once we know where those footprints are located, we can associate them with known and putative transcription factors, and start building large-scale regulatory networks that we can compare between cell types, to explore and understand the differences that make a specialized heart muscle cell different from an unspecialized embryonic stem cell, or what differentiates a particular type of cancer tissue from healthy tissue.
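
As a rough sketch of the network comparison, assume each footprint has already been matched to a transcription factor and assigned to a target gene, giving a list of (factor, target) pairs per cell type. That input format, the function names, and the placeholder labels such as `TF_A` and `gene_1` are all hypothetical and do not come from the papers. The code builds one network per cell type and reports which regulatory edges are shared and which are specific to each.

```python
from collections import defaultdict


def build_network(footprint_hits):
    """Build a factor -> set of target genes mapping.

    footprint_hits is an iterable of (factor, target) pairs
    (hypothetical input format, one pair per matched footprint).
    """
    network = defaultdict(set)
    for factor, target in footprint_hits:
        network[factor].add(target)
    return dict(network)


def edge_set(network):
    """Flatten a network into a set of (factor, target) edges."""
    return {(factor, target)
            for factor, targets in network.items()
            for target in targets}


def compare_networks(network_a, network_b):
    """Split the edges of two networks into shared and cell-type-specific sets."""
    edges_a, edges_b = edge_set(network_a), edge_set(network_b)
    return {
        "shared": edges_a & edges_b,
        "only_a": edges_a - edges_b,
        "only_b": edges_b - edges_a,
    }


if __name__ == "__main__":
    # Placeholder data for two cell types; the names are not real genes.
    cell_type_1 = build_network([("TF_A", "gene_1"), ("TF_A", "gene_2"),
                                 ("TF_B", "gene_1")])
    cell_type_2 = build_network([("TF_A", "gene_1"), ("TF_C", "gene_3")])
    result = compare_networks(cell_type_1, cell_type_2)
    for label in ("shared", "only_a", "only_b"):
        print(label, sorted(result[label]))
```

Treating the networks as plain edge sets keeps the comparison to simple set operations; anything beyond presence or absence of an edge (binding strength, for example) would need a richer representation.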
posted by Blazecock Pileon at 1:27 PM on September 5, 2012 [7 favorites]

An additional two things to look at:

Video from The Guardian science correspondent Ian Sample explaining what ENCODE brings to the table using ping pong balls and tomatoes.

Reddit thread where I'm responding to questions on the project ("Ask Me Anything").
posted by grouse at 10:31 AM on September 6, 2012 [3 favorites]

A great summary in The Economist.
posted by grouse at 11:45 AM on September 6, 2012 [2 favorites]

A companion paper describing the semi-automated annotation analysis I led is now available, and is open access. It was one of many that weren't actually published at the same time as the rest.
posted by grouse at 4:47 PM on December 14, 2012
