DNA defined by ancestry

Organization: Los Alamos National Laboratory (U.S. Dept. of Energy)
Co-Developer(s): Joel Berendzen (Principal Developer), Judith Cohn, Nicholas Hengartner, Benjamin McMahon
Year: 2012

2012 R&D 100 Winner

DNA sequencing technology has matured to produce genome sequences quickly and cost-effectively, but managing the resulting data is still a hurdle. Sequedex is a software package that performs the first step in analyzing the output of DNA sequencing instruments: classifying the organism the DNA likely came from and the metabolic function it likely carries out.

Developed at Los Alamos National Laboratory, Los Alamos, N.M., the software uses an algorithm and an implementation that classify sequences at a rate of more than 7 billion bases per hour per CPU core, a speed that is typically 250,000 times faster than that of the most common software currently used for such analyses.
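To put those two figures in perspective, the arithmetic can be sketched in a few lines of Python. The rate and the speedup are taken from the text above; the function names and the workload in the usage lines (70 billion bases on 10 cores) are hypothetical values chosen only for illustration, and the sketch assumes throughput scales linearly with core count.

```python
import math

# Figures quoted in the text.
SEQUEDEX_BASES_PER_HOUR_PER_CORE = 7e9
SPEEDUP_OVER_COMMON_SOFTWARE = 250_000


def implied_baseline_rate(rate=SEQUEDEX_BASES_PER_HOUR_PER_CORE,
                          speedup=SPEEDUP_OVER_COMMON_SOFTWARE):
    """Bases per hour per core implied for the slower software."""
    return rate / speedup


def hours_to_classify(total_bases, cores,
                      rate=SEQUEDEX_BASES_PER_HOUR_PER_CORE):
    """Wall-clock hours, assuming perfectly linear scaling across cores."""
    return total_bases / (rate * cores)


baseline = implied_baseline_rate()                         # 28,000 bases/hour/core
orders = math.log10(SPEEDUP_OVER_COMMON_SOFTWARE)          # about 5.4 orders of magnitude
example_hours = hours_to_classify(total_bases=7e10, cores=10)  # hypothetical workload
```

Under these assumptions, a 250,000-fold speedup implies a baseline of roughly 28,000 bases per hour per core, and the hypothetical workload would take about one hour.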

The program—designed to answer the question, "Where have I seen a sequence like this before?"—is based upon the theory of evolution in its modern form of molecular phylogeny (the analysis of hereditary differences in DNA and protein sequences).

Sequedex uses the broadest possible Tree of Life—an ancient concept that all of life on Earth is derived from a common ancestor—to organize its search-and-classification process.

The software exploits existing information-retrieval technology, similar to that behind popular Web search engines, to achieve high performance and scalability; it is 1.5 to 5.5 orders of magnitude faster than comparable software.
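The search-engine analogy can be illustrated with an inverted index, the basic data structure of text retrieval. The sketch below is not Sequedex's actual algorithm, which this article does not describe. It is a minimal illustration that assumes short DNA substrings (k-mers) play the role of search terms and that each reference sequence carries a label. The function names, the toy sequences, and the choice of k = 4 are all hypothetical.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every overlapping substring of length k."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Inverted index: map each k-mer to the labels of sequences containing it."""
    index = defaultdict(set)
    for label, seq in references.items():
        for kmer in kmers(seq, k):
            index[kmer].add(label)
    return index


def classify(read, index, k):
    """Count, per label, how many of the read's k-mers are found in the index."""
    votes = Counter()
    for kmer in kmers(read, k):
        for label in index.get(kmer, ()):
            votes[label] += 1
    return votes


# Toy, made-up reference sequences (not real genomes).
references = {
    "organism_A": "ACGTACGTGGCCTTAA",
    "organism_B": "TTGACCAGTTGACCAG",
}
index = build_index(references, k=4)
votes = classify("ACGTGGCC", index, k=4)
best = votes.most_common(1)[0][0] if votes else None
```

Each lookup is a constant-time dictionary access, so classification cost grows with the length of the read rather than with the size of the reference collection, which is the property that makes this style of index attractive for large data sets.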

Technology: DNA classification software

Developers: Los Alamos National Laboratory

Development Team

(l-r): Judith Cohn, Benjamin McMahon, Joel Berendzen

Nicholas Hengartner

Sequedex: software to classify what organism a DNA sequence came from and what it does. Development team from Los Alamos National Laboratory: Joel Berendzen (Principal Developer), Judith Cohn, Nicholas Hengartner, Benjamin McMahon