Drosophila simulans
Sequencing of all lines has finished. De novo assemblies using PCAP for each of the individual lines are available from WUGSC. A syntenic assembly has also been created and is described here.
Drosophila yakuba
Primary sequencing has finished and a de novo assembly is available from WUGSC. A first round of autofinishing has been completed and a second round is in progress; autofinishing sequences have not yet been incorporated into the assembly.
In the Fall of 2003 NHGRI/NIH granted the Genome Sequencing Center (GSC) at Washington University School of Medicine, St. Louis approval to proceed with the sequencing of the Drosophila simulans and D. yakuba genomes. Rick Wilson (Director of the WUGSC) and Sandy Clifton (EST Group Leader and Small Projects Director) are leading the effort at the GSC. Based on early discussions among Drosophila population geneticists, Dave Begun and Chuck Langley proposed that simulans be shotgun sequenced to shallow (1X) coverage in 7 inbred strains and yakuba deeply (up to 8X) in one inbred strain (see white paper). That proposal and subsequent discussions yielded a plan to achieve a solid simulans consensus (reference) sequence and extensive high quality polymorphism data as well as a high quality sequence of the yakuba genome. The project is now far along.
Comparative genome sequencing has the greatest impact on biology when the targeted genomes impinge directly on analysis or interpretation of the human genome or the genome of a genetic model system. Comparative genomics may also shed light on the genetic and evolutionary mechanisms that determine genome organization and composition. The most obvious benefit of comparative genomics has been the discovery of conserved putative functional elements present in each of two distantly related genomes. However, comparisons between distantly related genomes are biased towards identifying only those functional elements that evolve very slowly. Alternatively, comparisons between more recently diverged genomes provide quantitatively critical elements in the analysis of population genomic variation and a clearer view of the mechanisms causing genome evolution. Determining the genome sequences of Drosophila simulans and D. yakuba will greatly facilitate two fundamental goals of genomics research: inferring the mutational and evolutionary mechanisms underlying genome divergence and investigating the causes and consequences of population genomic polymorphism within species.
Project Scope, Concept, and Plans
Drosophila yakuba inbred line
  paired reads from 0.3X - fosmids (40kb)
  paired reads from 8X - plasmids (4kb)
  assembly and annotation
Drosophila simulans - 7 inbred lines
  paired reads from 1X (each) - plasmids (4kb)
  white501: paired reads from 0.3X - fosmids (40kb)
  6 other inbred lines (world-wide sampling)
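To give a rough feel for what 1X coverage in each of 7 lines implies, the sketch below applies the standard Poisson (Lander-Waterman style) approximation, in which the expected fraction of sites hit by at least one read at coverage c is 1 - e^(-c). This is an idealized illustration, not the project's own calculation: it assumes reads land uniformly at random and independently across lines, and it ignores cloning bias, repeats, and the extra fosmid reads for white501. The function names are hypothetical.

```python
import math

def expected_fraction_covered(coverage: float) -> float:
    """Poisson approximation: fraction of sites hit by at least one read."""
    return 1.0 - math.exp(-coverage)

def prob_covered_in_at_least(k: int, n_lines: int, per_line_coverage: float) -> float:
    """Probability that a site is covered in at least k of n_lines lines,
    assuming each line is sequenced independently to the same coverage."""
    p = expected_fraction_covered(per_line_coverage)
    return sum(
        math.comb(n_lines, i) * p**i * (1.0 - p) ** (n_lines - i)
        for i in range(k, n_lines + 1)
    )

PER_LINE_COVERAGE = 1.0  # 1X per inbred line, as in the plan above
N_LINES = 7              # 7 inbred simulans lines

single = expected_fraction_covered(PER_LINE_COVERAGE)                # one line alone
any_line = prob_covered_in_at_least(1, N_LINES, PER_LINE_COVERAGE)   # usable for a consensus
two_plus = prob_covered_in_at_least(2, N_LINES, PER_LINE_COVERAGE)   # minimum to compare lines

print(f"covered in a single line:     {single:.3f}")
print(f"covered in at least one line: {any_line:.4f}")
print(f"covered in at least two:      {two_plus:.4f}")
```

Under these assumptions a single 1X line leaves a substantial fraction of sites unsampled, while pooling the 7 lines covers nearly every site at least once, which is the intuition behind building a consensus from several shallow samples. Detecting polymorphism at a site needs coverage in at least two lines, which is why the last figure is computed separately.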
More than 30 labs working on sequences.
In an effort to make the most of these sequences by involving large and active parts of a range of communities, laboratories with the resources to contribute to the early assembly, annotation and analyses of these sequences are analyzing various forms of the data.
The participation of these researchers will assure that the published assembly(s) and annotation(s) are of the highest quality and value, thus forming the best possible basis for a range of early comparative, population and evolutionary genomic analyses that they plan to publish.
Members of these labs have responded with research proposals, joined the simyak list (chlangley_at_ucdavis.edu), and met with the Washington University Genome Sequencing Center staff on the day before the Drosophila Research Conference (http://www.drosophila-conf.org/), 23rd of March, 2004. A second workshop will occur at this year's Drosophila Research Conference at 9 PM on Saturday, April 2 in San Diego.