<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2007-8-1-r5</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Update of the <it>Anopheles gambiae </it>PEST genome assembly</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Sharakhova</snm>
               <mi>V</mi>
               <fnm>Maria</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>msharakh@vt.edu</email>
            </au>
            <au id="A2">
               <snm>Hammond</snm>
               <mi>P</mi>
               <fnm>Martin</fnm>
               <insr iid="I3"/>
               <email>mhammond@ebi.ac.uk</email>
            </au>
            <au id="A3">
               <snm>Lobo</snm>
               <mi>F</mi>
               <fnm>Neil</fnm>
               <insr iid="I1"/>
               <email>nlobo@nd.edu</email>
            </au>
            <au id="A4">
               <snm>Krzywinski</snm>
               <fnm>Jaroslaw</fnm>
               <insr iid="I1"/>
               <insr iid="I4"/>
               <email>jkrzywin@uta.edu</email>
            </au>
            <au id="A5">
               <snm>Unger</snm>
               <mi>F</mi>
               <fnm>Maria</fnm>
               <insr iid="I1"/>
               <email>munger1@nd.edu</email>
            </au>
            <au id="A6">
               <snm>Hillenmeyer</snm>
               <mi>E</mi>
               <fnm>Maureen</fnm>
               <insr iid="I1"/>
               <insr iid="I5"/>
               <email>maureenh@stanford.edu</email>
            </au>
            <au id="A7">
               <snm>Bruggner</snm>
               <mi>V</mi>
               <fnm>Robert</fnm>
               <insr iid="I1"/>
               <email>rbruggne@nd.edu</email>
            </au>
            <au id="A8">
               <snm>Birney</snm>
               <fnm>Ewan</fnm>
               <insr iid="I3"/>
               <email>birney@ebi.ac.uk</email>
            </au>
            <au id="A9" ca="yes">
               <snm>Collins</snm>
               <mi>H</mi>
               <fnm>Frank</fnm>
               <insr iid="I1"/>
               <email>frank@nd.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Center for Global Health and Infectious Diseases, University of Notre Dame, Galvin Life Sciences Building, Notre Dame, IN 46556-0369, USA</p>
            </ins>
            <ins id="I2">
               <p>Department of Entomology, College of Agriculture and Life Science, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061-0319, USA</p>
            </ins>
            <ins id="I3">
               <p>European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK</p>
            </ins>
            <ins id="I4">
               <p>Department of Biology, University of Texas at Arlington, Arlington, TX 76019, USA</p>
            </ins>
            <ins id="I5">
               <p>School of Medicine - IDP - Biomedical Informatics, Stanford University, Stanford, CA 94305, USA</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2007</pubdate>
         <volume>8</volume>
         <issue>1</issue>
         <fpage>R5</fpage>
         <url>http://genomebiology.com/2007/8/1/R5</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17210077</pubid>
               <pubid idtype="doi">10.1186/gb-2007-8-1-r5</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>26</day>
               <month>7</month>
               <year>2006</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>24</day>
               <month>10</month>
               <year>2006</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>8</day>
               <month>1</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>08</day>
               <month>01</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Sharakhova et al.; licensee BioMed Central Ltd.</collab>
         <note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <shorttitle>
         <p><it>Anopheles </it>genome assembly update</p>
      </shorttitle>
      <shortabs>
         <p>An update on the <it>Anopheles gambiae </it>PEST genome assembly places about 33% of previously unmapped sequences on the chromosomes.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>The genome of <it>Anopheles gambiae</it>, the major vector of malaria, was sequenced and assembled in 2002. This initial genome assembly and analysis made available to the scientific community was complicated by the presence of assembly issues, such as scaffolds with no chromosomal location, no sequence data for the Y chromosome, haplotype polymorphisms resulting in two different genome assemblies in limited regions and contaminating bacterial DNA.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Polytene chromosome <it>in situ </it>hybridization with cDNA clones was used to place 15 unmapped scaffolds (sizes totaling 5.34 Mbp) in the pericentromeric regions of the chromosomes and to orient a further 9 scaffolds. Additional analysis by <it>in situ </it>hybridization of bacterial artificial chromosome (BAC) clones placed 1.32 Mbp (5 scaffolds) in the physical gaps between scaffolds on euchromatic parts of the chromosomes. The Y chromosome sequence information (0.18 Mbp) remains highly incomplete and fragmented among 55 short scaffolds. Analysis of BAC end sequences showed that 22 inter-scaffold gaps were spanned by BAC clones. Unmapped scaffolds were also aligned to the chromosome assemblies <it>in silico</it>, identifying regions totaling 8.18 Mbp (144 scaffolds) that are probably represented in the genome project by two alternative assemblies. An additional 3.53 Mbp of alternative assembly was identified within mapped scaffolds. Scaffolds comprising 1.97 Mbp (679 small scaffolds) were identified as probably derived from contaminating bacterial DNA. In total, about 33% of previously unmapped sequences were placed on the chromosomes.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>This study has used new approaches to improve the physical map and assembly of the <it>A. gambiae </it>genome.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010016">Molecular biology</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010002">Bioinformatics</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The genome of <it>Anopheles gambiae</it>, the major vector of malaria in Africa, was sequenced by a whole-genome shotgun approach <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Physical mapping of the genome was conducted by <it>in situ </it>hybridization of about 2,000 bacterial artificial chromosome (BAC) clones on ovarian nurse cell polytene chromosomes. As a result, in the first publication of the <it>A. gambiae </it>physical map, 67 scaffolds equivalent to 227 mega-base-pairs (Mbp) were assigned to chromosomes. Of these, 52 scaffolds were oriented. However, approximately 18% of the assembled <it>A. gambiae </it>genome was represented in scaffolds that did not have a chromosomal location assigned. About 50 Mbp in the assembly were assigned with arbitrary order and orientation to an unmapped chromosome <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. In this study, new approaches were used to improve the physical map and assembly of the <it>A. gambiae </it>genome.</p>
         <p>The most poorly mapped parts of the <it>A. gambiae </it>genome were the pericentromeric regions of the chromosomes. These chromosomal regions are made up of highly and moderately repetitive DNA sequences <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp> that are extremely depleted of genes <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> and form specific heterochromatic structures on chromosomes <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>. Pericentromeric heterochromatin plays an important role in many biological processes, such as cell division <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>, meiotic pairing <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, regulation of DNA replication and gene expression <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>, and is generally associated with gene silencing <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. However, the assembly and physical mapping of these regions is a difficult part of any genome project <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. In <it>Drosophila melanogaster</it>, for example, one-third of the 180 Mbp genome is centric heterochromatin; but in the first genome publication only 2% of the sequence reads contained heterochromatic simple sequence repeats <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, and only 3 scaffolds corresponding to 3.8 Mbp were mapped in centromeric areas <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>.</p>
         <p>According to Cot analysis, 33% (about 86 Mbp) of the <it>A. gambiae </it>genome corresponds to repetitive elements <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. The highest density of repeats is located in pericentromeric regions and forms the completely heterochromatic Y chromosome <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. In contrast with <it>Drosophila</it>, short simple repeats are not expanded in the <it>A. gambiae </it>genome; therefore, cloning of the heterochromatic portion of the genome was more successful. However, in the first publication of the <it>A. gambiae </it>genome only 9 scaffolds, with a total size of 3.3 Mbp, were mapped to pericentromeric regions on chromosomes <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Mapping is difficult because BAC clones representing pericentromeric regions are likely to map to multiple locations due to their high repeat content. In previous work 27 BAC clones hybridized to all centromeric regions on the chromosomes and 116 BAC clones hybridized to pericentromeric regions and multiple locations on the chromosomes <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. To determine the genomic location of heterochromatic scaffolds, cDNA clones from the Normalized Anopheles Pool (ANGNAP1) library with sequences matching regions of these scaffolds were mapped to the chromosomes. Additionally, this approach was used to orient scaffolds that were not previously oriented. For some scaffolds, PCR amplified DNA of genes predicted in the scaffolds was used for physical mapping.</p>
         <p>No sequence data were assigned to the Y chromosome in the original publication of the <it>A. gambiae </it>genome <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Subsequent studies revealed numerous repetitive sequences on the Y chromosome, including four families of satellite DNA and a massive accumulation of several transposable elements, consistent with the fully heterochromatic nature of that chromosome <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp>. Only one Y-specific scaffold contained an open reading frame that appeared to correspond to a gene fragment and was expressed exclusively in males. Moreover, a recent extensive bioinformatics-based search failed to reveal other Y-linked scaffolds containing gene sequences <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>.</p>
         <p>Another significant problem in the <it>A. gambiae </it>genome assembly was the existence of 64 physical gaps between the mapped scaffolds <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. BAC and cDNA clones were used for <it>in situ </it>hybridization to physically assign 5 scaffolds with a total size of 1.3 Mbp in these gaps. In addition, systematic <it>in silico </it>analysis of BAC end sequences (BESs) from the ND-1 <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> and ND-TAM <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> BAC libraries identified BAC clones that span a third of the physical gaps. The sequencing of these clones would allow further improvement of the <it>A. gambiae </it>genome assembly.</p>
         <p>Genetic variation within the <it>A. gambiae </it>genome posed another challenge for mapping and assembly <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. <it>A. gambiae </it>is a highly polymorphic species, characterized by the presence of five chromosomal forms (Bamako, Mopti, Savanna, Bissau and Forest); sympatric populations of the Bamako, Mopti and Savanna forms are at least partially isolated from each other in Mali. These chromosomal forms can be identified by paracentric inversions of the 2R chromosomal arm and have different adaptation to certain climatic conditions and human environments <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr></abbrgrp>. Moreover, an additional type of polymorphism, termed M and S molecular forms, has been revealed in natural populations of <it>A. gambiae </it>by differences in ribosomal DNA <abbrgrp><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr></abbrgrp>. The PEST strain, selected for the genome sequencing because it had the standard chromosomal arrangement, was produced by crossing a laboratory strain originating in Nigeria with the offspring of field-collected <it>A. gambiae </it>from western Kenya <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. As a result of the high level of polymorphism within the strain, some regions of the genome appeared as two different assemblies ('haplotypes') within the set of scaffolds. Holt <it>et al</it>. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> estimated that the presence of alternative assemblies led to overrepresentation of the size of the genome by about 21.3 Mbp. Additional analysis of the scaffold sequences <it>in silico </it>identified 144 previously unplaced scaffolds totaling 8.18 Mbp that are probable alternative assemblies of regions already placed on the chromosomes. In addition, 20 cases totaling 3.53 Mbp of sequence were identified where the adjacent ends of large mapped scaffolds appear to be alternative assemblies of the same region.</p>
         <p>The genomic libraries used for the sequencing of the <it>A. gambiae </it>genome were contaminated by bacterial DNA <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. By bioinformatics approaches, 679 scaffolds with a total size of 1.97 Mbp were determined to be derived from contaminating bacterial DNA.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <p>The revised <it>A. gambiae </it>PEST assembly is available at GenBank. The scaffold entries have information about alternative assembly regions and all other corresponding information. The new RefSeq entries reflect the revised chromosome assemblies (GenBank: <ext-link ext-link-type="gen" ext-link-id="CM000356">CM000356</ext-link>-<ext-link ext-link-type="gen" ext-link-id="CM000360">CM000360</ext-link>).</p>
         <sec>
            <st>
               <p>Physical mapping and scaffold orientation in the pericentromeric regions</p>
            </st>
            <p>Pericentromeric regions are probably under-represented in the genome assembly because the scaffolds from these regions, although assembled, cannot be localized on the chromosome. The likely reason they fail to localize correctly is that they contain a large percentage of highly repeated sequences, so large probes such as the BAC clones previously used to map the scaffolds give ambiguous results - hybridization to multiple regions. This was overcome by using unique sequences in the scaffolds as probes for <it>in situ </it>hybridization on the ovarian polytene chromosomes. A good source of unique sequences is cDNAs from unique genes encoded in the scaffolds. To detect the genes in the unassigned scaffolds, cDNA sequences from the ANGNAP1 library were compared to the scaffold sequences. Clones representing unique sequences near the ends of the scaffolds were selected for use as probes for <it>in situ </it>hybridization. To differentiate scaffold ends, cDNAs from opposite ends were labeled with red Cy3 and blue Cy5 dyes. Typical results from <it>in situ </it>hybridizations using this technique are shown in Figure <figr fid="F1">1</figr>. The results from numerous hybridization experiments demonstrated that the scaffolds could generally be oriented on the polytene chromosome when the labeled target sequences were located more than 100 kb apart on the scaffold. In some cases, cDNAs were not available to represent the unique sequences at the end of a scaffold; in those cases, probes were made by PCR amplification of unique sequence from BAC clones. Although the use of PCR amplified genes was less successful than the use of cDNA sequences, three scaffolds (AAAB01008973, AAAB01008949 and AAAB01008942) were positioned by this technique.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Results of <it>in situ </it>hybridization of cDNA clones to the heterochromatic regions on the polytene chromosomes of <it>A. gambiae</it></p>
               </caption>
               <text>
                  <p>Results of <it>in situ </it>hybridization of cDNA clones to the heterochromatic regions on the polytene chromosomes of <it>A. gambiae</it>. Two cDNA clones were labeled with red Cy3 and blue Cy5 dyes and hybridized to the polytene chromosomes: the red signals indicate the beginning and the blue signals show the end of the scaffolds. The locations of the scaffolds <b>(a) </b>AAAB01008973, <b>(b) </b>AAAB01008961 and <b>(c) </b>AAAB01008971 were indicated by <it>in situ </it>hybridization of the cDNA clones ANP1272B11 and ANP1141F09 (a); ANP1302A01 and ANP1344A01 (b); and ANP131B08 and ANP121D04 (c) on chromosomes X (a), 2 (b) and 3 (c).</p>
               </text>
               <graphic file="gb-2007-8-1-r5-1"/>
            </fig>
            <sec>
               <st>
                  <p>Chromosome X</p>
               </st>
               <p>By cDNA and PCR fragment physical mapping, four scaffolds with lengths between 400 and 600 kbp were placed in the pericentromeric region on chromosome X (Additional data file 1). Scaffolds AAAB01008973 and AAAB01008858 were localized and oriented to the distal part of the X chromosome pericentromeric region 6 (Figure <figr fid="F2">2a</figr>). Scaffolds AAAB01008967 and AAAB01008976 were mapped more proximally to the pericentromeric region (Figure <figr fid="F2">2a</figr>). Orientation of these scaffolds was not possible because cDNA clones within the scaffolds hybridized to the same places on the chromosome. By the same method, previously mapped but unoriented scaffolds AAAB01008975 and AAAB01008885 were oriented (Figure <figr fid="F2">2a</figr>). Scaffold AAAB01008861 was oriented and mapped more precisely as the most proximal scaffold on the X chromosome (Figure <figr fid="F2">2a</figr>). Two cDNA clones from that scaffold were localized to the most proximal part of the pericentromeric heterochromatin. cDNA clone ANGNAP1293B02, which hybridized to the condensed heterochromatic band on the X chromosome, also labeled nucleoli in all cells on the slide. BLASTN analysis demonstrated significant similarity of this cDNA to ribosomal RNA genes of <it>A. albimanus </it>and <it>Aedes albopictus</it>. These genes are not currently annotated in the <it>A. gambiae </it>genome. This area of the X chromosome has a significant number of gaps, which may have hindered <it>in silico </it>annotation. BLASTN analysis also showed localization of <it>A. gambiae </it>ribosomal RNA genes in scaffold AAAB01008976, which is the adjacent mapped scaffold. Thus, this area can be described as a nucleolar-organizing region for the polytene X chromosome.</p>
               <fig id="F2">
                  <title>
                     <p>Figure 2</p>
                  </title>
                  <caption>
                     <p>Scaffolds located in pericentromeric regions on <it>A. gambiae </it>chromosomes</p>
                  </caption>
                  <text>
                     <p>Scaffolds located in pericentromeric regions on <it>A. gambiae </it>chromosomes. Black and red lines and arrows on the left side of the picture correspond to the scaffolds previously and newly mapped to the pericentromeric regions of chromosomes <b>(a) </b>X, <b>(b) </b>2 and <b>(c) </b>3, respectively; blue arrows indicate newly oriented scaffolds. The dots on the arrows show the beginning of the scaffolds and the arrowheads correspond to the end of the scaffolds. The scaffolds are identified by the last four digits of the scaffold ID. The scale on the left side of the chromosomes indicates divisions and subdivisions in these regions. Black arrows on the right side of the picture show the location of the PCR amplified gene-fragments and BAC and cDNA clones.</p>
                  </text>
                  <graphic file="gb-2007-8-1-r5-2"/>
               </fig>
            </sec>
            <sec>
               <st>
                  <p>Chromosome 2</p>
               </st>
               <p>Four scaffolds were mapped to the pericentromeric region of chromosome 2 (Additional data file 1). Two scaffolds, AAAB01008949 and AAAB01008897, were placed on the right arm (2R) of this chromosome (Figure <figr fid="F2">2b</figr>). Scaffold AAAB01008942 was assigned to the very proximal end of the 2L arm (Figure <figr fid="F2">2b</figr>), and scaffold AAAB01008026 was mapped on the distal part of region 20A (Figure <figr fid="F2">2b</figr>). Both scaffolds mapped to the 2L arm have been oriented. The distal boundary of scaffold AAAB01008987 was also mapped to the telomeric region of the 2R arm. The last BAC clone, 170B21, from this scaffold hybridized to the pair of distal dark bands in subdivision 7A. Additional analysis of previously mapped BAC clones showed that the telomeric end of the 2L chromosomal arm was covered by scaffold AAAB01008807.</p>
            </sec>
            <sec>
               <st>
                  <p>Chromosome 3</p>
               </st>
               <p>Eight scaffolds were assigned to the pericentromeric region of chromosome 3 (Additional data file 1; Figure <figr fid="F2">2c</figr>). Scaffold AAAB01008822 was localized on the distal part of region 37D on the 3R arm. Scaffold AAAB01008943 covered the proximal part of this region and reached the centromeric block of 3R. Scaffolds AAAB01008957, AAAB01008972, AAAB01008985 and AAAB01008981 have been placed in region 38A of the 3L arm. Only two of these scaffolds, AAAB01008943 and AAAB01008972, have been oriented. In addition, scaffold AAAB01008795 was oriented in the region 38C, and the distal boundary of the scaffold AAAB01008849 was more precisely mapped in region 38C (Figure <figr fid="F2">2c</figr>).</p>
               <p>The gene content and amounts of transposable elements and short simple repeats were compared between euchromatic and heterochromatic scaffolds across all chromosomes. Gene density in heterochromatin varies but averages 2 genes per 100 kbp, which is 40% of the gene density in euchromatin (5 per 100 kbp). In the most centromeric scaffolds, gene content was as low as 0.2 per 100 kbp. The most significant components of the heterochromatic scaffolds are transposable elements: about 50% of the sequences were found to have similarity to known transposable elements. The content of repeat elements shorter than 200 bp is almost two-fold higher in heterochromatic scaffolds (8.7%) than in euchromatic scaffolds (4.7%).</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Y-linked scaffolds</p>
            </st>
            <p>Recently, four satellite DNA families were reported from the male-specific Y chromosome <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>; however, the complete list of scaffolds harboring these satellite sequences was not published. In the present study, 54 such scaffolds have been identified using BLASTN searches. All 54 scaffolds are considered here as Y-linked (Additional data file 2). They usually have short sequences and are composed entirely of repeats of a given satellite family. The few exceptions correspond to scaffolds with juxtaposed arrays of different satellite families or of a satellite DNA and a transposable element fragment. Scaffolds containing Y-linked satellite DNA have a total length of 134 kb. Including the Y-linked scaffold detected previously <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, the overall length of the Y chromosome scaffolds identified in the <it>A. gambiae </it>genome reaches only 182 kb, making it still the most poorly explored part of the genome. None of the Y-specific scaffolds have been physically mapped, as <it>in situ </it>hybridization experiments were conducted only on polytene chromosomes from ovarian nurse cells.</p>
         </sec>
         <sec>
            <st>
               <p>Assembly improvement in euchromatic regions</p>
            </st>
            <p>Analysis of BAC clones by <it>in situ </it>hybridization was used to assign four additional scaffolds (AAAB01008862, AAAB01008456, AAAB01008882, AAAB01008090) to the euchromatic chromosomal regions (Additional data file 1). In addition, scaffold AAAB01008838 was assigned to an inter-scaffold gap on 3L by <it>in situ </it>hybridization of cDNA clones. Scaffold AAAB01008882 was not included in the final 2L chromosome assembly because of possible misassembly.</p>
            <p>A BLASTN analysis of BESs was used to identify 94 BAC clones that mapped in the vicinity of the 36 inter-scaffold gaps between scaffolds placed on chromosomes. These BAC clones were further examined manually to identify those that spanned gaps. On chromosome 2R, 13 BAC clones were identified that covered 11 gaps. Two BAC clones were identified on 2L that spanned one gap, two BAC clones on 3R covered two gaps and 19 BAC clones were identified on 3L that crossed a total of eight gaps. No BAC clones were identified that covered gaps on the X chromosome (Additional data file 3). In total, 36 BAC clones were identified that could be used to sequence through 22 gaps in the <it>Anopheles </it>genome assembly. As discussed below, 12 of these 22 gaps have also been bridged by finding that adjacent scaffold ends appear to represent alternative assemblies of the same region (Table <tblr tid="T1">1</tblr>).</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Scaffolds from the current <it>Anopheles gambiae </it>genome golden path</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="left">
                        <p>No.</p>
                     </c>
                     <c ca="left">
                        <p>Scaffold accession number</p>
                     </c>
                     <c ca="center">
                        <p>Full length of scaffold*</p>
                     </c>
                     <c ca="center">
                        <p>Scaffold begin</p>
                     </c>
                     <c ca="center">
                        <p>Scaffold end</p>
                     </c>
                     <c ca="left">
                        <p>Assembly status to the next scaffold</p>
                     </c>
                     <c ca="left">
                        <p>BAC clones crossing gap between current and next scaffold</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>X chromosome</p>
                     </c>
                     <c ca="center">
                        <p>Telomere end</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008846</p>
                     </c>
                     <c ca="center">
                        <p>11308833</p>
                     </c>
                     <c ca="center">
                        <p>4C</p>
                     </c>
                     <c ca="center">
                        <p>1D</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008847</p>
                     </c>
                     <c ca="center">
                        <p>3715079</p>
                     </c>
                     <c ca="center">
                        <p>1D</p>
                     </c>
                     <c ca="center">
                        <p>5A</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008963</p>
                     </c>
                     <c ca="center">
                        <p>2230633</p>
                     </c>
                     <c ca="center">
                        <p>5A</p>
                     </c>
                     <c ca="center">
                        <p>5C</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>4</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008811</p>
                     </c>
                     <c ca="center">
                        <p>3062431</p>
                     </c>
                     <c ca="center">
                        <p>5C</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008973</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>600295</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008958</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>589940</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008852</p>
                     </c>
                     <c ca="center">
                        <p>409660</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008975</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>935344</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008885</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>267815</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>10</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008967</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>438965</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01007622</p>
                     </c>
                     <c ca="center">
                        <p>14705</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>12</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008976</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>589797</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>13</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008861</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>109611</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>X chromosome</p>
                     </c>
                     <c ca="center">
                        <p>Centromere end</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>2R chromosome</p>
                     </c>
                     <c ca="center">
                        <p>Telomere end</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>14</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008987</p>
                     </c>
                     <c ca="center">
                        <p>16222597</p>
                     </c>
                     <c ca="center">
                        <p>10D</p>
                     </c>
                     <c ca="center">
                        <p>7A</p>
                     </c>
                     <c ca="left">
                        <p>Bridged</p>
                     </c>
                     <c ca="left">
                        <p>17O20</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>15</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008799</p>
                     </c>
                     <c ca="center">
                        <p>2774677</p>
                     </c>
                     <c ca="center">
                        <p>11B</p>
                     </c>
                     <c ca="center">
                        <p>10D</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>16</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008859</p>
                     </c>
                     <c ca="center">
                        <p>12516315</p>
                     </c>
                     <c ca="center">
                        <p>13E</p>
                     </c>
                     <c ca="center">
                        <p>11B</p>
                     </c>
                     <c ca="left">
                        <p>Bridged</p>
                     </c>
                     <c ca="left">
                        <p>122O11</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>17</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008879</p>
                     </c>
                     <c ca="center">
                        <p>2921310</p>
                     </c>
                     <c ca="center">
                        <p>14A</p>
                     </c>
                     <c ca="center">
                        <p>14C</p>
                     </c>
                     <c ca="left">
                        <p>Joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>18</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008794</p>
                     </c>
                     <c ca="center">
                        <p>932688</p>
                     </c>
                     <c ca="center">
                        <p>14D</p>
                     </c>
                     <c ca="center">
                        <p>14D</p>
                     </c>
                     <c ca="left">
                        <p>Joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>19</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008982</p>
                     </c>
                     <c ca="center">
                        <p>1015562</p>
                     </c>
                     <c ca="center">
                        <p>14D</p>
                     </c>
                     <c ca="center">
                        <p>14E</p>
                     </c>
                     <c ca="left">
                        <p>Joined and bridged</p>
                     </c>
                     <c ca="left">
                        <p>155O10</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>20</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008904</p>
                     </c>
                     <c ca="center">
                        <p>1759265</p>
                     </c>
                     <c ca="center">
                        <p>14E</p>
                     </c>
                     <c ca="center">
                        <p>15B</p>
                     </c>
                     <c ca="left">
                        <p>Joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>21</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008851</p>
                     </c>
                     <c ca="center">
                        <p>2082253</p>
                     </c>
                     <c ca="center">
                        <p>15C</p>
                     </c>
                     <c ca="center">
                        <p>15C</p>
                     </c>
                     <c ca="left">
                        <p>Joined and bridged</p>
                     </c>
                     <c ca="left">
                        <p>179J17</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>22</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008820</p>
                     </c>
                     <c ca="center">
                        <p>590116</p>
                     </c>
                     <c ca="center">
                        <p>15D</p>
                     </c>
                     <c ca="center">
                        <p>15C</p>
                     </c>
                     <c ca="left">
                        <p>Joined and bridged</p>
                     </c>
                     <c ca="left">
                        <p>30L16</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>23</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008888</p>
                     </c>
                     <c ca="center">
                        <p>3396474</p>
                     </c>
                     <c ca="center">
                        <p>15D</p>
                     </c>
                     <c ca="center">
                        <p>16A</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>24</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008844</p>
                     </c>
                     <c ca="center">
                        <p>2866027</p>
                     </c>
                     <c ca="center">
                        <p>16B</p>
                     </c>
                     <c ca="center">
                        <p>16D</p>
                     </c>
                     <c ca="left">
                        <p>Joined and bridged</p>
                     </c>
                     <c ca="left">
                        <p>21H06</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>25</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008805</p>
                     </c>
                     <c ca="center">
                        <p>646796</p>
                     </c>
                     <c ca="center">
                        <p>16D</p>
                     </c>
                     <c ca="center">
                        <p>16D</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>26</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008862</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>212521</p>
                     </c>
                     <c ca="center">
                        <p>16D</p>
                     </c>
                     <c ca="center">
                        <p>16D</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>27</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008978</p>
                     </c>
                     <c ca="center">
                        <p>1934381</p>
                     </c>
                     <c ca="center">
                        <p>16D</p>
                     </c>
                     <c ca="center">
                        <p>17B</p>
                     </c>
                     <c ca="left">
                        <p>Bridged</p>
                     </c>
                     <c ca="left">
                        <p>124P12</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>28</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008817</p>
                     </c>
                     <c ca="center">
                        <p>1590424</p>
                     </c>
                     <c ca="center">
                        <p>17C</p>
                     </c>
                     <c ca="center">
                        <p>17C</p>
                     </c>
                     <c ca="left">
                        <p>Bridged</p>
                     </c>
                     <c ca="left">
                        <p>16N20, 105N12</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>29</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008880</p>
                     </c>
                     <c ca="center">
                        <p>4233641</p>
                     </c>
                     <c ca="center">
                        <p>18A</p>
                     </c>
                     <c ca="center">
                        <p>18C</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>30</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008898</p>
                     </c>
                     <c ca="center">
                        <p>4120773</p>
                     </c>
                     <c ca="center">
                        <p>18C</p>
                     </c>
                     <c ca="center">
                        <p>19C</p>
                     </c>
                     <c ca="left">
                        <p>Bridged</p>
                     </c>
                     <c ca="left">
                        <p>105P15</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>31</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008952</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1118246</p>
                     </c>
                     <c ca="center">
                        <p>19D</p>
                     </c>
                     <c ca="center">
                        <p>19C</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>32</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008961</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>516376</p>
                     </c>
                     <c ca="center">
                        <p>19D</p>
                     </c>
                     <c ca="center">
                        <p>19D</p>
                     </c>
                     <c ca="left">
                        <p>Bridged</p>
                     </c>
                     <c ca="left">
                        <p>174H20, 127O12</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>33</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008850</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>840256</p>
                     </c>
                     <c ca="center">
                        <p>19D</p>
                     </c>
                     <c ca="center">
                        <p>19D</p>
                     </c>
                     <c ca="left">
                        <p>Bridged</p>
                     </c>
                     <c ca="left">
                        <p>07F16</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>34</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008977</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>457753</p>
                     </c>
                     <c ca="center">
                        <p>19D</p>
                     </c>
                     <c ca="center">
                        <p>19D</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>35</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008949</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>335163</p>
                     </c>
                     <c ca="center">
                        <p>19E</p>
                     </c>
                     <c ca="center">
                        <p>19E</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>36</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008897</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>259841</p>
                     </c>
                     <c ca="center">
                        <p>19E</p>
                     </c>
                     <c ca="center">
                        <p>19E</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>2R chromosome</p>
                     </c>
                     <c ca="center">
                        <p>Centromere end</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>2L chromosome</p>
                     </c>
                     <c ca="center">
                        <p>Centromere end</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>37</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008942</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>373146</p>
                     </c>
                     <c ca="center">
                        <p>20A</p>
                     </c>
                     <c ca="center">
                        <p>20A</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>38</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008026</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>124951</p>
                     </c>
                     <c ca="center">
                        <p>20A</p>
                     </c>
                     <c ca="center">
                        <p>20A</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>39</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008864</p>
                     </c>
                     <c ca="center">
                        <p>318965</p>
                     </c>
                     <c ca="center">
                        <p>20B</p>
                     </c>
                     <c ca="center">
                        <p>20B</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>40</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008968</p>
                     </c>
                     <c ca="center">
                        <p>3184012</p>
                     </c>
                     <c ca="center">
                        <p>20D</p>
                     </c>
                     <c ca="center">
                        <p>20B</p>
                     </c>
                     <c ca="left">
                        <p>Bridged</p>
                     </c>
                     <c ca="left">
                        <p>01I16, 01J12</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>41</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008905</p>
                     </c>
                     <c ca="center">
                        <p>1027887</p>
                     </c>
                     <c ca="center">
                        <p>20D</p>
                     </c>
                     <c ca="center">
                        <p>20D</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>42</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008948</p>
                     </c>
                     <c ca="center">
                        <p>3345744</p>
                     </c>
                     <c ca="center">
                        <p>21A</p>
                     </c>
                     <c ca="center">
                        <p>21B</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>43</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008456</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>41134</p>
                     </c>
                     <c ca="center">
                        <p>21B</p>
                     </c>
                     <c ca="center">
                        <p>21B</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>44</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008827</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>28099</p>
                     </c>
                     <c ca="center">
                        <p>21B</p>
                     </c>
                     <c ca="center">
                        <p>21B</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>45</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008900</p>
                     </c>
                     <c ca="center">
                        <p>4906461</p>
                     </c>
                     <c ca="center">
                        <p>21C</p>
                     </c>
                     <c ca="center">
                        <p>21F</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>46</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008810</p>
                     </c>
                     <c ca="center">
                        <p>494023</p>
                     </c>
                     <c ca="center">
                        <p>21F</p>
                     </c>
                     <c ca="center">
                        <p>22A</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>47</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008960</p>
                     </c>
                     <c ca="center">
                        <p>23099915</p>
                     </c>
                     <c ca="center">
                        <p>22A</p>
                     </c>
                     <c ca="center">
                        <p>25D</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>48</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008807</p>
                     </c>
                     <c ca="center">
                        <p>12309988</p>
                     </c>
                     <c ca="center">
                        <p>28D</p>
                     </c>
                     <c ca="center">
                        <p>25D</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>2L chromosome</p>
                     </c>
                     <c ca="center">
                        <p>Telomere end</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>3R chromosome</p>
                     </c>
                     <c ca="center">
                        <p>Telomere end</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>49</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008964</p>
                     </c>
                     <c ca="center">
                        <p>12399987</p>
                     </c>
                     <c ca="center">
                        <p>30E</p>
                     </c>
                     <c ca="center">
                        <p>29A</p>
                     </c>
                     <c ca="left">
                        <p>Joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>50</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008944</p>
                     </c>
                     <c ca="center">
                        <p>6709423</p>
                     </c>
                     <c ca="center">
                        <p>30E</p>
                     </c>
                     <c ca="center">
                        <p>31D</p>
                     </c>
                     <c ca="left">
                        <p>Joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>51</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008984</p>
                     </c>
                     <c ca="center">
                        <p>12483120</p>
                     </c>
                     <c ca="center">
                        <p>32A</p>
                     </c>
                     <c ca="center">
                        <p>33D</p>
                     </c>
                     <c ca="left">
                        <p>Joined and bridged</p>
                     </c>
                     <c ca="left">
                        <p>11E04</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>52</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008835</p>
                     </c>
                     <c ca="center">
                        <p>1771096</p>
                     </c>
                     <c ca="center">
                        <p>33D</p>
                     </c>
                     <c ca="center">
                        <p>34A</p>
                     </c>
                     <c ca="left">
                        <p>Joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>53</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008797</p>
                     </c>
                     <c ca="center">
                        <p>1002333</p>
                     </c>
                     <c ca="center">
                        <p>34A</p>
                     </c>
                     <c ca="center">
                        <p>34B</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>54</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008839</p>
                     </c>
                     <c ca="center">
                        <p>2408169</p>
                     </c>
                     <c ca="center">
                        <p>34C</p>
                     </c>
                     <c ca="center">
                        <p>34B</p>
                     </c>
                     <c ca="left">
                        <p>Joined and bridged</p>
                     </c>
                     <c ca="left">
                        <p>08B11</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>55</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008980</p>
                     </c>
                     <c ca="center">
                        <p>16417966</p>
                     </c>
                     <c ca="center">
                        <p>34C</p>
                     </c>
                     <c ca="center">
                        <p>37D</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>56</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008822</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>56627</p>
                     </c>
                     <c ca="center">
                        <p>37D</p>
                     </c>
                     <c ca="center">
                        <p>37D</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>57</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008971</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>377729</p>
                     </c>
                     <c ca="center">
                        <p>37D</p>
                     </c>
                     <c ca="center">
                        <p>37D</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>58</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008943</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>173243</p>
                     </c>
                     <c ca="center">
                        <p>37D</p>
                     </c>
                     <c ca="center">
                        <p>37D</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>3R chromosome</p>
                     </c>
                     <c ca="center">
                        <p>Centromere end</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>3L chromosome</p>
                     </c>
                     <c ca="center">
                        <p>Centromere end</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>59</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008981</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>219224</p>
                     </c>
                     <c ca="center">
                        <p>38A</p>
                     </c>
                     <c ca="center">
                        <p>38A</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>60</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008985</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>236235</p>
                     </c>
                     <c ca="center">
                        <p>38A</p>
                     </c>
                     <c ca="center">
                        <p>38A</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>61</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008972</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>744379</p>
                     </c>
                     <c ca="center">
                        <p>38A</p>
                     </c>
                     <c ca="center">
                        <p>38A</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>62</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008957</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>222192</p>
                     </c>
                     <c ca="center">
                        <p>38A</p>
                     </c>
                     <c ca="center">
                        <p>38A</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>63</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008849</p>
                     </c>
                     <c ca="center">
                        <p>2994010</p>
                     </c>
                     <c ca="center">
                        <p>38C</p>
                     </c>
                     <c ca="center">
                        <p>38B</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>64</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008906</p>
                     </c>
                     <c ca="center">
                        <p>127247</p>
                     </c>
                     <c ca="center">
                        <p>38C</p>
                     </c>
                     <c ca="center">
                        <p>38C</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>65</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008795</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>347814</p>
                     </c>
                     <c ca="center">
                        <p>38C</p>
                     </c>
                     <c ca="center">
                        <p>38C</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>66</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008933</p>
                     </c>
                     <c ca="center">
                        <p>2255294</p>
                     </c>
                     <c ca="center">
                        <p>38C</p>
                     </c>
                     <c ca="center">
                        <p>39B</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>67</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008838</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>221210</p>
                     </c>
                     <c ca="center">
                        <p>39A</p>
                     </c>
                     <c ca="center">
                        <p>39A</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>68</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008796</p>
                     </c>
                     <c ca="center">
                        <p>270581</p>
                     </c>
                     <c ca="center">
                        <p>39B</p>
                     </c>
                     <c ca="center">
                        <p>39B</p>
                     </c>
                     <c ca="left">
                        <p>Bridged</p>
                     </c>
                     <c ca="left">
                        <p>22E23, 119N12</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>69</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008848</p>
                     </c>
                     <c ca="center">
                        <p>1927899</p>
                     </c>
                     <c ca="center">
                        <p>39B</p>
                     </c>
                     <c ca="center">
                        <p>39C</p>
                     </c>
                     <c ca="left">
                        <p>Joined and bridged</p>
                     </c>
                     <c ca="left">
                        <p>133F8, 12F24</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>70</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008979</p>
                     </c>
                     <c ca="center">
                        <p>1577277</p>
                     </c>
                     <c ca="center">
                        <p>40A</p>
                     </c>
                     <c ca="center">
                        <p>39C</p>
                     </c>
                     <c ca="left">
                        <p>Joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>71</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008951</p>
                     </c>
                     <c ca="center">
                        <p>359421</p>
                     </c>
                     <c ca="center">
                        <p>40A</p>
                     </c>
                     <c ca="center">
                        <p>40A</p>
                     </c>
                     <c ca="left">
                        <p>Joined and bridged</p>
                     </c>
                     <c ca="left">
                        <p>02H06</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>72</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008823</p>
                     </c>
                     <c ca="center">
                        <p>3392972</p>
                     </c>
                     <c ca="center">
                        <p>41A</p>
                     </c>
                     <c ca="center">
                        <p>40B</p>
                     </c>
                     <c ca="left">
                        <p>Joined and bridged</p>
                     </c>
                     <c ca="left">
                        <p>04P06, 10C06, 08F18</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>73</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008793</p>
                     </c>
                     <c ca="center">
                        <p>402616</p>
                     </c>
                     <c ca="center">
                        <p>41A</p>
                     </c>
                     <c ca="center">
                        <p>41A</p>
                     </c>
                     <c ca="left">
                        <p>Joined and bridged</p>
                     </c>
                     <c ca="left">
                        <p>160H13</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>74</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008804</p>
                     </c>
                     <c ca="center">
                        <p>907607</p>
                     </c>
                     <c ca="center">
                        <p>41B</p>
                     </c>
                     <c ca="center">
                        <p>41A</p>
                     </c>
                     <c ca="left">
                        <p>Joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>75</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008816</p>
                     </c>
                     <c ca="center">
                        <p>6058108</p>
                     </c>
                     <c ca="center">
                        <p>42B</p>
                     </c>
                     <c ca="center">
                        <p>41B</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>76</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008966</p>
                     </c>
                     <c ca="center">
                        <p>3863510</p>
                     </c>
                     <c ca="center">
                        <p>43B</p>
                     </c>
                     <c ca="center">
                        <p>42C</p>
                     </c>
                     <c ca="left">
                        <p>Joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>77</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008956</p>
                     </c>
                     <c ca="center">
                        <p>1048260</p>
                     </c>
                     <c ca="center">
                        <p>43B</p>
                     </c>
                     <c ca="center">
                        <p>43B</p>
                     </c>
                     <c ca="left">
                        <p>Joined and bridged</p>
                     </c>
                     <c ca="left">
                        <p>128N23</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>78</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>AAAB01008090</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>381451</p>
                     </c>
                     <c ca="center">
                        <p>43B</p>
                     </c>
                     <c ca="center">
                        <p>43B</p>
                     </c>
                     <c ca="left">
                        <p>Joined and bridged</p>
                     </c>
                     <c ca="left">
                        <p>131N20, 143K17, 23C10</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>79</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008834</p>
                     </c>
                     <c ca="center">
                        <p>2541584</p>
                     </c>
                     <c ca="center">
                        <p>43D</p>
                     </c>
                     <c ca="center">
                        <p>43B</p>
                     </c>
                     <c ca="left">
                        <p>Bridged</p>
                     </c>
                     <c ca="left">
                        <p>12N18, 08K01, 102F22, 172A24, 19C24, 23K03</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>80</p>
                     </c>
                     <c ca="left">
                        <p>AAAB01008986</p>
                     </c>
                     <c ca="center">
                        <p>12698247</p>
                     </c>
                     <c ca="center">
                        <p>46D</p>
                     </c>
                     <c ca="center">
                        <p>43D</p>
                     </c>
                     <c ca="left">
                        <p>Not joined</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>3L chromosome</p>
                     </c>
                     <c ca="center">
                        <p>Telomere end</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*This represents the original scaffold lengths. Where adjacent scaffolds overlapped, part of one of the scaffolds was designated as an alternative assembly and excluded from the chromosome assembly (see Additional data file 5). The 28 scaffolds shown in bold have been newly mapped or oriented.</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Detection of polymorphic and bacterial specific scaffolds</p>
            </st>
            <p>To identify scaffolds from polymorphic regions, unmapped scaffolds were aligned to the chromosome assemblies using the program exonerate <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>, allowing alignments to extend through gaps and possible insertions and deletions in the scaffolds. This revealed 144 scaffolds, of sizes between 15 kbp and 415 kbp and totaling 8.186 Mbp, that aligned over their entire length to a previously mapped, larger scaffold (Additional data file 4). The two aligned alternatives for these regions differed in sequence by between 1.2% and 4.6%, with 90% of the pairs showing sequence differences within the range 1.7% to 3.7%. Such scaffolds probably represent alternative assemblies of the chromosome region and indicate parts of the genome where two haplotypes may have been segregating within the sequenced PEST strain. Seven scaffolds from this list were also physically mapped to appropriate chromosomal locations.</p>
            <p>Because such alternative assemblies could also occur at the ends of adjacent scaffolds, physically mapped scaffolds were also examined to detect ends that represented alternative assemblies of the same region. Two approaches were taken. First, all scaffolds were compared to all other scaffolds using exonerate, and long alignments of high identity that involved scaffold ends were examined. All such alignments detected involved pairs of scaffold ends that had been placed next to one another on the chromosome by physical mapping. Second, adjacent scaffold ends were aligned with Dotter, and the alignments were inspected visually. A final list of 21 scaffold segments on 18 scaffolds considered to be alternative assemblies was prepared by inspection of the exonerate and Dotter alignments (Additional data file 5). All these cases were in regions of chromosome arms 2R, 3L or 3R that were previously proposed to be segregating for distinct haplotypes in the PEST strain <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. The range of sequence difference for the final set of aligned sequences was 2.0% to 4.0%. The segment with the higher quality sequence assembly, judged by number of gaps and separation of mapped BAC ends, was retained as part of the AgamP3 chromosome assembly. The lower quality segment was designated as an alternative assembly region (Figure <figr fid="F3">3</figr>). Approximately 3.53 Mbp of sequence was removed from the chromosomes by this approach and 20 gaps in the assembly were closed.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Example of joining scaffolds where adjacent ends are alternative assemblies of the same region</p>
               </caption>
               <text>
                  <p>Example of joining scaffolds where adjacent ends are alternative assemblies of the same region. <b>(a) </b>Using physical mapping techniques, scaffolds AAAB01008904 and AAAB0108851 are placed adjacent to one another on chromosome arm 2R. In the previous genome assembly, MOZ2, the scaffolds were placed with an arbitrary 10 kbp of gap between them. <b>(b) </b>After alignment of scaffolds using Exonerate and Dotter, it was clear that there was about 64 kbp of sequence overlap between the 3' end of AAAB01008904 and the 5' end of AAAB0108851. Based on BAC coverage of each scaffold and gaps in each of the scaffold sequences, we chose to keep the overlapping region from AAAB01008904 (base-pairs 1102797 to 1759265) and use it for the new chromosome assembly. <b>(c) </b>The corresponding overlapping region from AAAB0108851 (base-pairs 1 to 635373) was deemed to be an alternative assembly segment, with the rest of the scaffold kept as part of the chromosome assembly. The regions retained as parts of chromosome arm 2R were placed adjacent to each other with no inter-scaffold gap.</p>
               </text>
               <graphic file="gb-2007-8-1-r5-3"/>
            </fig>
            <p>Initial analyses revealed that some of the unmapped scaffolds harbored sequences with unexpectedly high similarity to bacterial proteins. Thirty-two such scaffolds were tested for their presence in the mosquito genome by PCR amplification of <it>A. gambiae </it>PEST strain genomic DNA from embryos using PCR primer pairs specific to these scaffolds (Additional data file 6). Despite repeated attempts, none of the primer pairs yielded any specific products. The combined evidence of high sequence similarity to bacterial genes and negative PCR results strongly suggested that the bacterial-like sequences constitute a contaminant of the <it>A. gambiae </it>genome assembly, rather than an integral part of the <it>A. gambiae </it>genome (data not shown). To identify all such potential bacterial scaffolds, the entire unmapped scaffold set was compared against NCBI's nr protein database. Scaffolds were identified as bacterial contaminants if they had no high similarity to other <it>A. gambiae </it>scaffolds and the top hits against the scaffold were only to bacterial proteins, with E values at least five orders of magnitude lower than those of any hits to proteins from eukaryotic organisms. A set of 679 scaffolds, totaling 1.97 Mbp, matched these criteria and are thus regarded as bacterial (Additional data file 6).</p>
            <p>The revised assembly (AgamP3) has a total of 80 scaffolds assigned to and ordered on the chromosome arms X, 2R, 2L, 3R and 3L (Table <tblr tid="T1">1</tblr>). The 28 scaffolds shown in bold have been newly mapped or oriented. In 10 cases, adjacent scaffolds are bridged by BAC clones that have their ends mapped to the two different scaffolds. In 20 cases adjacent scaffolds have been joined because their ends represent alternative assemblies of the same region; 12 of these joins are also supported by bridging BACs. Thus, three different approaches have proved valuable for improving the assembly of the genome: additional physical mapping, detailed <it>in silico </it>analysis of the scaffold sequences, and further mapping of BAC clone end sequences.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>The result of this work is an improved view of the <it>A. gambiae </it>genome assembly. In the sequencing and assembly phase of the <it>A. gambiae </it>genome project, a significant amount of the heterochromatic DNA was successfully cloned and sequenced <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. However, enrichment of the repetitive DNA in pericentromeric regions limited the initial effort to physically map these regions. Only 9 scaffolds with a total size of 3.3 Mbp were mapped to pericentromeric regions of the chromosomes. Figure <figr fid="F4">4a</figr> compares this updated version of the <it>A. gambiae </it>assembly (AgamP3) with the previous version <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. The most significant differences between these two versions are seen in the pericentromeric areas of all chromosomes. The updated version of the genome has 24 scaffolds with a total size of 8.64 Mbp in pericentromeric areas. These results are comparable with data obtained for the <it>D. melanogaster </it>genome. In the first publication of the fruit fly genome, only 3.8 Mbp were mapped to centromeric areas <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. Release 3 of the <it>Drosophila </it>whole-genome shotgun sequence assembly (WGS3) significantly extended the assembly into the centric heterochromatin; 20.7 Mbp of sequence was identified as heterochromatic <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. Both <it>Drosophila </it>and <it>Anopheles </it>genome assemblies have 16 large scaffolds with sizes larger than 250 kbp in the heterochromatic regions of the chromosomes.</p>
         <fig id="F4">
            <title>
               <p>Figure 4</p>
            </title>
            <caption>
               <p>A comparison of the initial and updated versions of the <it>Anopheles gambiae </it>genome assembly</p>
            </caption>
            <text>
               <p>A comparison of the initial and updated versions of the <it>Anopheles gambiae </it>genome assembly. <b>(a) </b>The scaffolds from the previous and updated versions of the genome are shown by gray and pink bars, respectively. Purple stripes on the scaffolds indicate alternative haplotype scaffolds with sizes bigger than 50 kbp. Black bars correspond to the BAC clones that cross inter-scaffold gaps. <b>(b) </b>The updated status of the <it>A. gambiae </it>genome project. Sectors correspond to the previously mapped scaffolds, additionally physically mapped scaffolds, alternative haplotype scaffolds, Y-specific scaffolds, bacterial contaminant scaffolds and the remaining scaffolds that are not assigned to the chromosomes.</p>
            </text>
            <graphic file="gb-2007-8-1-r5-4"/>
         </fig>
         <p>This AgamP3 assembly does not complete any of the centromeric regions on the chromosomes, and it is unclear if any of the scaffolds now mapped to centromeric regions actually include functional centromeric sequences. No large blocks of simple repeats appear in the scaffolds that have been mapped in heterochromatic regions. The amount of short repeats (shorter than 200 bp) in different heterochromatic scaffolds varies from 1% to 34%. The functional approximately 420 kbp <it>Drosophila </it>centromere is composed of large blocks of repeats (350 kbp) and more complex sequence composed of transposable elements <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. The situation is similar in <it>Arabidopsis </it>chromosomes, where the centromeric regions contain tandem 180 bp repeats with a total size of about 0.5 to 3 Mbp, and the surrounding area is enriched in moderate repeats and transposable elements <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp>. In neither case did the initial genome assembly reach the centromeric region, and special efforts were required for cloning and sequencing the centromeres.</p>
         <p>According to <it>in situ </it>results, the only telomeric region covered by scaffolds in the <it>A. gambiae </it>assembly is on the 2L arm. All three satellite sequences previously described as telomeric <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr></abbrgrp> have been identified in this scaffold. The <it>in situ </it>results for the distal-most BAC clones in the scaffolds closest to the telomeric regions on the other chromosomal arms showed that they are located several bands from the ends of the chromosomes.</p>
         <p>The gene content in areas around centromeres is comparable between <it>Anopheles</it>, <it>Drosophila </it>and <it>Arabidopsis </it>genomes. Gene density in the <it>Anopheles </it>genome is 5 per 100 kbp in euchromatic scaffolds, 2 per 100 kbp in pericentromeric and 0.2 per 100 kbp in the three most centromeric scaffolds. In the <it>Drosophila </it>genome the gene content is higher in euchromatin at 11 genes per 100 kbp <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> and the same at 2 per 100 kbp around the heterochromatin-euchromatin junction <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. <it>Arabidopsis </it>has an even higher gene content in euchromatic areas of about 25 per 100 kbp, 1.5 in the genetically identified centromeric region and 0.9 in the region enriched in repetitive elements <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. As in <it>Drosophila </it><abbrgrp><abbr bid="B37">37</abbr></abbrgrp> and <it>Arabidopsis </it><abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, the <it>Anopheles </it>genome does not have a sharp boundary between hetero- and euchromatin.</p>
         <p>Figure <figr fid="F4">4a</figr> shows 22 gaps between scaffolds in the <it>A. gambiae </it>genome that can be covered by additional sequencing of BAC clones, which would decrease the number of scaffolds in the genome assembly. The great progress in finishing the <it>Drosophila </it>genome has come as a result of the additional sequencing of overlapping BAC clones, sub-clones and PCR products <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. Release 3 of the <it>Drosophila </it>genome is represented by 13 scaffolds with a total of 37 sequencing gaps in the euchromatic portion of the genome.</p>
         <p>In the initial report of the <it>A. gambiae </it>genomic sequence, Holt <it>et al</it>. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> described considerable genetic variation within the PEST strain and suggested, partly on the basis of finding regions with very high single nucleotide polymorphism (SNP) density, that the PEST strain continued to segregate into two different haplotypes for certain regions of the genome. These regions would have derived from the divergent Mopti and Savanna chromosomal forms that contributed to the construction of the PEST strain. Thomasova <it>et al</it>. <abbrgrp><abbr bid="B39">39</abbr></abbrgrp> sequenced BAC clones in the <it>Pen1 </it>area of the PEST genome and found a 3.3% sequence difference in a 122 kbp region of BAC clone overlap, suggesting that this polymorphism in PEST was not simply an artifact of assembling a highly polymorphic colony. This study has identified 141 distinct scaffolds that probably represent alternative assemblies for regions totaling 8.2 Mbp, and an additional 3.5 Mbp previously mapped to chromosomes. Figure <figr fid="F4">4a</figr> shows the location of the alternative haplotype scaffolds with sizes bigger than 50 kbp. Adjacent pairs of scaffolds that have overlapping alternative assemblies on their ends are shown as single scaffolds on the picture of the new <it>A. gambiae </it>assembly.</p>
         <p>It remains possible that some of the sequences designated as alternative haplotype assemblies are actually real duplications. However, the regions identified overlap with those previously found to have high SNP densities, and the alternative assemblies for a region differed in sequence by between 1.2% and 4.6%, similar to the previously reported difference found from BAC sequencing <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. The identification of scaffolds that represent alternative assemblies enables duplicates to be removed from the set of scaffolds making up the genomic assembly and enables the elimination of artifactual genes from the predicted <it>A. gambiae </it>gene set. It will also facilitate initial studies of both non-coding and gene allelic differences between the two contributing chromosomal forms. It is important to note, however, that the two alternative assemblies of a region are unlikely to accurately represent the two alternative 'haplotypes' that may have been segregating in the PEST strain. Instead, the assembly process may produce two assemblies, both of which are a mosaic of the two haplotypes. Additional scaffolds or scaffold regions that represent alternative assemblies may still be present within the set described here as the revised genomic assembly AgamP3. In this study, 15 kbp was selected as the shortest alignment that could reliably be classified as two alternative assemblies, reasoning that smaller alignments could represent different transposons. In addition, some polymorphic regions may have been assembled as artifactual tandem duplications within a single scaffold <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>; this study has not attempted to eliminate such regions.</p>
         <p>The <it>A. gambiae </it>genomic sequences are expected to contain some level of contamination from bacteria, particularly from those found in the gut <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Currently, 679 scaffolds have been identified as apparently bacterial. However, the actual number of bacterial scaffolds within the <it>A. gambiae </it>assembly may be larger. The selected list includes only scaffolds with BLAST hits to bacterial sequences having a cutoff value of E = 10<sup>-15</sup>. It is likely that some scaffolds with lower sequence similarity to bacterial sequences currently available in GenBank are also of bacterial origin.</p>
         <p>Moreover, the assembly refinements described in this paper have a direct impact on the predicted genome-wide set of genes. The most recent gene set based on the previous assembly <abbrgrp><abbr bid="B40">40</abbr></abbrgrp> included 422 gene predictions on scaffolds or scaffold segments now classified as alternative assemblies. The scaffolds now designated as bacterial contaminants had 328 gene predictions already marked as of likely bacterial origin, and an additional 522 not so marked. Inspection of these showed that many had domains suggesting a likely bacterial origin, and none were unequivocally eukaryotic. Hence the first gene set based on the new assembly <abbrgrp><abbr bid="B41">41</abbr></abbrgrp> benefits from the removal of some duplicate predictions (artifactual paralogues) for genes represented in two alternative assemblies of the same chromosome region and from the absence of predictions derived from bacterial contaminants.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>Use of cDNA and BAC clones and PCR amplified gene-fragments as probes for <it>in situ </it>hybridization and additional <it>in silico </it>analysis of the scaffold sequences have led to an overall improvement of the <it>A. gambiae </it>genome assembly. A total of about 15 Mbp has been added to the mapped part of the genome, about 2 Mbp has been removed as bacterial specific and about 12 Mbp has been reclassified as probable alternative assemblies (Figure <figr fid="F4">4b</figr>). One-third of the previously unmapped portion of the <it>A. gambiae </it>genome has been assigned to a chromosomal location. Removal of the probable bacterial and alternative assembly scaffolds has reduced the genome from the original total of 278 Mbp in the MOZ2 assembly to 264 Mbp (without gaps), which is much closer to the C<sub>o</sub>T estimate of 260 Mbp of Besansky and Powell <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. Moreover, even this new genome size estimate is likely to be somewhat inflated because of residual, small haplotype and bacterial contamination scaffolds. While the AgamP3 assembly can clearly be improved by sequencing BAC clones that cross inter-scaffold gaps or a careful analysis of mate pair violations among the genomic DNA clones sequenced in the original genome project, the upcoming genome projects for the <it>A. gambiae </it>S and M molecular forms <abbrgrp><abbr bid="B42">42</abbr></abbrgrp> are almost certain to produce a significantly improved assembly for one or both of these two new <it>A. gambiae </it>genomes.</p>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <sec>
            <st>
               <p>In situ hybridization</p>
            </st>
            <p>Three different types of probe were used for <it>in situ </it>hybridization: cDNA clones, BAC clones and PCR amplified genes. The cDNA clones were selected from the ANGNAP1 library using BLASTN searching against each of the unmapped and unoriented scaffolds. A pair of cDNA clones from genes on opposite ends of the scaffold and each with a single location in the genome were considered as the candidates for <it>in situ </it>hybridization. BAC clones from ND-1 <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> or ND-Tam <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> libraries and PCR amplified gene-fragments were identified within the scaffolds using the mappings displayed in the VectorBase genome viewer <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. Primer design for gene-fragment amplification was done using the Primer3 program. The cDNA probes were prepared using the FastPlasmid Mini kit (Eppendorf, Hamburg, Germany) and BAC clone DNA was isolated by standard procedures <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>. Both types of probe were labeled with Cy3-AP3-dUTP or Cy5-AP3-dUTP (GE Healthcare UK Ltd (formerly Amersham Biosciences Corp.), Little Chalfont, Buckinghamshire, UK) using the GIBCO BRL Nick Translation labeling system. Amplified gene-fragment DNA was prepared by standard PCR amplification procedures <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>. To prevent non-specific amplification, we used an appropriate BAC clone DNA as a template for PCR. PCR product was labeled with Cy3-AP3-dUTP or Cy5-AP3-dUTP (GE Healthcare UK Ltd) using the Random Primers DNA Labeling System (Invitrogen, Karlsruhe, Germany). To obtain polytene chromosome preparations, ovaries from the SUA strain at the appropriate stage were dissected into fresh Carnoy's solution (ethanol:glacial acetic acid, 3:1). Ovaries were gently pressed with a cover slip in 50% propionic acid, dipped in liquid nitrogen, then cover slips were removed and slides were dehydrated in 50%, 70%, 95%, and 100% ethanol. DNA probes were hybridized to the chromosomes by standard procedures <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> and then chromosomes were washed in 0.2× SSC, counterstained with YOYO-1 and mounted in DABCO <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. Fluorescent signals were detected using a Bio-Rad MRC 1024 Scanning Confocal System (Bio-Rad Laboratories, Hercules, CA, USA).</p>
         </sec>
         <sec>
            <st>
               <p>Estimation of gene, transposable element and short repeat density in scaffolds</p>
            </st>
            <p>The genomic sequences of <it>A. gambiae </it>scaffolds were downloaded from the VectorBase website <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. For the estimation of gene density in scaffolds, 12,600 assembled expressed sequence tags from a Normalized Head library, Normalized Fat Body library and a pooled library were placed on scaffolds using BLAST, requiring an E value of &lt;10<sup>-20</sup>. For the analysis of the content of simple repeats shorter than 200 bp, we used Tandem Repeats Finder <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>. The percentage of sequences corresponding to the known transposable elements was found using the RepeatMasker program <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. Two custom databases were used for the search: the database of the <it>A. gambiae </it>transposable elements (Maria Sharakhova and Frank Collins, unpublished data) and the database of natural transposable element sequences identified in <it>D. melanogaster </it>by M Ashburner <it>et al</it>. <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Identification of the Y-specific scaffolds</p>
            </st>
            <p>Scaffolds containing Y chromosome-specific satellite DNA families were regarded as Y-linked. They were identified using a randomly selected monomer sequence from each of the four Y-specific satellite DNA families <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> as a query in local BLASTN searches against the <it>A. gambiae </it>genome database.</p>
         </sec>
         <sec>
            <st>
               <p>Finding BAC clones bridging physical gaps between scaffolds</p>
            </st>
            <p>To identify BAC clones spanning gaps, BLASTN was used. <it>A. gambiae </it>BAC end sequences (BESs) that demonstrated significant similarity (E &lt; 10<sup>-50</sup>) to scaffolds on either side of gaps were selected. From this pool, BACs were identified as crossing gaps if paired BESs fulfilled the following criteria: matching the <it>A. gambiae </it>genomic sequence with an E value of &lt;10<sup>-75</sup>, having the appropriate relative orientation, and preferably not being repeated on the scaffolds. Where more than one BAC crossed a gap, the sequence length between each BES and the gap was determined using the VectorBase genome viewer to identify the shortest bridging BAC.</p>
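            <p>The selection rule described above can be sketched as follows. This is an illustrative outline only, not the procedure actually used in this study: the record layout (<b>EndHit</b>), the function names and the use of summed end-to-gap distances as a proxy for the shortest BAC are all assumptions, and the preference for non-repeated BESs is not modeled.</p>

```python
from dataclasses import dataclass

MAX_EVALUE = 1e-75  # E-value threshold for paired BESs, taken from the text


@dataclass
class EndHit:
    """Placement of one BAC end sequence (hypothetical record layout)."""
    scaffold: str          # scaffold on which the BES was placed
    evalue: float          # BLASTN E value of the placement
    distance_to_gap: int   # bases between the BES and the edge of the gap
    points_to_gap: bool    # True when the BES reads towards the gap


def bridges_gap(end_a, end_b, left, right):
    """Return True when the two ends of one BAC flank the gap between left and right."""
    on_both_sides = {end_a.scaffold, end_b.scaffold} == {left, right}
    significant = MAX_EVALUE > end_a.evalue and MAX_EVALUE > end_b.evalue
    oriented = end_a.points_to_gap and end_b.points_to_gap
    return on_both_sides and significant and oriented


def shortest_bridge(candidates, left, right):
    """Pick the bridging BAC with the least scaffold sequence between its ends and the gap.

    candidates maps a BAC name to its pair of EndHit records.
    Returns None when no candidate bridges the gap.
    """
    spans = {
        name: a.distance_to_gap + b.distance_to_gap
        for name, (a, b) in candidates.items()
        if bridges_gap(a, b, left, right)
    }
    return min(spans, key=spans.get) if spans else None
```
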
         </sec>
         <sec>
            <st>
               <p>Identification of polymorphic and bacterial contaminant scaffolds</p>
            </st>
            <p>To detect polymorphic scaffolds in the <it>A. gambiae </it>genome, unmapped scaffolds greater than 15 kbp in length were mapped to the release 2 chromosome assemblies using the program exonerate <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Alignment was carried out with exonerate's non-equivalence region (NER) model, which permits alignments to continue across the gaps within scaffolds and across possible insertions and deletions. Scaffolds were identified as representing putative alternative assemblies for a region if >97% of the unmapped scaffold was aligned to a chromosome region (where this figure includes any segments treated as non-equivalence regions) and there was >95% sequence identity in the aligned segments. A similar approach was used to identify ends of mapped scaffolds that might represent alternative assemblies of the same region. Selected scaffolds were also aligned and examined using Dotter <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. Scaffolds were identified as derived from contaminating bacterial DNA if they were unmapped and if the scaffolds appeared to encode only proteins of prokaryotic origin. Scaffold sequences were compared with all proteins in GenBank using BLASTX and were designated as bacterial if they had a hit to a prokaryotic protein with an E value that was &lt;10<sup>-15</sup> and at least 5 orders of magnitude lower than that of the best hit to a eukaryotic protein. To assess the risk of false positive results, we took scaffolds previously mapped to the <it>A. gambiae </it>chromosome arms, broke them into pieces of size equal to the average length of all putative contaminant scaffolds, and then searched them for prokaryotic-like proteins in the same manner. None of the mapped scaffolds would have been designated as bacterial by this procedure.</p>
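            <p>The two numerical rules stated above can be written compactly as shown below. This is a minimal sketch under stated assumptions, not the code used in this study: the function and constant names are hypothetical, the inputs are assumed to have been extracted already from exonerate and BLASTX output, and the additional requirements that a scaffold be unmapped and lack high similarity to other <it>A. gambiae </it>scaffolds are not modeled.</p>

```python
import math

# Thresholds taken from the text; every name in this block is hypothetical.
MAX_PROKARYOTE_EVALUE = 1e-15   # best prokaryotic hit must fall below this
MIN_LOG10_MARGIN = 5.0          # orders of magnitude separating the two best hits
MIN_ALIGNED_FRACTION = 0.97     # fraction of the unmapped scaffold that is aligned
MIN_IDENTITY = 0.95             # sequence identity within the aligned segments


def is_alternative_assembly(aligned_fraction, identity):
    """Apply the two alignment thresholds for a putative alternative assembly."""
    return aligned_fraction > MIN_ALIGNED_FRACTION and identity > MIN_IDENTITY


def is_bacterial(best_prok_evalue, best_euk_evalue=None):
    """Apply the E-value rule for a putative bacterial contaminant.

    best_euk_evalue is None when the scaffold has no eukaryotic hit at all.
    """
    if not MAX_PROKARYOTE_EVALUE > best_prok_evalue:
        return False
    if best_euk_evalue is None:
        return True
    if best_euk_evalue == 0.0:
        # Nothing can be five orders of magnitude below an E value of zero.
        return False
    if best_prok_evalue == 0.0:
        # An E value reported as 0.0 is treated as arbitrarily small.
        return True
    margin = math.log10(best_euk_evalue) - math.log10(best_prok_evalue)
    return margin >= MIN_LOG10_MARGIN
```

            <p>Comparing the hits on a log<sub>10</sub> scale expresses the "orders of magnitude" margin directly and avoids multiplying very small floating-point numbers.</p>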
         </sec>
      </sec>
      <sec>
         <st>
            <p>Additional data files</p>
         </st>
         <p>The following additional data are available with the online version of this paper. Additional data file <supplr sid="S1">1</supplr> contains a table listing scaffolds mapped to chromosomes. Additional data file <supplr sid="S2">2</supplr> contains a table listing Y chromosome scaffolds. Additional data file <supplr sid="S3">3</supplr> contains a table listing BAC clones that span scaffold gaps. Additional data file <supplr sid="S4">4</supplr> contains a table listing scaffolds that represent alternative assemblies. Additional data file <supplr sid="S5">5</supplr> contains a table that lists segments of joined scaffolds that represent alternative assemblies of adjacent mapped scaffolds. Additional data file <supplr sid="S6">6</supplr> contains a table that lists bacterial specific scaffolds.</p>
         <suppl id="S1">
            <title>
               <p>Additional data file 1</p>
            </title>
            <caption>
               <p>Scaffolds additionally mapped to the chromosomes</p>
            </caption>
            <text>
               <p>Scaffold AAAB01008882 was not included in the final 2L chromosome assembly because of concerns about possible mis-assembly of part of the scaffold.</p>
            </text>
            <file name="gb-2007-8-1-r5-S1.xls">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S2">
            <title>
               <p>Additional data file 2</p>
            </title>
            <caption>
               <p>Scaffolds mapped to the Y chromosome</p>
            </caption>
            <text>
               <p>Scaffolds mapped to the Y chromosome.</p>
            </text>
            <file name="gb-2007-8-1-r5-S2.doc">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S3">
            <title>
               <p>Additional data file 3</p>
            </title>
            <caption>
               <p>BAC clones that span gaps between scaffolds</p>
            </caption>
            <text>
               <p>BAC clones that span gaps between scaffolds.</p>
            </text>
            <file name="gb-2007-8-1-r5-S3.xls">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S4">
            <title>
               <p>Additional data file 4</p>
            </title>
            <caption>
               <p>Scaffolds representing alternative assemblies</p>
            </caption>
            <text>
               <p>Scaffolds representing alternative assemblies.</p>
            </text>
            <file name="gb-2007-8-1-r5-S4.xls">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S5">
            <title>
               <p>Additional data file 5</p>
            </title>
            <caption>
               <p>Segments of the joined scaffolds that represent alternative assemblies of the adjacent mapped scaffolds</p>
            </caption>
            <text>
               <p>Chr: the chromosome that the scaffolds are mapped to. Alt_Scaffold: the scaffold with sequence overlap considered to be an alternative assembly to the golden path. Chr_Scaffold: the scaffold with sequence overlap used for the golden path. as_start: start of the overlap on the scaffold with the alternative assembly segment (Alt_Scaffold). as_end: end of the overlap on the Alt_Scaffold. cs_start: start of the overlap on the scaffold where the overlap segment is used for the golden path (Chr_Scaffold). cs_end: end of the overlap on the Chr_Scaffold. Chr_Start: start of the overlap region, in chromosomal coordinates. Chr_End: end of the overlap region, in chromosomal coordinates. as_to_cs: orientation of the Alt_Scaffold with respect to the Chr_Scaffold. as_to_chr: orientation of the Alt_Scaffold with respect to the chromosome.</p>
            </text>
            <file name="gb-2007-8-1-r5-S5.xls">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S6">
            <title>
               <p>Additional data file 6</p>
            </title>
            <caption>
               <p>Bacteria-specific scaffolds</p>
            </caption>
            <text>
               <p>Bacteria-specific scaffolds.</p>
            </text>
            <file name="gb-2007-8-1-r5-S6.doc">
               <p>Click here for file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Marcelo Bento Soares (University of Iowa) for helping us with construction and normalization of the ANGNAP1 cDNA library. This work was supported by NIAID cooperative agreement U01 AI 48846 and by the NIAID VectorBase Bioinformatics Resource Center contract number HHSN 266200400039C to FHC.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>The genome sequence of the malaria mosquito <it>Anopheles gambiae</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Holt</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Subramanian</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Halpern</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sutton</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Charlab</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Nusskern</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Wincker</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Clark</snm>
                  <fnm>AG</fnm>
               </au>
               <au>
                  <snm>Ribeiro</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Wides</snm>
                  <fnm>R</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>298</volume>
            <fpage>129</fpage>
            <lpage>149</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1076181</pubid>
                  <pubid idtype="pmpid" link="fulltext">12364791</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>The <it>Anopheles gambiae </it>genome: an update.</p>
            </title>
            <aug>
               <au>
                  <snm>Mongin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Louis</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Holt</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Collins</snm>
                  <fnm>FH</fnm>
               </au>
            </aug>
            <source>Trends Parasitol</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <fpage>49</fpage>
            <lpage>52</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.pt.2003.11.003</pubid>
                  <pubid idtype="pmpid" link="fulltext">14747013</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>The biology of heterochromatin.</p>
            </title>
            <aug>
               <au>
                  <snm>John</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Heterochromatin: Molecular and Structural Aspects</source>
            <publisher>Cambridge: Cambridge University Press</publisher>
            <editor>Verma RS</editor>
            <pubdate>1988</pubdate>
            <fpage>1</fpage>
            <lpage>147</lpage>
         </bibl>
         <bibl id="B4">
            <title>
               <p>The beta heterochromatic sequences flanking the I elements are themselves defective transposable elements.</p>
            </title>
            <aug>
               <au>
                  <snm>Vaury</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Bucheton</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Pelisson</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Chromosoma</source>
            <pubdate>1989</pubdate>
            <volume>98</volume>
            <fpage>215</fpage>
            <lpage>224</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/BF00329686</pubid>
                  <pubid idtype="pmpid">2555116</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Heterochromatin and gene expression in <it>Drosophila</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Weiler</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Wakimoto</snm>
                  <fnm>BT</fnm>
               </au>
            </aug>
            <source>Annu Rev Genet</source>
            <pubdate>1995</pubdate>
            <volume>29</volume>
            <fpage>577</fpage>
            <lpage>605</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.ge.29.120195.003045</pubid>
                  <pubid idtype="pmpid" link="fulltext">8825487</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>&#220;ber &#945;- und &#946;-Heterochromatin sowie Konstanz und Bau der Chromomeren bei <it>Drosophila</it></p>
            </title>
            <aug>
               <au>
                  <snm>Heitz</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Biol Zentbl</source>
            <pubdate>1934</pubdate>
            <volume>54</volume>
            <fpage>588</fpage>
            <lpage>609</lpage>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Polytene chromosomes, heterochromatin, and position effect variegation.</p>
            </title>
            <aug>
               <au>
                  <snm>Zhimulev</snm>
                  <fnm>IF</fnm>
               </au>
            </aug>
            <source>Adv Genet</source>
            <pubdate>1998</pubdate>
            <volume>37</volume>
            <fpage>1</fpage>
            <lpage>566</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9352629</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Requirement of heterochromatin for cohesion at centromeres.</p>
            </title>
            <aug>
               <au>
                  <snm>Bernard</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Maure</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Partridge</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Genier</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Javerzat</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Allshire</snm>
                  <fnm>RC</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2001</pubdate>
            <volume>294</volume>
            <fpage>2539</fpage>
            <lpage>2542</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1064027</pubid>
                  <pubid idtype="pmpid" link="fulltext">11598266</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Direct evidence of a role for heterochromatin in meiotic chromosome segregation.</p>
            </title>
            <aug>
               <au>
                  <snm>Dernburg</snm>
                  <fnm>AF</fnm>
               </au>
               <au>
                  <snm>Sedat</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Hawley</snm>
                  <fnm>RS</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1996</pubdate>
            <volume>86</volume>
            <fpage>135</fpage>
            <lpage>146</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0092-8674(00)80084-7</pubid>
                  <pubid idtype="pmpid" link="fulltext">8689681</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Nuclear dynamics: where genes are and how they got there.</p>
            </title>
            <aug>
               <au>
                  <snm>Swedlow</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Lamond</snm>
                  <fnm>AI</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2001</pubdate>
            <volume>2</volume>
            <fpage>REVIEWS0002</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">138913</pubid>
                  <pubid idtype="pmpid" link="fulltext">11276427</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Order and disorder in the nucleus.</p>
            </title>
            <aug>
               <au>
                  <snm>Marshall</snm>
                  <fnm>WF</fnm>
               </au>
            </aug>
            <source>Curr Biol</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>R185</fpage>
            <lpage>192</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0960-9822(02)00724-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">11882311</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Heterochromatin function in complex genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Henikoff</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Biochim Biophys Acta</source>
            <pubdate>2000</pubdate>
            <volume>1470</volume>
            <fpage>O1</fpage>
            <lpage>8</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10656988</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Heterochromatin: new possibilities for the inheritance of structure.</p>
            </title>
            <aug>
               <au>
                  <snm>Grewal</snm>
                  <fnm>SI</fnm>
               </au>
               <au>
                  <snm>Elgin</snm>
                  <fnm>SC</fnm>
               </au>
            </aug>
            <source>Curr Opin Genet Dev</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>178</fpage>
            <lpage>187</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-437X(02)00284-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">11893491</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>A BAC-based physical map of the major autosomes of <it>Drosophila melanogaster</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Hoskins</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>CR</fnm>
               </au>
               <au>
                  <snm>Berman</snm>
                  <fnm>BP</fnm>
               </au>
               <au>
                  <snm>Laverty</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>George</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Ciesiolka</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Naeemuddin</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Arenson</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Durbin</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>David</snm>
                  <fnm>RG</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>287</volume>
            <fpage>2271</fpage>
            <lpage>2274</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.287.5461.2271</pubid>
                  <pubid idtype="pmpid" link="fulltext">10731150</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Y chromosome and other heterochromatic sequences of the <it>Drosophila melanogaster </it>genome: how far can we go?</p>
            </title>
            <aug>
               <au>
                  <snm>Carvalho</snm>
                  <fnm>AB</fnm>
               </au>
               <au>
                  <snm>Vibranovski</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Carlson</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Celniker</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Hoskins</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Sutton</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Adams</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Clark</snm>
                  <fnm>AG</fnm>
               </au>
            </aug>
            <source>Genetica</source>
            <pubdate>2003</pubdate>
            <volume>117</volume>
            <fpage>227</fpage>
            <lpage>237</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1023/A:1022900313650</pubid>
                  <pubid idtype="pmpid" link="fulltext">12723702</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Genetic definition and sequence analysis of <it>Arabidopsis </it>centromeres.</p>
            </title>
            <aug>
               <au>
                  <snm>Copenhaver</snm>
                  <fnm>GP</fnm>
               </au>
               <au>
                  <snm>Nickel</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kuromori</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Benito</snm>
                  <fnm>MI</fnm>
               </au>
               <au>
                  <snm>Kaul</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Bevan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Murphy</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Harris</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Parnell</snm>
                  <fnm>LD</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>1999</pubdate>
            <volume>286</volume>
            <fpage>2468</fpage>
            <lpage>2474</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.286.5449.2468</pubid>
                  <pubid idtype="pmpid" link="fulltext">10617454</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Using <it>Arabidopsis </it>to understand centromere function: progress and prospects.</p>
            </title>
            <aug>
               <au>
                  <snm>Copenhaver</snm>
                  <fnm>GP</fnm>
               </au>
            </aug>
            <source>Chromosome Res</source>
            <pubdate>2003</pubdate>
            <volume>11</volume>
            <fpage>255</fpage>
            <lpage>262</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1023/A:1022887926807</pubid>
                  <pubid idtype="pmpid" link="fulltext">12769292</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Lessons from the human genome: transitions between euchromatin and heterochromatin.</p>
            </title>
            <aug>
               <au>
                  <snm>Horvath</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Bailey</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Locke</snm>
                  <fnm>DP</fnm>
               </au>
               <au>
                  <snm>Eichler</snm>
                  <fnm>EE</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>2001</pubdate>
            <volume>10</volume>
            <fpage>2215</fpage>
            <lpage>2223</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/10.20.2215</pubid>
                  <pubid idtype="pmpid" link="fulltext">11673404</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>The structure and evolution of centromeric transition regions within the human genome.</p>
            </title>
            <aug>
               <au>
                  <snm>She</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Horvath</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Jiang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Furey</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Christ</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Clark</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Graves</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Gulden</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>Alkan</snm>
                  <fnm>C</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2004</pubdate>
            <volume>430</volume>
            <fpage>857</fpage>
            <lpage>864</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature02806</pubid>
                  <pubid idtype="pmpid" link="fulltext">15318213</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>The genome sequence of <it>Drosophila melanogaster</it></p>
            </title>
            <aug>
               <au>
                  <snm>Adams</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Celniker</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Holt</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Evans</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Gocayne</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Amanatides</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Scherer</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Hoskins</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Galle</snm>
                  <fnm>RF</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>287</volume>
            <fpage>2185</fpage>
            <lpage>2195</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.287.5461.2185</pubid>
                  <pubid idtype="pmpid" link="fulltext">10731132</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Reassociation kinetics of <it>Anopheles gambiae </it>(<it>Diptera: Culicidae</it>) DNA.</p>
            </title>
            <aug>
               <au>
                  <snm>Besansky</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>Powell</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>J Med Entomol</source>
            <pubdate>1992</pubdate>
            <volume>29</volume>
            <fpage>125</fpage>
            <lpage>128</lpage>
            <xrefbib>
               <pubid idtype="pmpid">1552521</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Mosquito genomes: structure, organization, and evolution.</p>
            </title>
            <aug>
               <au>
                  <snm>Rai</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Black</snm>
                  <fnm>WC</fnm>
                  <suf>IV</suf>
               </au>
            </aug>
            <source>Advances in Genetics</source>
            <publisher>San Diego: Academic Press, a Harcourt Science and Technology Company</publisher>
            <editor>Hall JC, Dunlap JC, Friedmann T, Giannelli F</editor>
            <pubdate>1999</pubdate>
            <volume>41</volume>
            <fpage>2</fpage>
            <lpage>33</lpage>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Isolation and characterization of Y chromosome sequences from the African malaria mosquito <it>Anopheles gambiae</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Krzywinski</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Nusskern</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Kern</snm>
                  <fnm>MK</fnm>
               </au>
               <au>
                  <snm>Besansky</snm>
                  <fnm>NJ</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2004</pubdate>
            <volume>166</volume>
            <fpage>1291</fpage>
            <lpage>1302</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1470776</pubid>
                  <pubid idtype="pmpid" link="fulltext">15082548</pubid>
                  <pubid idtype="doi">10.1534/genetics.166.3.1291</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Satellite DNA from the Y chromosome of the malaria vector <it>Anopheles gambiae</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Krzywinski</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Sangare</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Besansky</snm>
                  <fnm>NJ</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2005</pubdate>
            <volume>169</volume>
            <fpage>185</fpage>
            <lpage>196</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1448884</pubid>
                  <pubid idtype="pmpid" link="fulltext">15466420</pubid>
                  <pubid idtype="doi">10.1534/genetics.104.034264</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Gene finding on the Y: fruitful strategy in <it>Drosophila </it>does not deliver in <it>Anopheles</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Krzywinski</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Chrystal</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Besansky</snm>
                  <fnm>NJ</fnm>
               </au>
            </aug>
            <source>Genetica</source>
            <pubdate>2006</pubdate>
            <volume>126</volume>
            <fpage>369</fpage>
            <lpage>375</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s10709-005-1985-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">16636930</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Construction of a BAC library and generation of BAC end sequence-tagged connectors for genome sequencing of the African malaria mosquito <it>Anopheles gambiae</it></p>
            </title>
            <aug>
               <au>
                  <snm>Hong</snm>
                  <fnm>YS</fnm>
               </au>
               <au>
                  <snm>Hogan</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Sarkar</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sim</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Loftus</snm>
                  <fnm>BJ</fnm>
               </au>
               <au>
                  <snm>Ren</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Huff</snm>
                  <fnm>ER</fnm>
               </au>
               <au>
                  <snm>Carlile</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Black</snm>
                  <fnm>K</fnm>
               </au>
               <etal/>
            </aug>
            <source>Mol Genet Genomics</source>
            <pubdate>2003</pubdate>
            <volume>268</volume>
            <fpage>720</fpage>
            <lpage>728</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12655398</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Chromosomal differentiation and adaptation to human environments in the <it>Anopheles gambiae </it>complex.</p>
            </title>
            <aug>
               <au>
                  <snm>Coluzzi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sabatini</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Petrarca</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Di Deco</snm>
                  <fnm>MA</fnm>
               </au>
            </aug>
            <source>Trans R Soc Trop Med Hyg</source>
            <pubdate>1979</pubdate>
            <volume>73</volume>
            <fpage>483</fpage>
            <lpage>497</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0035-9203(79)90036-1</pubid>
                  <pubid idtype="pmpid">394408</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>A polytene chromosome analysis of the <it>Anopheles gambiae </it>species complex.</p>
            </title>
            <aug>
               <au>
                  <snm>Coluzzi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sabatini</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>della Torre</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Di Deco</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Petrarca</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>298</volume>
            <fpage>1415</fpage>
            <lpage>1418</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1077769</pubid>
                  <pubid idtype="pmpid" link="fulltext">12364623</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>The distribution and inversion polymorphism of chromosomally recognized taxa of the <it>Anopheles gambiae </it>complex in Mali, West Africa.</p>
            </title>
            <aug>
               <au>
                  <snm>Toure</snm>
                  <fnm>YT</fnm>
               </au>
               <au>
                  <snm>Petrarca</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Traore</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Coulibaly</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Maiga</snm>
                  <fnm>HM</fnm>
               </au>
               <au>
                  <snm>Sankare</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Sow</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Di Deco</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Coluzzi</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Parassitologia</source>
            <pubdate>1998</pubdate>
            <volume>40</volume>
            <fpage>477</fpage>
            <lpage>511</lpage>
            <xrefbib>
               <pubid idtype="pmpid">10645562</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Molecular evidence of incipient speciation within <it>Anopheles gambiae </it>s.s. in West Africa.</p>
            </title>
            <aug>
               <au>
                  <snm>della Torre</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Fanello</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Akogbeto</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dossou-yovo</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Favia</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Petrarca</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Coluzzi</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Insect Mol Biol</source>
            <pubdate>2001</pubdate>
            <volume>10</volume>
            <fpage>9</fpage>
            <lpage>18</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2583.2001.00235.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">11240632</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>On the distribution and genetic differentiation of <it>Anopheles gambiae </it>s.s. molecular forms.</p>
            </title>
            <aug>
               <au>
                  <snm>della Torre</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Tu</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Petrarca</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Insect Biochem Mol Biol</source>
            <pubdate>2005</pubdate>
            <volume>35</volume>
            <fpage>755</fpage>
            <lpage>769</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.ibmb.2005.02.006</pubid>
                  <pubid idtype="pmpid" link="fulltext">15894192</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Automated generation of heuristics for biological sequence comparison.</p>
            </title>
            <aug>
               <au>
                  <snm>Slater</snm>
                  <fnm>GS</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <fpage>31</fpage>
            <lpage/>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/1471-2105-6-31</pubid>
                  <pubid idtype="pmpid" link="fulltext">15713233</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Heterochromatic sequences in a <it>Drosophila </it>whole-genome shotgun assembly.</p>
            </title>
            <aug>
               <au>
                  <snm>Hoskins</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Carlson</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Carvalho</snm>
                  <fnm>AB</fnm>
               </au>
               <au>
                  <snm>Halpern</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kaminker</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Kennedy</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Mungall</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Sullivan</snm>
                  <fnm>BA</fnm>
               </au>
               <au>
                  <snm>Sutton</snm>
                  <fnm>GG</fnm>
               </au>
               <etal/>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>research0085.1</fpage>
            <lpage>0085.16</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1186/gb-2002-3-12-research0085</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Sequence analysis of a functional <it>Drosophila </it>centromere.</p>
            </title>
            <aug>
               <au>
                  <snm>Sun</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Le</snm>
                  <fnm>HD</fnm>
               </au>
               <au>
                  <snm>Wahlstrom</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Karpen</snm>
                  <fnm>GH</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>182</fpage>
            <lpage>194</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">420369</pubid>
                  <pubid idtype="pmpid" link="fulltext">12566396</pubid>
                  <pubid idtype="doi">10.1101/gr.681703</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Molecular characterization of the <it>Anopheles gambiae </it>2L telomeric region via an integrated transgene.</p>
            </title>
            <aug>
               <au>
                  <snm>Biessmann</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Donath</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Walter</snm>
                  <fnm>MF</fnm>
               </au>
            </aug>
            <source>Insect Mol Biol</source>
            <pubdate>1996</pubdate>
            <volume>5</volume>
            <fpage>11</fpage>
            <lpage>20</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8630530</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>DNA organization and length polymorphism at the 2L telomeric region of <it>Anopheles gambiae</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Biessmann</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kobeski</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Walter</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Kasravi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Roth</snm>
                  <fnm>CW</fnm>
               </au>
            </aug>
            <source>Insect Mol Biol</source>
            <pubdate>1998</pubdate>
            <volume>7</volume>
            <fpage>83</fpage>
            <lpage>93</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2583.1998.71054.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">9459432</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Genetic and bioinformatic analysis of 41C and the 2R heterochromatin of <it>Drosophila melanogaster</it>: a window on the heterochromatin-euchromatin junction.</p>
            </title>
            <aug>
               <au>
                  <snm>Myster</snm>
                  <fnm>SH</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Cavallo</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Christian</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Bhotika</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Anderson</snm>
                  <fnm>CT</fnm>
               </au>
               <au>
                  <snm>Peifer</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2004</pubdate>
            <volume>166</volume>
            <fpage>807</fpage>
            <lpage>822</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1470754</pubid>
                  <pubid idtype="pmpid" link="fulltext">15020470</pubid>
                  <pubid idtype="doi">10.1534/genetics.166.2.807</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Finishing a whole-genome shotgun: release 3 of the <it>Drosophila melanogaster </it>euchromatic genome sequence.</p>
            </title>
            <aug>
               <au>
                  <snm>Celniker</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Wheeler</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Kronmiller</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Carlson</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Halpern</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Patel</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Adams</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Champe</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dugan</snm>
                  <fnm>SP</fnm>
               </au>
               <au>
                  <snm>Frise</snm>
                  <fnm>E</fnm>
               </au>
               <etal/>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>research0079.1</fpage>
            <lpage>0079.14</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1186/gb-2002-3-12-research0079</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Comparative genomic analysis in the region of a major <it>Plasmodium</it>-refractoriness locus of <it>Anopheles gambiae</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Thomasova</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ton</snm>
                  <fnm>LQ</fnm>
               </au>
               <au>
                  <snm>Copley</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>Zdobnov</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Hong</snm>
                  <fnm>YS</fnm>
               </au>
               <au>
                  <snm>Sim</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Kafatos</snm>
                  <fnm>FC</fnm>
               </au>
               <au>
                  <snm>Collins</snm>
                  <fnm>FH</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>8179</fpage>
            <lpage>8184</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">123041</pubid>
                  <pubid idtype="pmpid" link="fulltext">12060762</pubid>
                  <pubid idtype="doi">10.1073/pnas.082235599</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Ensembl v20.2b.1 (1 April 2004)</p>
            </title>
            <url>http://ensembl.lcb.uu.se:8080/Anopheles_gambiae/whatsnew/v20_2b_1.html</url>
         </bibl>
         <bibl id="B41">
            <title>
               <p>'VectorBase' Database (Ensembl release v37.3.1)</p>
            </title>
            <url>http://www.vectorbase.org</url>
         </bibl>
         <bibl id="B42">
            <title>
               <p>National Human Genome Research Institute: NIH News Release</p>
            </title>
            <url>http://www.genome.gov/15014493</url>
         </bibl>
         <bibl id="B43">
            <aug>
               <au>
                  <snm>Sambrook</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Fritsch</snm>
                  <fnm>EF</fnm>
               </au>
               <au>
                  <snm>Maniatis</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Molecular Cloning: A Laboratory Manual</source>
            <publisher>New York: Cold Spring Harbor Laboratory Press</publisher>
            <edition>2</edition>
            <pubdate>1989</pubdate>
         </bibl>
         <bibl id="B44">
            <title>
               <p>A technique for nucleic acid <it>in situ </it>hybridization to polytene chromosomes of mosquitoes in the <it>Anopheles gambiae </it>complex.</p>
            </title>
            <aug>
               <au>
                  <snm>Kumar</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Collins</snm>
                  <fnm>FH</fnm>
               </au>
            </aug>
            <source>Insect Mol Biol</source>
            <pubdate>1994</pubdate>
            <volume>3</volume>
            <fpage>41</fpage>
            <lpage>47</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8069415</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Inversions and gene order shuffling in <it>Anopheles gambiae </it>and <it>A. funestus</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Sharakhov</snm>
                  <fnm>IV</fnm>
               </au>
               <au>
                  <snm>Serazin</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Grushko</snm>
                  <fnm>OG</fnm>
               </au>
               <au>
                  <snm>Dana</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lobo</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Hillenmeyer</snm>
                  <fnm>ME</fnm>
               </au>
               <au>
                  <snm>Westerman</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Romero-Severson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Costantini</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Sagnon</snm>
                  <fnm>N</fnm>
               </au>
               <etal/>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>298</volume>
            <fpage>182</fpage>
            <lpage>185</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1076803</pubid>
                  <pubid idtype="pmpid" link="fulltext">12364797</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Tandem Repeats Finder</p>
            </title>
            <url>http://tandem.bu.edu/trf/trf.html</url>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Institute for Systems Biology: RepeatMasker</p>
            </title>
            <url>http://www.repeatmasker.org</url>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Berkeley Drosophila Genome Project</p>
            </title>
            <url>http://www.fruitfly.org</url>
         </bibl>
         <bibl id="B49">
            <title>
               <p>A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis.</p>
            </title>
            <aug>
               <au>
                  <snm>Sonnhammer</snm>
                  <fnm>ELL</fnm>
               </au>
               <au>
                  <snm>Durbin</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Gene</source>
            <pubdate>1995</pubdate>
            <volume>167</volume>
            <fpage>GC1</fpage>
            <lpage>GC10</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0378-1119(95)00714-8</pubid>
                  <pubid idtype="pmpid">8566757</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
