There are three types of reads, one per library. You can tell the library type from the prefix of the file name:

  1. Short plasmids (2kb): golw.*
  2. Short plasmids (8kb): golws.*
  3. Fosmids (40kb): golwf.*
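
The prefix rule above can be sketched in Python as follows. This is only an illustration: the function name `library_of` and the example file names are hypothetical, and the sketch assumes the library prefix is everything before the first dot of the file name.

```python
# Library prefixes and descriptions copied from the list above.
LIBRARY_BY_PREFIX = {
    "golw": "Short plasmids (2kb)",
    "golws": "Short plasmids (8kb)",
    "golwf": "Fosmids (40kb)",
}


def library_of(filename):
    """Return the library description for a read file name, or None if unknown."""
    # Compare the whole text before the first dot, so that "golw" is not
    # mistaken for the start of "golws.*" or "golwf.*".
    prefix = filename.split(".", 1)[0]
    return LIBRARY_BY_PREFIX.get(prefix)


if __name__ == "__main__":
    # Hypothetical file names built from the prefixes and suffixes in this README.
    for name in ("golw.fasta.screen.gz", "golwf.fasta.nomask.gz"):
        print(name, "->", library_of(name))
```

Matching on the exact prefix rather than with `startswith` matters here because all three prefixes begin with "golw".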

Note that BAC ends are not available here (see the download site at the medaka genome browser).

File types

There are five file types. You can tell the file type from the suffix of the file name. These files were generated by the phredPhrap pipeline. Note that we did not use the reads masked by CrossMatch.

  1. Raw reads: *.fasta.nomask.gz (output by phred)
  2. Raw reads (masked by CrossMatch): *.fasta.screen.gz
  3. Quality values of the raw reads above (masked by CrossMatch): *.fasta.screen.qual.gz
  4. High quality reads (masked by CrossMatch, filtered by phredPhrap): *.hq.fasta.screen.gz
  5. Quality values of the high quality reads above (masked by CrossMatch, filtered by phredPhrap): *.hq.fasta.screen.qual.gz
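
The suffix rule above can be sketched in Python as follows. The function name `file_type_of` and the short descriptions are hypothetical; only the suffixes themselves come from the list above. Because `.fasta.screen.gz` is also the tail of `.hq.fasta.screen.gz`, the sketch tries the longer suffixes first.

```python
# Suffixes copied from the list above, ordered longest (most specific) first.
FILE_TYPE_BY_SUFFIX = [
    (".hq.fasta.screen.qual.gz", "quality values of high quality reads"),
    (".hq.fasta.screen.gz", "high quality reads"),
    (".fasta.screen.qual.gz", "quality values of masked raw reads"),
    (".fasta.screen.gz", "masked raw reads"),
    (".fasta.nomask.gz", "raw reads"),
]


def file_type_of(filename):
    """Return a short description of the file type, or None if unknown."""
    for suffix, description in FILE_TYPE_BY_SUFFIX:
        if filename.endswith(suffix):
            return description
    return None


if __name__ == "__main__":
    # Hypothetical file names built from the prefixes and suffixes in this README.
    for name in ("golw.hq.fasta.screen.gz", "golw.fasta.screen.gz"):
        print(name, "->", file_type_of(name))
```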

Alternatively, you can download an XML file that contains all the reads used in the final assembly. This file was used directly as input to the final assembly, so it should not be missing any reads (it is provided in case you suspect that some reads are missing).

  1. golw_build4_ver5.allreads.xml.gz

MD5/SHA256 checksums of the files are also provided (see *sum.txt below).
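
A downloaded file can be checked against these checksums with a short Python script. The sketch below is not part of the distribution and rests on one assumption: that each line of a *sum.txt file has the layout written by the `md5sum` and `sha256sum` tools, that is, a hex digest, whitespace, then the file name. The function names and the file name `sha256sum.txt` in the usage lines are hypothetical.

```python
import hashlib
import os


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so that large read files need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def parse_sum_line(line):
    """Split one checksum line, assumed to look like '<hex digest>  <file name>'."""
    digest, name = line.strip().split(None, 1)
    # md5sum/sha256sum put '*' before the name when run in binary mode.
    return digest.lower(), name.lstrip("*")


def verify(sum_file, algorithm):
    """Return {file name: True/False} for every entry listed in sum_file."""
    directory = os.path.dirname(os.path.abspath(sum_file))
    results = {}
    with open(sum_file) as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            expected, name = parse_sum_line(line)
            actual = file_digest(os.path.join(directory, name), algorithm)
            results[name] = (actual == expected)
    return results


if __name__ == "__main__":
    # Hypothetical checksum file name; use "md5" for an MD5 list instead.
    for name, ok in verify("sha256sum.txt", "sha256").items():
        print("OK     " if ok else "FAILED ", name)
```

The sketch looks for each listed file in the same directory as the checksum file, so it expects the downloads and the *sum.txt file to sit side by side.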