For three platforms, reads that are longer than those commercially available, and/or that come from not-yet-released instruments, have become accessible online. By online, I mean that we can all download these data and have a look:
1) MiSeq 2 x 150 base runs
As part of the German E. coli (EHEC) ‘Crowdsourcing Project’, Illumina sequenced five strains for the UK Health Protection Agency; the fastq files can be downloaded from http://www.hpa-bioinformatics.org.uk/lgp/genomes. These are the first data in the public domain from a MiSeq!
See also this post on GenomeWeb.
2) IonTorrent 316 chip
Keith Robison shares a bit of info on data from an Ion 316 chip on his ‘Omics! Omics!’ blog: “1.69M reads, with 1.53M of those >=50 bp long and 1.07M 100bp or longer”.
I downloaded the run files and quickly looked at the read length distribution of the trimmed reads in the sff file (which listed 260 flows, 40 more than the file I analyzed in my previous post). The distribution peaks at 109 bases, exactly one base longer than before. So, many more reads but not much gain in length (yet). Note the strange shape of the peak.
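For anyone who wants to do a similar quick check, here is a minimal Python sketch of a read length distribution. It is not the method I used for the plot; it assumes the trimmed reads have already been converted from sff to a plain FASTQ file with exactly four lines per record (no wrapped sequences), and the function names `read_length_histogram` and `peak_length` are made up for this illustration.

```python
from collections import Counter
import io


def read_length_histogram(handle):
    """Count read lengths in a FASTQ stream.

    Assumption: every record is exactly four lines
    (header, sequence, '+', qualities), i.e. no line wrapping.
    """
    lengths = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # second line of each record is the sequence
            lengths[len(line.strip())] += 1
    return lengths


def peak_length(hist):
    """Return the most frequent read length in the histogram."""
    return max(hist, key=hist.get)


# Tiny in-memory example standing in for a real file.
demo = "@r1\nACGT\n+\nIIII\n@r2\nACGTAC\n+\nIIIIII\n@r3\nACGT\n+\nIIII\n"
hist = read_length_histogram(io.StringIO(demo))
print(sorted(hist.items()))  # [(4, 2), (6, 1)]
print(peak_length(hist))     # 4
```

For a real file, pass an open file handle instead of the `io.StringIO` object, e.g. `with open("reads.fastq") as fh: hist = read_length_histogram(fh)`, where `reads.fastq` is a placeholder name.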
3) 454 GS FLX+
As part of the Assemblathon 2 (a de novo assembly competition), the first GS FLX+ reads (from a parrot) have been released, with a peak read length of around 736 bases: http://bioshare.bioinformatics.ucdavis.edu/Data/hcbxz0i7kg/Parrot/. These are now at Sanger read length!
Now I need to find the time to have a look at these data!