As before, full run throughput in gigabases (billion bases) is plotted against single-end read length for the different sequencing platforms, both on a log scale:
A new visualisation
Inspired by GapMinderWorld, a fascinating interactive visualisation of demographic data, and using their recommendations, I created an interactive ‘Motion Chart’ version of this visualisation on a Google spreadsheet. The chart lets you track the metrics through the years. Here is the final graph after running through all the data:
The X and Y axes are the same as for the first figure; the size of each data point scales with the number of reads per run.
You can explore the data interactively yourself by clicking on the graph.
Unfortunately, I didn’t find a way to change the default chart settings, so in order to have the best experience, make sure to adjust the chart as depicted below:
Note that the log scaling does not work as well as in the static picture at the top of this post. This is the reason I did not include the data point for Sanger sequencing. If someone can make this graph work in, say, Python, I’d be happy to include your results!
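For anyone who wants to try this in Python, here is a minimal sketch of a static bubble chart on log-log axes. It is not a reproduction of the Motion Chart: it draws a single snapshot, uses only a handful of rows copied from the raw data table at the end of this post, and assumes matplotlib is installed. The names `ROWS`, `bubble_area` and `plot_bubbles`, the output file name and the bubble-size formula are all made up for illustration.

```python
import math

# (platform, instrument, year, reads per run, read length in bp, gigabases per run)
# A few rows copied from the raw data table at the end of this post.
ROWS = [
    ("454", "GS FLX Titanium", 2009, 1_000_000, 500, 0.45),
    ("IonTorrent", "PGM 318 chip V2", 2013, 5_000_000, 400, 2.0),
    ("Illumina", "HiSeq 2500 RR", 2012, 600_000_000, 150, 180.0),
    ("PacBio", "RS II P5 C3", 2014, 44_000, 8500, 4.5),
]


def bubble_area(reads, min_area=20.0, scale=40.0):
    """Marker area that grows with log10(reads).

    Illustrative choice only: read counts span several orders of
    magnitude, so a linear mapping would make small runs invisible.
    """
    return min_area + scale * math.log10(reads)


def plot_bubbles(rows, path="throughput_vs_read_length.png"):
    """Draw throughput against read length, both on a log scale."""
    # Imported here so the data helpers above work without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for platform, instrument, year, reads, length, gigabases in rows:
        ax.scatter(length, gigabases, s=bubble_area(reads), alpha=0.6,
                   label=f"{platform} {instrument} ({year})")
    ax.set_xscale("log")
    ax.set_yscale("log")
    ax.set_xlabel("Read length (bp)")
    ax.set_ylabel("Gigabases per run")
    ax.legend(fontsize="small")
    fig.savefig(path, dpi=150)
    return path
```

Calling `plot_bubbles(ROWS)` writes a PNG to the given path. Animating the chart over the years, as the Motion Chart does, would need one such frame per year and is left out of this sketch.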
Notable changes from the October 2013 edition
- I use numbers for the full run output. It was pointed out to me that this was not the case for the PacBio data, where I had so far used the metrics for single SMRTCells (‘chips’) only. I have now chosen to report metrics for 12 SMRTCells as a full PacBio run, a compromise between the 8, 12 or 16 SMRTCells per run we have worked with.
- PacBio also upgraded to P5-C3 metrics
- the Roche/454 GS Junior upgraded the read length to 700 bp (‘GS Junior+’)
- the Illumina HiSeq2500 ‘1TB’ upgrade (2 × 125 bp read length and 4 billion reads per run)
- I added the Illumina NextSeq 500 and HiSeq X (I chose the output for 1 instrument, even though one has to buy at least 10 of them)
- note how close together the data points fall for the GAII, HiSeq ‘Rapid Run’ mode and NextSeq 500.
- as mentioned in the original blog post: some data was obtained by going to previous versions of company websites through the Internet Archive
- I used full single-run specs with maximally stated throughput as available at the time of writing
- sometimes, the total numbers of reads per full run and total bases obtained do not match up; for the figure, I always chose the reported throughput in bases
- for Illumina, I chose to use the single-end read length, although the maximum throughput was based on the sum of all reads from a paired end run; I felt it unfair to double the read length for this platform for the figure
- no changes for the 454 GS FLX+, Illumina GAII, HiSeq ‘Rapid Run’ mode, SOLiD, Ion Torrent PGM and Proton
- Oxford Nanopore’s MinION was not added as the instrument is not yet fully commercially available; it is still in the early access phase (MinION Access Program)
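The throughput bookkeeping described in the notes above can be sketched in a few lines of Python. This is only an illustration: the function name `gigabases` and its `multiplier` argument are hypothetical, and it assumes that the PacBio read count in the raw data table is per SMRTCell and that the Illumina read count is per read of a pair. Under those assumptions the arithmetic lands on, or close to, the reported gigabases.

```python
def gigabases(reads, read_length_bp, multiplier=1):
    """Total bases for a run, in gigabases (billion bases).

    multiplier covers cases where a full run is several units of
    `reads`: 12 SMRTCells for PacBio, or both reads of an Illumina
    paired-end run.
    """
    return reads * read_length_bp * multiplier / 1e9


# PacBio RS II P5 C3: 44,000 reads of 8,500 bp, assumed per SMRTCell,
# times 12 SMRTCells for a full run (table reports 4.500).
pacbio = gigabases(44_000, 8_500, multiplier=12)

# Illumina HiSeq 2500 'Rapid Run': 600 million reads of 150 bp,
# doubled for the paired end (table reports 180).
hiseq = gigabases(600_000_000, 150, multiplier=2)

print(round(pacbio, 3))  # 4.488
print(round(hiseq, 3))   # 180.0
```

The small gap between 4.488 and the reported 4.500 is an example of read counts and total bases not matching up exactly, which is why the figure always uses the reported throughput in bases.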
Data and figures are released under a CC0 license at figshare, with doi 10.6084/m9.figshare.100940. I’ve also added the content to GitHub at https://github.com/lexnederbragt/developments-in-next-generation-sequencing.
As before: although I took utmost care in collecting the data, I may have gotten some of my numbers completely wrong, for which I apologise in advance; please help me correct any mistakes or omissions by leaving a comment or sending me a pull request.
Finally, the raw data
| Platform | Instrument | Year | Reads per run | Read length (bp) | Gigabases per run | Source |
|---|---|---|---|---|---|---|
| 454 | GS FLX Titanium | 2009 | 1000000 | 500 | 0.45 | |
| IonTorrent | PGM 314 chip | 2011 | 100000 | 100 | 0.01 | 3 |
| IonTorrent | PGM 316 chip | 2011 | 1000000 | 100 | 0.1 | 3 |
| IonTorrent | PGM 318 chip | 2011 | 5000000 | 100 | 0.5 | 3 |
| IonTorrent | PGM 318 chip | 2012 | 5000000 | 200 | 1 | 3 |
| IonTorrent | PGM 318 chip V2 | 2013 | 5000000 | 400 | 2 | 12 |
| Illumina | HiSeq 2500 RR | 2012 | 600000000 | 150 | 180 | 13 |
| PacBio | RS C2 XL | 2012 | 36000 | 4300 | 1.858 | |
| PacBio | RS II C2 XL | 2013 | 47000 | 4600 | 2.594 | 15 |
| PacBio | RS II P5 C3 | 2014 | 44000 | 8500 | 4.500 | 15 |
Read length: mode or average.
Sources: see this file from the GitHub repo.