The Roche/454 Life Sciences GS FLX and GS Junior: an obituary

You were a pioneer, the first successful ‘next generation’ (if you’ll pardon the term) commercially available sequencing platform in 2005. You just beat Solexa, but it was a fairly close call.

The author (left) with colleagues showing off their 454 GS FLX

Your greatest accomplishment was to show that pyrosequencing, which had already been around for a while, could be scaled up, both in terms of read length and parallelisation. You started the revolution in DNA sequencing, suddenly making large-scale genomic projects available to labs that traditionally could only dream of doing projects at this scale.

And then you were swallowed up by the giant, Roche Diagnostics. Opinions will always be divided on whether that was a good thing for you. Some say Roche, with its large sales network, enabled a much more rapid spread of your technology into research labs all over the world. Others will blame Roche for ultimately stifling innovation and progress.

As with any new (sequencing) technology, you had a challenge making sure researchers would be able to use the data they were getting. Luckily, you employed a bunch of smart bioinformaticians and programmers, leading to the newbler (gsAssembler, gsMapper) applications. I have been, and still am, a fan of this software. I have shared my enthusiasm by writing the ‘user-perspective manual’ for newbler (gsAssembler). I have unsuccessfully tried to make your owner open up this software, asking you, on behalf of more than 200 users, to make it open source. Your superiors unfortunately refused. I will never give up hope that this can still become part of your legacy.

You grew fast. First, you were only able to give us 100 nucleotide long reads – much shorter than what we were used to from ‘Sanger’ sequencing, but significantly longer than your competitor, with their initial 25 or 35 nucleotide reads. It wasn’t a very wise decision to have the expected output (20 million basepairs per run) as part of the name of your first instrument (GS20, Genome Sequencer 20), as you could have known your output would grow. The name of the updated instrument that came in 2007, GS FLX, was much better. Not only was it flexible in terms of run setup (2, 4, 8 or 16 lanes), it also allowed you to stick to the name once new chemistry and software updates pushed read length and output. Now you could do 250 bp and ultimately 450 Mbp per run. Later you managed to increase to 500 bp with Titanium chemistry, and even to around Sanger lengths, up to 1 kb, with GS FLX+ – even though this meant a slight modification to the instrument to enable larger supplies to be used.

You got a little brother, aptly named the GS Junior. It arrived at the time when ‘desktop’ instruments became the new thing for all platforms, also enabling smaller labs to purchase a sequencing instrument.

You enabled a lot of science. According to your website, there are 3082 peer-reviewed publications enabled by 454 Sequencing technology – but the most recent one listed there is from 2013, so the real number is probably quite a bit higher. Some scientific highlights are:

  • the study presenting your technology in 2005
  • the presence of ammonia-oxidizing archaea (AOA) in soil in 2006
  • microbial diversity in the deep sea in 2006
  • the obesity-associated gut microbiome in mice in 2006
  • the diploid genome of James D. Watson in 2008
  • the sequencing of the Zebrafish antibody repertoire in 2009
  • a Neandertal Genome (also involving data from your competitor) in 2010

I sometimes think you did not really reach your full potential. Why not miniaturise even more, aim for even longer reads, make a new instrument that had ten, or a hundred times more output? We’ll never know, now, what your R&D crew were secretly working on. Your competitors surely did improve and increase throughput. Notably, one of your founders even started from scratch, reportedly aiming to ‘do everything right’ that you didn’t – at least in their opinion. They managed partially, but due to strong competition from the giant of all giants in the sequencing world, are now pushed back into the niche market of targeted, gene-panel based projects.

You forced me to become a bioinformatician. Once I started to work with your data, they no longer fitted into Excel, and I had to start using the command line and scripting. I am very glad this happened; it was a rewarding transition for me and made my research life much more interesting. At the University of Oslo, you laid the foundation of what later became the Norwegian Sequencing Centre, a national sequencing technology service platform, which is still in operation today. We have always had a research-centred approach, using the technology we offer our users ourselves to make sure we understand it inside and out. We used you to sequence several bacteria we were working on, and, an achievement I am still very proud of, to produce the first version of the Atlantic cod genome in 2011. That our lab was able to do this project was unthinkable before you and your competitors made sequencing much more accessible. We are still very grateful for all the help your owners have given us during the realisation of the project – we really did this together.

It was a sad day when it was announced you would be discontinued as a technology. I guess it is a natural thing for new technology to become outdated and replaced (the exception maybe being the Sanger sequencing technology, which is still around in the same form as when you were launched). Our instruments are still in that lab, as we cannot bear to think of dumping them – hopefully one of them can go to a museum.

I’ll end with a big thank you. Thank you for pushing the boundaries, for your collaborative spirit, for the well-organised user meetings, for enabling all this science, and for turning me into a bioinformatician.

May you rest in peace and never be forgotten.

 
P.S. Was the name of your mother company suggestive of the major sequencing error? “Here we read 4, eh 5, no really 4 A’s”

11 thoughts on “The Roche/454 Life Sciences GS FLX and GS Junior: an obituary”

  1. This obituary definitely summarizes lots of the things that 454 was and could have been as the first NGS tech. It brought back some memories since I was probably one of those enthusiasts like you that started using this new technology to perform many genome assemblies along with Sanger reads. Hacking the Sanger read headers to put them into newbler, using PCAP with newbler contigs and Sanger reads, mixing Sanger and 454 using Broad’s Arachne software were some of my many experiences of hybrid assembly development.

    I never thought that I was one of the pioneers of hybrid assembly, but that won me a place at the Sanger Institute as a postdoc, where I generated lots of genome assemblies and was considered one of the 454 and newbler experts.

    In fact, surfing the net in search of more people like me doing hybrid assemblies brought me to your blog. I learnt many tricks and felt reassured by your posts, where many of my hacks were the same as those published by you.

    Also, I was lucky since people at 454 like James Knight helped me a lot while trying to patch some of the newbler’s code. It is a shame that they don’t want to release it as open source. I still use it for Ion Torrent and tried it with some corrected PacBio reads and false Sanger reads from Illumina assembled contigs.

    So, I have to thank 454 and you as well, since both influenced my career. Some say that Roche let it die in order to develop nanopore sequencing as well. Some blame the emulsion PCR, the cost of reagents and the homopolymer errors. Now, we’ll never know.

    PS. Wasn’t the 454 name the number of tries it took to have the first successful run?

  2. Nice piece! Two 454 milestones I note are the 1st mass sequencing of Neanderthal DNA & the first time the mode of resistance of an in-development antibiotic was determined by sequencing a resistant strain (a paper which shamefully relegated this advance to a miserable little footnote)

  3. 2004: I was a technician working for four years in the Sanger Centre’s Pathogen Sequencing Unit when I got offered “a trip to America”. Apparently this company called “454” had a new sequencer that we were keen to bring to Hinxton.

    What happened next was a whirlwind. We resequenced S. suis and C. michiganensis within days – (despite the first ever run red-screening!) with few errors beyond homopolymers.

    Within weeks I was helping to run R&D. I had to learn aspects of molecular biology and bioinformatics I had never heard of. I was teaching people pyrosequencing. I was talking to people from all over the institute about how we could apply it to their projects – mouse genomes, cancer – whatever. I was flying around presenting data at conferences. As if a pre-PhD 25 year old could be a plenary speaker at AGBT.

    It is no overstatement that the 454 GS20 kick-started my career and it’ll always hold a special place with me. Despite that damn emulsion PCR.

  4. RIP 454. Thank you for the article, though. Nice memories. I worked at 454 for 11 years (2001 – 2012) where, among other things, I wrote all the User Manuals you used. Fun times… 🙂

      • Yeah, what about that 700+ page software manual we had at the end? I must say, though, that I was no longer alone by then: I led a team of 3 writers by that time. It’s nice to see that people are still finding and responding to your post. Cheers!

  5. 454 was (and is) hands down the best NGS technology for repeats and genomes. MinION is starting to get there (2nd half of 2017), but 454 was there 10 years prior and without the high error rate.

    • Thanks! However, with the new long reads and very good algorithms for using them despite their raw error rates, 454 would these days no longer be considered the best solution to resolve repeats and assemble (complex) genomes…

      • I agree with Lex—reads longer than repeat lengths are needed to really pin down assemblies. The 454 was great because of its relatively long reads and low error rates (aside from homopolymer errors), but any repeat longer than about 500 bases was a disaster. The very long reads of PacBio and MinION are much better for resolving repeat structures, though it helps to have low-error data like Illumina for getting the details right.
