This post originally appeared on the Software Carpentry blog.
I am a biologist with no formal training in Computational Science. A couple of years ago, the increasing size of my data forced me to stop using Excel and switch to the Unix command line and scripting for my analyses. In the process, I learned what I needed mostly by just doing it, but also from books, websites, and of course Google.
Almost exactly one year ago, I attended a Software Carpentry bootcamp. I had heard about Software Carpentry and its bootcamps through Twitter, started following their blog and became convinced that this was something I wanted to attend. At some point, I fired off an email to Software Carpentry asking what it would take to have a bootcamp at our university, the University of Oslo in Norway. The answer came down to ‘get us a room and advertise the event, and we’ll provide teachers’. This is, in fact, what happened, and as teachers we got Mr Software Carpentry himself, Greg Wilson, who taught together with a local teacher (Hans Petter Langtangen).
This first bootcamp in Oslo (and in Norway) taught me a lot, and has in fact very much changed the way I do my work. These changes came gradually over the last year, and I have not finished adjusting yet.
The first immediate effect was that I permanently switched from Perl to Python as my scripting language. This process had already started, but using Python at the bootcamp convinced me that it is a better language for beginners. In the end, it doesn’t matter what language you use. But I now feel that for people who need to learn a scripting language for the first, and probably only, time, Python is much easier to grasp and makes for code that is more understandable.
Secondly, I now find myself automating my work on the command line. No more copying and pasting a few lines of code, changing a few small things, running them again and discovering (or not…) that I forgot to change one of the ‘1’s into a ‘2’. Small scripts and for-loops to the rescue!
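A minimal sketch of what such a loop can look like in Python. The sample names and the `echo` placeholder command are made up for illustration and are not taken from the analyses described in this post; the point is only that the part that varies lives in one list instead of in several hand-edited copies.

```python
# Hypothetical sample names; in real use these might come from a file listing.
samples = ["sample1", "sample2", "sample3"]


def build_command(sample):
    """Return the command for one sample as a list of arguments.

    'echo' is a stand-in for whatever analysis tool would actually be run.
    """
    return ["echo", f"processing {sample}.fastq -> {sample}.out"]


# Build every command from the same template, so no '1' is left unchanged by hand.
commands = [build_command(sample) for sample in samples]

for command in commands:
    print(" ".join(command))
```

Each argument list could also be handed to `subprocess.run` to execute the command instead of printing it.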
Also, I am (finally?) using version control, Git to be exact. It turned out our university offers Git repositories to its employees, both private and public repos, and repos for teams of researchers. No excuse not to use it! I am also putting the course work I develop on GitHub: have a look at the latest instalment of my two-day de novo genome assembly module. I haven’t begun to master branching yet, but that doesn’t matter: Git is already extremely useful.
Even though I am not writing huge pieces of code, I still find it very useful to include a few unit tests in my scripts: a set of expected outcomes based on some input, plus the obvious cases such as ‘empty’ input, input that shouldn’t give any result, etc. I have already experienced making a small code adjustment that worked fine, but broke an essential test. Without the test, I would not have discovered this until much later.
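As an illustration of that kind of test, here is a small sketch in Python. The function `gc_content` and its tests are hypothetical examples, not code from my actual scripts; they show one expected outcome, the ‘empty’ input, and an input that should give no result.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence.

    Empty input returns 0.0 rather than raising a division error.
    """
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    gc_count = sum(1 for base in sequence if base in "GC")
    return gc_count / len(sequence)


def test_known_sequence():
    # Two of the four bases are G or C.
    assert gc_content("ATGC") == 0.5


def test_empty_input():
    assert gc_content("") == 0.0


def test_no_gc_bases():
    # Input that should not give any G or C count.
    assert gc_content("ATAT") == 0.0


if __name__ == "__main__":
    test_known_sequence()
    test_empty_input()
    test_no_gc_bases()
    print("all tests passed")
```

Functions named `test_...` like these can also be collected and run automatically by a test runner such as pytest.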
I am also using the IPython Notebook more and more. I love the interactive coding, the easy data exploration and visualisation, and the possibility of making nice-looking, shareable documents. As an example, I recently shared code I developed for a blog post in an IPython Notebook as a GitHub gist, and as a formatted Notebook.
Last but not least, I became a Software Carpentry instructor. I first followed the online training program. Then, together with my colleague Karin Lagesen, I held the second Oslo Software Carpentry bootcamp last July. To me, it was fantastic to be able to share my newfound knowledge with researchers from different fields, and to see the same ‘aha’-moment looks in their eyes during the bootcamp.
Today there are many more self-taught bioinformaticians like me in our group, and in other research groups around Norway. In my experience, the Software Carpentry bootcamp is a perfect fit for people like us, and I am looking forward to inspiring coworkers and other researchers to apply the Software Carpentry principles.