Experiences with the first edition of “Introduction to Computational Modelling for the Biosciences”

Earlier this year, I wrote a post about the new first-semester bachelor course “Introduction to Computational Modelling for the Biosciences” at our institute. A quick summary:

  • from 2017, the Biosciences bachelor study program will incorporate Computing in Science Education (CSE) into the different subjects
  • a new course “Introduction to Computational Modelling for the Biosciences” will teach first-semester students Python programming and basic mathematical modeling
  • I am the lead organiser of the course (called BIOS1100), which started in fall 2017

Yesterday, I finished grading the exam, so it is about time for a recap: how did it go?

The basics

There were 200 students present at the first of the weekly lectures.

Me lecturing for 200 students for the first time

Each week, they had a four-hour datalab session in the new ‘bring-your-own-device’ teaching room, where 60 students sat in groups of six at hexagonal tables. Each table had a big screen that any student could connect to from their laptop. All screens could also show what was on the central screen, or on any of the other screens, which was really fantastic.

Teaching room with hexagonal tables and a big screen for each table

A typical datalab would start with formative assessment in the form of multiple choice questions (see ‘Peer instruction through quizzes’ in this blog post). This was often followed by ‘Computer Science Unplugged’-type activities: pen-and-paper exercises to introduce programming concepts. Then they would work on programming exercises.

The programming environment we used was Jupyter Notebooks, accessed through a JupyterHub instance built for this and similar courses at the university. Exercises, and their solutions the following week, were handed out as notebooks. Students were supposed to deliver one assignment, as a notebook, each week. I’m quite proud of the fact that we were able to deliver the course book chapters not only as PDF files, but also as Jupyter Notebooks with runnable code (the notebooks were stripped of code output, and students were encouraged to ‘run’ the notebook interactively while studying it). More on that another time…

Curriculum and book

The figure below gives a broad overview of the structure of the book, a structure that was followed during the course. The philosophy for the book, and the course, was not to teach lots of Python first and then show how it can be useful for solving biological problems, but to introduce the biological problems first, and the Python concepts needed to solve them right after that. This way, all the learning of Python happened in the context of the biology. The course book and materials will be published (Open Access!) in 2018.

Curriculum overview

Challenges

In my previous post I identified a number of plans and challenges I had in advance of the first course edition, and I will comment on them here.

Ground the material in biology

I wanted examples and exercises to be using data and questions that the students can relate to, given that they are studying biology/biosciences. This was implemented to a large extent, and I feel this worked quite well.

Build bridges with other courses

In preparation for the course, we discussed multiple ways to build a bridge with the Cell- and Molecular Biology course the students were taking the same semester. Students used Python to plot some of their laboratory measurements, calculated resolutions for different microscope lenses, and digested in silico, with Python, the same DNA that they were going to digest with restriction enzymes in the molecular biology lab. Students responded positively to the exercises being relevant to their field of study, as well as to seeing material used in both courses.
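To give an impression of what an in silico restriction digest can look like, here is a minimal sketch in Python. It is not the actual course exercise: the function name `digest`, its parameters and the example sequence are made up for illustration. It only handles a linear sequence on one strand and assumes that recognition sites do not overlap. The enzyme shown, EcoRI, recognises GAATTC and cuts after the first G.

```python
def digest(sequence, site, cut_offset):
    """Cut a linear DNA sequence at every occurrence of a recognition site.

    cut_offset is the number of bases into the site where the enzyme cuts
    (hypothetical parameter name, chosen for this sketch).
    """
    sequence = sequence.upper()
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(sequence[start:cut])   # fragment up to the cut
        start = cut
        position = sequence.find(site, position + 1)
    fragments.append(sequence[start:])          # remainder after the last cut
    return fragments


# EcoRI recognises GAATTC and cuts after the G (G^AATTC)
dna = "ATGGAATTCCGTAGAATTCTT"   # made-up example sequence
fragments = digest(dna, "GAATTC", 1)
print(fragments)                     # ['ATGG', 'AATTCCGTAG', 'AATTCTT']
print([len(f) for f in fragments])   # [4, 10, 7]
```

The fragment lengths are what one would compare with the bands seen on a gel after the real digest in the lab.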

Use methods and create an environment that enhance learning

I am generally very happy with the choice of the Jupyter Notebook and the course room. There were no software installation issues, and having an identical online environment (through JupyterHub) for all students, to which I could easily push new material, made organising the course much easier. There were some technical difficulties with the JupyterHub implementation – we were the first course to use it and ran into some unfortunate downtime, leading to understandable student frustration.

I had intended to use a ‘flipped classroom’ model, but did not manage to find a good way to make sure students had studied the relevant material from the book in advance.

Build a community of learning assistants

I had 19 learning assistants, both master’s and PhD students. Recruiting them happened rather late, which prevented me from having more than one meeting with them to prepare them for the course. But they were an enthusiastic bunch, and worked hard with the students. Much of the learning that happened was thanks to them. I struggled to really build a community, though: not many showed up at the (voluntary) weekly meeting.

Challenges I had identified in advance

  • the motivational aspect: these students chose to study biology, not programming. Although I have not done a careful analysis, talking to students told me many saw the relevance of the course material for their studies in general.
  • one goal for the study program is to continue learning and implementing CSE in the students’ next semesters. So far, in their second semester, the students will meet Python again in two of the three courses they take. In the third semester, they switch to R in the statistics and evolutionary biology courses.

Challenges that became apparent during the course

Exercises

A big challenge with any (totally) new course is that much, or all, of the material needs to be prepared for the first time. A lot of this work was done before the course started, but unfortunately, not many exercises were ready beforehand. This led to me spending most of my time during the semester preparing exercise notebooks. This situation had several drawbacks: students and assistants did not have access to next week’s exercise material until the first day it was going to be used in class, and I could not spend much time interacting with the students during their datalabs. It was also quite stressful.

Teaching programming successfully to new students

Around two-thirds into the course it became clear that a significant fraction of students felt they did not understand much, or anything, of the Python programming that was taught. The book was too difficult for them, many exercises were too challenging, and the format simply did not work for them. For the more mature students (who had studied elsewhere before) or those more or less familiar with programming, the format worked very well, but not for the rest. I’ll admit that this came as a bit of a shock, and I felt somewhat terrible about it for a while. In reflecting on this, I then posted this tweet, which I’ll explain here:

When I first thought about how to organise the teaching for this course, I strongly believed that the Software Carpentry approach of live-coding would be the best way to do it – it is after all how I do most of my teaching. With live-coding, an instructor or teacher does programming in real time and students in the room follow along, performing the same programming themselves, interspersed with many hands-on exercises. But at some point I started to worry live-coding wouldn’t scale to the 60 students present in the datalab, let alone the 200 students during lecture. Consequently, I dropped live-coding as a possible approach to teaching Python for this course.

When I discovered that many students were not learning the material, I decided to switch gears and go back to the live-coding approach. I announced that we were going to skip the material of the last two chapters of the book, and replace the remaining datalabs with repeat-teaching of the programming concepts. I did this through live-coding with the students, and also (finally) found a set of hands-on exercises that were very good for practicing more. The last three weeks of teaching were done in this way, and around 60-70 students came to these sessions. Feedback from the students was very positive: many were relieved and thankful, and felt they finally understood what this programming business was about. Interestingly, a handful of students were disappointed that the last part of the course material was skipped and asked whether they could still get access to it. In the words of the study-administration: “students never ask for more material, but for this course they did…!”.

In conclusion, I can only repeat “I thought the Software Carpentry live-coding approach wouldn’t scale to a large undergrad Python course and that I could do without it. I was 2x wrong”.

Exam results

So how did the students do? 180 students were allowed to take the exam, and 169 did. 80% passed the exam, with an average grade of C (scale A-F). From looking through the results, I saw that a lot of learning had happened, although quite a few misunderstandings remained. I decided that these results are not too bad for a first edition, and I’m actually quite happy with them.

Next time

I was recently asked what the biggest changes are that I want to make for the next time the course is taught (fall semester of 2018). Here was my answer:

  • add another course responsible: courses are supposed to have two people in charge, and for this course, so far it has only been me. This is partly due to a lack, at least currently, of permanent staff at our institute with enough Python programming competence. Nonetheless, I will need a ‘buddy’, if only to enable me to get sick in the middle of the semester. I also hope this person can help strengthen the mathematics aspect of the course, which did not receive enough attention this first time
  • use live-coding from the beginning: I am considering dropping the lectures (which I did not manage to find a good format for anyway) and starting datalabs during the first weeks with live-coding, perhaps even for the entire course. It can’t just be me doing all the live-coding, so I’ll have to recruit and train learning assistants to take on part of this work too
  • have exercises ready before the semester starts, implying quite a bit of work needs to be done during next spring semester

All in all, organising this course has been one of the biggest and most demanding projects I have ever undertaken, but I don’t regret accepting the challenge for one second. The current generation of students will meet big datasets and complex analysis and modelling (Machine Learning! Artificial Intelligence!) whatever they end up doing in their professional lives. I am very proud to be a part of a (larger) effort to make sure they will be prepared for this once they graduate from our university.


A video introduction to instructing by means of live coding

As part of my training to become an instructor-trainer for Software and Data Carpentry, I want to help further develop the material used during instructor training workshops. Greg Wilson, who heads the instructor training, and I decided to make some videos to demonstrate good and not-so-good practices when teaching workshops. Greg recently released his “example of bad teaching” video focussing on general teaching techniques.

For my contribution, I wanted to demonstrate as many aspects as I could of what I wrote in my “10 tips and tricks for instructing and teaching by means of live coding” post.

So here was the plan:

  • make two 2-3 minute videos with contrasting ways of doing a live coding session
  • one demonstrates as many ways as possible how to not do this
  • one uses as many good practices as possible
  • during the instructor-training workshop, participants are asked (in small groups) to discuss the differences and their relevance.

With help from colleague Tore Oldeide Elgvin (the cameraman) and local UiO Carpentry organisers Anne Fouilloux and Katie Dean (playing the role of learners), we recorded the videos. It took about two hours and a dozen attempts, but it was fun to do. Amazing how difficult it is to not do your best while teaching…

Here are the videos – watch them before you read on about what they were supposed to show. Note that (part of) the unix shell ‘for-loop’ lesson is what is being taught. It is assumed the instructor has already explained shell variables (when to use the ‘$’ in front and when not).

Many thanks to Tore, Anne and Katie for helping out making these videos!

Part 1:

Part 2:

Part 1:

  • instructor ignores a red sticky clearly visible on a learner’s laptop
  • instructor is sitting, mostly looking at the laptop screen
  • instructor is typing commands without saying them out loud
  • instructor uses fancy bash prompt
  • instructor uses small font in not full-screen terminal window with black background
  • the terminal window bottom is partially blocked by the learners’ heads for those sitting in the back
  • instructor receives a pop-up notification in the middle of the session
  • instructor makes a mistake (a typo) but simply fixes it without pointing it out, and redoes the command

Part 2:

  • instructor checks if the learner with the red sticky on her laptop still needs attention
  • instructor is standing while instructing, making eye-contact with participants
  • instructor is saying the commands out loud while typing them
  • instructor moves to the screen to point out details of commands or results
  • instructor simply uses ‘$ ‘ as bash prompt
  • instructor uses big font in wide-screen terminal window with white background
  • the terminal window bottom is above the learners’ heads for those sitting in the back
  • instructor makes a mistake (a typo) and uses the occasion to illustrate how to interpret error-messages

Carpentry week 2016 at the University of Oslo

On March 14-18, 2016, we organised the first Carpentry week at the University of Oslo. After a mini-Seminar on Open Data Skills, there was a Software Carpentry workshop, two Data Carpentry workshops and a workshop on Reproducible Science, as well as a ‘beta’ Library Carpentry workshop.

The Software and Data Carpentry effort at the University of Oslo, aka ‘Carpentry@UiO’, really started in 2012 when I invited Software Carpentry to give a workshop at the university. The then director, Greg Wilson, came himself and gave an inspirational workshop – recruiting Karin Lagesen and me to become workshop instructors in the process. Karin and I graduated from instructor training in spring 2013 and have since given a couple of workshops in Oslo and elsewhere.


On being an instructor for Software and Data Carpentry

I was recently asked to provide a testimonial on why I am an instructor for Software Carpentry and Data Carpentry. Here it is:

Teaching in general, and at Software and Data Carpentry workshops in particular, gives me great pleasure and is one of the most personally rewarding activities I engage in. With Software Carpentry, I feel I belong to a community that shares many of the same values I have: openness, tolerance, a focus on quality in teaching to name a few. The instructor training program is the best pedagogical program I know of, and it is amazing to see how Software and Data Carpentry are building a community of educators that are fully grounded in the research on educational practices.

Being an instructor is my way of making a small, but hopefully significant, contribution to improving science, and thus the world.

This testimonial can also be found here.

Notes from the “FEBS-IUBMB workshop on education in molecular life sciences”

I attended the “FEBS-IUBMB workshop on education in molecular life sciences”, 18-19 September 2015, in Oslo, Norway. Although ‘molecular life sciences’ is part of the workshop title, much of what was discussed was applicable to a much wider range of subjects.

At the workshop, I presented a poster based on my recent blog post on “Active learning strategies for bioinformatics teaching” (the first time I turned a blog post of mine into a poster…). The poster can be viewed on FigShare. I managed to make the poster a bit interactive itself, by having a small quiz on it. The results speak for themselves:

A quiz to make a poster on active learning techniques interactive



Active learning strategies for bioinformatics teaching

The more I read about how active learning techniques improve student learning, the more I am inclined to try out such techniques in my own teaching and training.

I attended the third week of Titus Brown’s “NGS Analysis Workshop”. This third week entailed, as one of the participants put it, ‘the bleeding edge of bioinformatics analysis taught by Software Carpentry instructors’ and was a unique opportunity to learn different analysis techniques, try out new instruction material, and experience different instructors and their way of teaching. On top of that, the group was just fantastic to hang out with, and we played a lot of volleyball.

I demonstrated some of my teaching and was asked by one of the students for references for the different active learning approaches I used. Rather than just emailing her, I decided to put these in this blog post.

The motivation for turning to active learning techniques is nicely summarised in a post on the ‘Communications of the ACM’ blog entitled “Be It Resolved: Teaching Statements Must Embrace Active Learning and Eschew Lecture”. I highly recommend reading it and checking out the references mentioned. I am by no means an expert in the area, and am simply learning by doing. I have no way to measure whether the techniques I use are beneficial, but student responses strongly encourage me to keep applying them. My teaching is also very much influenced by my being a Software Carpentry instructor.

The following describes what I do in the de novo genome assembly module of the ‘High Throughput Sequencing technologies and bioinformatics analysis’ course I organise (link to materials). I used part of that module for the NGS Analysis Workshop (link).


On the benefits of ‘open’ for teaching

Open source, open data, open course

We recently had the third instalment of the course in High Throughput Sequencing technologies and bioinformatics analysis. This course aims to provide students, as well as users of the organising service platforms, with basic skills to analyse their own sequencing data using existing tools. We teach both unix command line-based tools and the Galaxy web-based framework.

I coordinate the course, but also teach a two-day module on de novo genome assembly. I keep developing the material for this course, and am increasingly relying on material openly licensed by others. To me, it is fantastic that others are willing to openly share material they developed for others to (re)use. It was hugely inspiring to discover material such as the assembly exercise, and the IPython notebook to build small De Bruijn Graphs (see below). To me, this confirms that ‘opening up’ in science increases the value of material by many orders of magnitude. I am not saying that the course would have been impossible without having this material available, but I do feel the course has become much better because of it.

‘Open’ made this course possible

This course used:

  • openly available sequencing data released by the sequencing companies (although some of the Illumina reads are behind a – free – login account)
  • sequencing data made openly available by individual researchers
  • code developed for teaching made available by individual researchers under a permissive license
  • open source software programs

(for a full list of resources, see this document).

I am extremely grateful to the authors/providers of these resources, as they greatly benefitted this course!

Thanks to:

‘Opening up’ is the least I can do to pay back

In exchange, the very least I can do is to make my final course module openly available as well.

The rest of this post describes the material and its sources in more detail.
