Reproducible Research in Python

You’ve seen a great idea on someone’s blog that you think would really push that old analysis you did six months ago to the next level. You open up the Dropbox folder with all of your scripts, and … you’re lost. Which script did you start with? What does this random chunk of code do? Where is the original data file? You finally sort out your scripts, but then your code fails on every second line because you can’t even remember which packages you used. Frustrated, you give up.

What if I told you that there is a better way to keep track of your analyses, and that it is easier than you think? In this talk I will show you how a reproducible research approach can save you hours when revisiting or updating old projects, and demonstrate some of the tools Python offers to make this possible. The talk will cover how to manage your packages using virtualenvs, how to thoroughly document your analysis using Jupyter Notebook, how to keep track of changes using a source control system like Git, and how to collaborate effectively using GitHub. By the end you will wonder why you ever did your analyses any other way, and you will be happily maintaining and improving your projects for many years to come!

Jodie Burchell

Jodie Burchell loves data – no, seriously, she loves data. It took a while before she discovered this passion, but during her PhD in psychology she realised that all she wanted to do was apply her knowledge of behavioural sciences and statistics to interesting problems. She currently works as a data scientist in client-side analytics at SEEK Australia. Her favourite languages are Python, R and Stata. When she is not dreaming about analyses, she enjoys baking, studying Spanish (badly) and reading Reddit.