Big data biology for pythonistas: getting in on the genomics revolution

In 2000, Bill Clinton unveiled "the most important, most wondrous map ever produced by humankind" - the human genome. This monumental endeavour cost $3 billion and took hundreds of scientists from all over the world 13 years. Today, a single person can generate such a map in ~2 days for $1000. This dramatic drop in cost means that we now have data for hundreds of thousands of people - and other species - from all corners of the globe, and cohorts are available for every major disease under the sun. Petabytes of new data are also being generated every day.

Most of this data is publicly available, so anyone with an internet connection can try in silico biology from the comfort of their own home. In my talk, I'll walk through what this data looks like and how it's analysed, with a special focus on where Python fits into the workflow (tl;dr: the most interesting parts!). I will also highlight some common pitfalls that software engineers and developers face when getting into this space. Finally, I'll showcase several other facets of bioinformatics that sorely need contributions from good coders.

Genomics is rapidly entering health care, both in public and private hospitals and in direct-to-consumer genetic testing. Understanding this data, and the challenges and limitations of analysing it, will help us all make better-informed health and medical decisions that affect our own quality of life and that of the people we love.

Darya Vanichkina

Dr Darya Vanichkina is a genomics data scientist at the Centenary Institute, University of Sydney. She is interested in the biology of the nervous and immune systems, and the roles that alternative splicing plays in modulating cellular phenotypes. This translates to a lot of next-generation sequencing data analysis using the University's HPC cluster and AWS, all with the aim of understanding the basic biology of these systems and what breaks in disease. She holds a PhD in bioinformatics and genomics from the University of Queensland, and a Bachelor's/Master's degree in Biochemistry and Molecular Biology from Lomonosov Moscow State University.