To prepare for a tutorial session:

Machine Learning Tutorial Software Installation

We will be demonstrating with scikit-learn 0.17 on Python 3.5, working in a Jupyter notebook. Older versions may work, but they have not been tested. Installing a numerical stack for Python is getting easier all the time, but it can still be tricky, so we give two recipes below: the first uses pip in a virtual environment and the second uses the Conda package manager.

Option 1. Pip in a virtual environment

Note that the pip upgrade is necessary so that pip on Linux distributions will use precompiled wheels (see the manylinux project, https://github.com/pypa/manylinux, for details). With a pip version older than 8.1, numpy and scikit-learn will be installed from source*.

* We really don't recommend this.
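Whether the pip in an environment is new enough can be checked before installing anything. The following is a minimal sketch and not part of the original recipe: the helper names `parse_version_prefix`, `pip_uses_wheels` and `installed_pip_version` are hypothetical, and the only fact it encodes is the 8.1 threshold stated above.

```python
import re


def parse_version_prefix(version):
    """Return the leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("not a version string: %r" % version)
    return tuple(int(part) for part in match.group(0).split("."))


def pip_uses_wheels(version):
    """True if this pip version is at least 8.1, the threshold given above."""
    return parse_version_prefix(version) >= (8, 1)


def installed_pip_version():
    """Version string of the pip importable from the current environment."""
    import pip  # imported lazily so the helpers above work without pip
    return pip.__version__
```

Inside the activated environment, `pip_uses_wheels(installed_pip_version())` should be true once pip has been upgraded.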

Option 2. Conda/Miniconda

The Anaconda distribution of Python (https://www.continuum.io/downloads) includes everything we need in the default installation. You can also create a specific environment for this tutorial following the instructions below. An alternative is to install Miniconda (http://conda.pydata.org/miniconda.html), or to install the Conda package manager through pip; in either case you will need to create an environment following the instructions below.

Optional Elements

The workshop demonstrations will use the Jupyter notebook, and Matplotlib for a few plots. If you'd like to work in exactly the same environment, you can install these libraries with the instructions below. Note that these are big libraries with many dependencies, and we don't recommend trying to set them up on the day over the conference Wi-Fi.

Workshop Dataset

The workshop dataset is located here: https://raw.githubusercontent.com/SamHames/MLD/master/data.npz. For the convenience of this workshop it is distributed as a preprocessed set of numpy arrays that we can load directly.
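As a rough sketch of what loading the file directly could look like, assuming it has already been saved locally as `data.npz` (the local filename and the helper name `describe_npz` are assumptions): a `.npz` archive opened with `numpy.load` lists its array names in `.files`. The names of the arrays inside the workshop file are not stated here, so the sketch only reports whatever it finds.

```python
import numpy as np


def describe_npz(path):
    """Map each array name stored in a .npz archive to its (shape, dtype)."""
    summary = {}
    with np.load(path) as archive:
        for name in archive.files:
            array = archive[name]
            summary[name] = (array.shape, str(array.dtype))
    return summary


# Example usage, assuming the dataset was downloaded to the working directory:
# for name, (shape, dtype) in sorted(describe_npz("data.npz").items()):
#     print(name, shape, dtype)
```

Listing the archive's contents first avoids guessing array names, and also confirms that the download completed and that numpy can read it.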

Testing Your Installation

If the following works you are ready for this workshop.
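A minimal sketch of such a check, assuming the packages named on this page import as `numpy`, `sklearn`, `matplotlib` and `notebook` (the helper name `installed_versions` is hypothetical, and `notebook` as the import name for the Jupyter notebook is an assumption): it tries each import and records the version found, or None if the package is missing.

```python
import importlib


def installed_versions(module_names):
    """Try to import each module; map its name to a version string or None."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None  # not installed in this environment
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


# Example usage. "sklearn" is the import name of scikit-learn; "notebook" is
# assumed here to be the import name of the Jupyter notebook package.
# for name, version in sorted(installed_versions(
#         ["numpy", "sklearn", "matplotlib", "notebook"]).items()):
#     print(name, version if version is not None else "MISSING")
```

If numpy and sklearn both report a version, the core requirements are in place; matplotlib and notebook are only needed for the optional elements.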

TutorialSetups (last edited 2016-08-13 00:35:15 by qinhaidi@gmail.com)