To prepare for a tutorial session:
- Ensure your computer has sufficient battery charge for the day; power points will be in high demand!
- Try to install the required software ahead of the day.
- Some tutorial sessions will include time for obtaining the software, but this should not be assumed.
- Test the WiFi (802.11) credentials so you can get online for the session.
- Thank your tutors!
Machine Learning Tutorial Software Installation
We will be demonstrating with scikit-learn 0.17 on Python 3.5, and our demonstrations will run in a Jupyter notebook. Older versions may work, but they have not been tested. Installing a numerical stack for Python is getting easier all the time, but it can still be tricky. We give two recipes below: the first uses pip in a virtual environment, and the second uses the Conda package manager.
Option 1. Pip in a virtual environment
- # Set up a virtual env
- /path/to/python3 -m venv ml_tutorial
- source ml_tutorial/bin/activate
- # Upgrade pip to newest version
- python -m pip install --upgrade pip
- # Install the dependencies
- python -m pip install numpy scipy scikit-learn
Note that the pip upgrade is necessary so that Linux distributions will use precompiled wheels (see the manylinux project, https://github.com/pypa/manylinux, for details). With a pip version older than 8.1, numpy and scikit-learn will be installed from source*.
* We really don't recommend this.
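To see whether an existing pip is already new enough, the version can be compared against the 8.1 threshold mentioned above. The following is a minimal sketch, not part of the official instructions: the helper names `parse_version`, `pip_supports_wheels` and `installed_pip_version` are hypothetical, and the parsing only looks at the leading numeric part of the version string.

```python
import re

MINIMUM_PIP = (8, 1)  # threshold taken from the note above


def parse_version(text):
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError("unrecognised version string: " + repr(text))
    return tuple(int(part) for part in match.group(0).split("."))


def pip_supports_wheels(version_text, minimum=MINIMUM_PIP):
    """True if the given pip version is at least `minimum`."""
    return parse_version(version_text) >= minimum


def installed_pip_version():
    """Return the version of the importable pip, or None if there is none."""
    try:
        import pip
    except ImportError:
        return None
    return pip.__version__


version = installed_pip_version()
if version is None:
    print("pip is not importable from this interpreter")
elif pip_supports_wheels(version):
    print("pip", version, "is new enough")
else:
    print("pip", version, "is older than 8.1; upgrade it first")
```

Run it with the Python interpreter inside the activated virtual environment, so that it reports the pip that will actually perform the install.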
Option 2. Conda/Miniconda
The Anaconda distribution of Python (https://www.continuum.io/downloads) includes everything we need in the default installation. You can also create a specific environment for this tutorial following the instructions below. Alternatively, you can install Miniconda (http://conda.pydata.org/miniconda.html) or install the Conda package manager through pip; in either case you will need to create an environment following the instructions below.
- # After installing conda or miniconda
- conda create -n ml_tutorial python=3.5 scikit-learn
- # Activate the Conda environment:
- # on Mac OS or Linux
- source activate ml_tutorial
- # on Windows
- activate ml_tutorial
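Whichever option you chose, it can be worth confirming that the environment is really active before installing anything further. The sketch below is an illustration only: the helper name `describe_environment` is hypothetical, and it assumes that Conda exports the `CONDA_DEFAULT_ENV` variable on activation and that a `venv` environment can be recognised by `sys.prefix` differing from `sys.base_prefix`.

```python
import os
import sys


def describe_environment(environ=None):
    """Return a short description of the active Python environment.

    `environ` defaults to os.environ; it is a parameter only so that the
    function can be exercised with a hand-built mapping.
    """
    if environ is None:
        environ = os.environ
    conda_env = environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return "conda environment: " + conda_env
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the interpreter it was created from.
    if sys.prefix != sys.base_prefix:
        return "virtual environment: " + sys.prefix
    return "no environment active (system Python)"


print(describe_environment())
```

If this reports the system Python, activate `ml_tutorial` again before running any install commands.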
The workshop demonstrations will use the Jupyter notebook and Matplotlib for a few plots. If you'd like to work in exactly the same environment, you can install these libraries with the instructions below. Note that these are big libraries with many dependencies, and we don't recommend trying to set them up on the day using conference WiFi.
- # Using pip, in the appropriate virtual environment:
- pip install jupyter matplotlib
- # Inside a Conda environment (these are already installed in the base distribution)
- conda install jupyter matplotlib
The workshop dataset is located here: https://raw.githubusercontent.com/SamHames/MLD/master/data.npz. For the convenience of this workshop it is distributed as a preprocessed set of numpy arrays that we can load directly.
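Because conference WiFi can be unreliable, it may help to fetch the dataset ahead of time and confirm that the file arrived intact. The following is a minimal sketch using only the standard library; the function names `fetch_dataset` and `list_arrays` and the default file name `data.npz` are illustrative choices, not part of the workshop material. It relies on the fact that an `.npz` file is a zip archive holding one `.npy` member per array.

```python
import os
import urllib.request
import zipfile

DATA_URL = "https://raw.githubusercontent.com/SamHames/MLD/master/data.npz"


def fetch_dataset(url=DATA_URL, dest="data.npz"):
    """Download the dataset to `dest` unless a file is already there."""
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)
    return dest


def list_arrays(path):
    """Return the sorted array names stored in an .npz archive.

    Raises zipfile.BadZipFile if the file is not a valid zip archive,
    for example after an interrupted download.
    """
    with zipfile.ZipFile(path) as archive:
        corrupt = archive.testzip()  # name of the first corrupt member, or None
        if corrupt is not None:
            raise zipfile.BadZipFile("corrupt member: " + corrupt)
        return sorted(name[:-len(".npy")]
                      for name in archive.namelist()
                      if name.endswith(".npy"))
```

Calling `list_arrays(fetch_dataset())` once while on a good connection should return a list that includes `images` and `labels`, the two arrays loaded in the installation test.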
Testing Your Installation
If the following runs without errors, you are ready for this workshop.
- import numpy as np
- import sklearn
- dataset = np.load('path/to/data.npz')
- images = dataset['images']
- labels = dataset['labels']
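If one of the imports above fails, it helps to know exactly which package is missing. As a rough sketch (the helper `report_versions` is a hypothetical name, and reading `__version__` is a common convention rather than a guarantee), the following reports what is installed without stopping at the first failure:

```python
import importlib

# Import names of the packages used in this workshop
# (scikit-learn is imported as sklearn).
REQUIRED = ["numpy", "scipy", "sklearn", "matplotlib"]


def report_versions(names=REQUIRED):
    """Map each import name to its version string, or None if missing."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


for name, version in sorted(report_versions().items()):
    print("{:<12} {}".format(name, version or "NOT INSTALLED"))
```

Any package shown as NOT INSTALLED can then be installed with the pip or Conda commands given earlier, from inside the `ml_tutorial` environment.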