SDAV Primer


In this worksheet, you will learn:

  • How to get up and running with Python, Jupyter, and the data science stack

1. Check that Python is Installed

First, check that Python 3 is installed on your machine. Type python at the command prompt; if it is installed, the interpreter will start and the version number will be shown on the first line of output.

E.g.

Python 3.6.1 |Anaconda 4.4.0 (x86_64)| (default, May 11 2017, 13:04:09)

If Python is not installed, I recommend installing the Anaconda distribution, which provides Python 3.6.1 with many data science libraries pre-installed (Download link).
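If you would rather check the version from inside Python itself, here is a minimal sketch. The 3.6 minimum is an assumption based on the example output above, and python_is_recent_enough is only an illustrative helper name, not part of any library.

```python
import sys

# Assumption: the course expects Python 3.6 or newer, as in the example output.
MINIMUM_VERSION = (3, 6)


def python_is_recent_enough(minimum=MINIMUM_VERSION):
    """Return True if the running interpreter is at least `minimum` (major, minor)."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    # sys.version starts with the version number, e.g. "3.6.1 |Anaconda ..."
    print("Running Python", sys.version.split()[0])
    if python_is_recent_enough():
        print("OK: this interpreter is new enough.")
    else:
        print("This interpreter is older than expected; consider installing Anaconda.")
```

Save this to a file and run it with python, or paste it into the interpreter you just started.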

2. Check that Jupyter is Installed

If Jupyter is installed, you can start it from the command line using:

jupyter notebook

If you installed Python using Anaconda, you will already have many of the libraries we typically use, such as NumPy, SciPy, Pandas, and Matplotlib. However, some libraries may still require installation (e.g., TensorFlow).
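To see which of these libraries are already available, you can ask Python whether each one can be found without actually importing it. This is a rough sketch: the list of names and the missing_libraries helper are illustrative choices, and it only checks that a package is present, not which version you have.

```python
import importlib.util

# Import names for the libraries mentioned above (these are the names used
# in `import ...` statements, which are all lowercase).
LIBRARIES = ["numpy", "scipy", "pandas", "matplotlib", "tensorflow"]


def missing_libraries(names):
    """Return the subset of `names` that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_libraries(LIBRARIES)
    if missing:
        print("Not installed yet:", ", ".join(missing))
    else:
        print("All listed libraries were found.")
```

You can run this in a Jupyter notebook cell or from the command line; anything it reports as missing still needs to be installed.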

3. Install Additional Libraries

To install new libraries, we can use pip.

E.g., pip install scikit-learn

Some of the useful libraries for this course are:

pip install numpy scipy pandas matplotlib seaborn bokeh scikit-learn tensorflow nltk flask

(You can install multiple packages at once with pip by listing them all in a single command, separated by spaces.)
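If you have more than one Python installed, the pip on your command line may belong to a different interpreter than the one Jupyter uses. One common way around this is to run pip through the current interpreter with python -m pip. The sketch below builds that command; build_pip_command and install are hypothetical helper names, and the two package names are just examples from the list above.

```python
import subprocess
import sys


def build_pip_command(packages):
    """Build a pip install command tied to the interpreter running this code."""
    # sys.executable is the path of the current Python, so "-m pip" installs
    # into the same environment that this code is running in.
    return [sys.executable, "-m", "pip", "install", *packages]


def install(packages):
    """Run the install; raises CalledProcessError if pip reports a failure."""
    subprocess.check_call(build_pip_command(packages))


if __name__ == "__main__":
    # Only print the command here; call install([...]) to actually run it.
    print(" ".join(build_pip_command(["seaborn", "bokeh"])))
```

Nothing is installed until you call install yourself, so it is safe to run the block as written to see the command first.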

4. All Set

If you have Jupyter working with Python 3.6 and the additional libraries installed, then you are all set for the course! Much of our work will use Jupyter notebooks, so the more practice you get with them, the better!