From 6ff9fd680989727c47f088e711b43cfab756d6d3 Mon Sep 17 00:00:00 2001 From: Paul McCarthy <pauldmccarthy@gmail.com> Date: Wed, 24 Jan 2018 23:15:07 +0000 Subject: [PATCH] Started work on numpy introduction --- getting_started/04_numpy.ipynb | 276 +++++++++++++++++++++++++++++++++ getting_started/04_numpy.md | 191 +++++++++++++++++++++++ 2 files changed, 467 insertions(+) create mode 100644 getting_started/04_numpy.ipynb create mode 100644 getting_started/04_numpy.md diff --git a/getting_started/04_numpy.ipynb b/getting_started/04_numpy.ipynb new file mode 100644 index 0000000..f0a4507 --- /dev/null +++ b/getting_started/04_numpy.ipynb @@ -0,0 +1,276 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Numpy\n", + "\n", + "\n", + "This section introduces you to [`numpy`](http://www.numpy.org/), Python's\n", + "numerical computing library.\n", + "\n", + "\n", + "Numpy is not actually part of the standard Python library. But it is a\n", + "fundamental part of the Python ecosystem - it forms the basis for many\n", + "important Python libraries, and it (along with its partners\n", + "[`scipy`](https://www.scipy.org/) and [`matplotlib`](https://matplotlib.org/))\n", + "is what makes Python a viable alternative to Matlab as a scientific computing\n", + "platform.\n", + "\n", + "\n", + "## Contents\n", + "\n", + "\n", + "* [The Python list versus the Numpy array](#the-python-list-versus-the-numpy-array)\n", + "* [Importing Numpy](#importing-numpy)\n", + "* [Numpy basics](#numpy-basics)\n", + "* [Indexing](#indexing)\n", + "\n", + "\n", + "<a class=\"anchor\" id=\"the-python-list-versus-the-numpy-array\"></a>\n", + "## The Python list versus the Numpy array\n", + "\n", + "\n", + "Numpy adds a new data type to the Python language - the `array` (more\n", + "specifically, the `ndarray`). 
You have already been introduced to the Python\n", + "`list`, which you can easily use to store a handful of numbers (or anything\n", + "else):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "data = [10, 8, 12, 14, 7, 6, 11]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You could also emulate a 2D or ND matrix by using lists of lists, for example:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "xyz_coords = [[-11.4, 1.0, 22.6], [22.7, -32.8, 19.1], [62.8, -18.2, -34.5]]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For simple tasks, you could stick with processing your data using python\n", + "lists, and the built-in\n", + "[`math`](https://docs.python.org/3.5/library/math.html) library. And this\n", + "might be tempting, because it does look quite a lot like what you might type\n", + "into Matlab.\n", + "\n", + "\n", + "But __BEWARE!__ A Python list is a terrible data structure for scientific\n", + "computing!\n", + "\n", + "\n", + "This is a major source of confusion for those poor souls who have spent their\n", + "lives working in Matlab, but have finally seen the light and switched to\n", + "Python. 
It is very important to be able to distinguish between a Python list,\n", + "and a Numpy array.\n", + "\n", + "\n", + "A list in python is akin to a cell array in Matlab - they can store anything,\n", + "but are extremely inefficient, and unwieldy when you have more than a couple\n", + "of dimensions.\n", + "\n", + "\n", + "These are in contrast to the Numpy array and Matlab matrix, which are both\n", + "thin wrappers around a contiguous chunk of memory, and which provide\n", + "blazing-fast performance (because behind the scenes in both Numpy and Matlab,\n", + "it's C, C++ and FORTRAN all the way down).\n", + "\n", + "\n", + "So you should strongly consider turning those lists into Numpy arrays:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "data = np.array([10, 8, 12, 14, 7, 6, 11])\n", + "\n", + "xyz_coords = np.array([[-11.4, 1.0, 22.6],\n", + " [ 22.7, -32.8, 19.1],\n", + " [ 62.8, -18.2, -34.5]])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If you look carefully at the code above, you will notice that we are still\n", + "actually using Python lists. We have declared our data sets in exactly the\n", + "same way that we did earlier, by denoting them with square brackets `[` and\n", + "`]`.\n", + "\n", + "\n", + "The key difference here is that these lists immediately get converted into\n", + "Numpy arrays, by passing them to the `np.array` function. 
To clarify this\n", + "point, we could rewrite this code in the following equivalent manner:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "# Define our data sets as python lists\n", + "data = [10, 8, 12, 14, 7, 6, 11]\n", + "xyz_coords = [[-11.4, 1.0, 22.6],\n", + " [ 22.7, -32.8, 19.1],\n", + " [ 62.8, -18.2, -34.5]]\n", + "\n", + "# Convert them to numpy arrays\n", + "data = np.array(data)\n", + "xyz_coords = np.array(xyz_coords)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "I'm emphasising this to help you understand the difference between Python\n", + "lists and Numpy arrays. Apologies if you've already got it.\n", + "\n", + "\n", + "<a class=\"anchor\" id=\"importing-numpy\"></a>\n", + "## Importing Numpy\n", + "\n", + "\n", + "For interactive exploration/experimentation, you might want to import\n", + "Numpy like this:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from numpy import *" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This makes your Python session very similar to Matlab - you can call all\n", + "of the Numpy functions directly:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "e = array([1, 2, 3, 4, 5])\n", + "z = zeros((100, 100))\n", + "d = diag([2, 3, 4, 5])\n", + "\n", + "print(e)\n", + "print(z)\n", + "print(d)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "But if you are writing a script or application using Numpy, I implore you to\n", + "import Numpy like this instead:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The downside to this is that you will have to
prefix all Numpy functions with\n", + "`np.`, like so:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "e = np.array([1, 2, 3, 4, 5])\n", + "z = np.zeros((100, 100))\n", + "d = np.diag([2, 3, 4, 5])\n", + "\n", + "print(e)\n", + "print(z)\n", + "print(d)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "There is a big upside, however, in that other people who have to read/use your\n", + "code will like you a lot more. This is because it will be easier for them to\n", + "figure out what the hell your code is doing. Namespaces are your friend - use\n", + "them!\n", + "\n", + "\n", + "<a class=\"anchor\" id=\"numpy-basics\"></a>\n", + "## Numpy basics\n", + "\n", + "\n", + "Let's get started." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a class=\"anchor\" id=\"indexing\"></a>\n", + "## Indexing" + ] + } + ], + "metadata": {}, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/getting_started/04_numpy.md b/getting_started/04_numpy.md new file mode 100644 index 0000000..d395b6a --- /dev/null +++ b/getting_started/04_numpy.md @@ -0,0 +1,191 @@ +# Numpy + + +This section introduces you to [`numpy`](http://www.numpy.org/), Python's +numerical computing library. + + +Numpy is not actually part of the standard Python library. But it is a +fundamental part of the Python ecosystem - it forms the basis for many +important Python libraries, and it (along with its partners +[`scipy`](https://www.scipy.org/) and [`matplotlib`](https://matplotlib.org/)) +is what makes Python a viable alternative to Matlab as a scientific computing +platform. 
+ + +## Contents + + +* [The Python list versus the Numpy array](#the-python-list-versus-the-numpy-array) +* [Importing Numpy](#importing-numpy) +* [Numpy basics](#numpy-basics) +* [Indexing](#indexing) + + +<a class="anchor" id="the-python-list-versus-the-numpy-array"></a> +## The Python list versus the Numpy array + + +Numpy adds a new data type to the Python language - the `array` (more +specifically, the `ndarray`). You have already been introduced to the Python +`list`, which you can easily use to store a handful of numbers (or anything +else): + + +``` +data = [10, 8, 12, 14, 7, 6, 11] +``` + + +You could also emulate a 2D or ND matrix by using lists of lists, for example: + + +``` +xyz_coords = [[-11.4, 1.0, 22.6], [22.7, -32.8, 19.1], [62.8, -18.2, -34.5]] +``` + + +For simple tasks, you could stick with processing your data using python +lists, and the built-in +[`math`](https://docs.python.org/3.5/library/math.html) library. And this +might be tempting, because it does look quite a lot like what you might type +into Matlab. + + +But __BEWARE!__ A Python list is a terrible data structure for scientific +computing! + + +This is a major source of confusion for those poor souls who have spent their +lives working in Matlab, but have finally seen the light and switched to +Python. It is very important to be able to distinguish between a Python list, +and a Numpy array. + + +A list in python is akin to a cell array in Matlab - they can store anything, +but are extremely inefficient, and unwieldy when you have more than a couple +of dimensions. + + +These are in contrast to the Numpy array and Matlab matrix, which are both +thin wrappers around a contiguous chunk of memory, and which provide +blazing-fast performance (because behind the scenes in both Numpy and Matlab, +it's C, C++ and FORTRAN all the way down). 
+ + +So you should strongly consider turning those lists into Numpy arrays: + + +``` +import numpy as np + +data = np.array([10, 8, 12, 14, 7, 6, 11]) + +xyz_coords = np.array([[-11.4, 1.0, 22.6], + [ 22.7, -32.8, 19.1], + [ 62.8, -18.2, -34.5]]) +``` + + +If you look carefully at the code above, you will notice that we are still +actually using Python lists. We have declared our data sets in exactly the +same way that we did earlier, by denoting them with square brackets `[` and +`]`. + + +The key difference here is that these lists immediately get converted into +Numpy arrays, by passing them to the `np.array` function. To clarify this +point, we could rewrite this code in the following equivalent manner: + + +``` +import numpy as np + +# Define our data sets as python lists +data = [10, 8, 12, 14, 7, 6, 11] +xyz_coords = [[-11.4, 1.0, 22.6], + [ 22.7, -32.8, 19.1], + [ 62.8, -18.2, -34.5]] + +# Convert them to numpy arrays +data = np.array(data) +xyz_coords = np.array(xyz_coords) +``` + + +I'm emphasising this to help you understand the difference between Python +lists and Numpy arrays. Apologies if you've already got it. 
+ + +<a class="anchor" id="importing-numpy"></a> +## Importing Numpy + + +For interactive exploration/experimentation, you might want to import +Numpy like this: + + +``` +from numpy import * +``` + + +This makes your Python session very similar to Matlab - you can call all +of the Numpy functions directly: + + +``` +e = array([1, 2, 3, 4, 5]) +z = zeros((100, 100)) +d = diag([2, 3, 4, 5]) + +print(e) +print(z) +print(d) +``` + + +But if you are writing a script or application using Numpy, I implore you to +import Numpy like this instead: + + +``` +import numpy as np +``` + + +The downside to this is that you will have to prefix all Numpy functions with +`np.`, like so: + + +``` +e = np.array([1, 2, 3, 4, 5]) +z = np.zeros((100, 100)) +d = np.diag([2, 3, 4, 5]) + +print(e) +print(z) +print(d) +``` + + +There is a big upside, however, in that other people who have to read/use your +code will like you a lot more. This is because it will be easier for them to +figure out what the hell your code is doing. Namespaces are your friend - use +them! + + +<a class="anchor" id="numpy-basics"></a> +## Numpy basics + + +Let's get started. + + +``` +import numpy as np +``` + + +<a class="anchor" id="indexing"></a> +## Indexing -- GitLab