From 7769246abe2fe96454b45b8172532ea3b633ee01 Mon Sep 17 00:00:00 2001
From: Paul McCarthy <pauldmccarthy@gmail.com>
Date: Mon, 29 Jan 2018 22:15:33 +0000
Subject: [PATCH] Working on indexing/broadcasting sections.

---
 getting_started/04_numpy.ipynb | 189 ++++++++++++++++++++++++++++++---
 getting_started/04_numpy.md    | 147 +++++++++++++++++++++++--
 2 files changed, 313 insertions(+), 23 deletions(-)

diff --git a/getting_started/04_numpy.ipynb b/getting_started/04_numpy.ipynb
index 0f8ccf3..545ae75 100644
--- a/getting_started/04_numpy.ipynb
+++ b/getting_started/04_numpy.ipynb
@@ -29,11 +29,13 @@
     "  * [Array properties](#array-properties)\n",
     "  * [Descriptive statistics](#descriptive-statistics)\n",
     "  * [Reshaping and rearranging arrays](#reshaping-and-rearranging-arrays)\n",
+    "* [Multi-variate operations](#multi-variate-operations)\n",
+    "  * [Matrix multiplication](#matrix-multiplication)\n",
+    "  * [Broadcasting](#broadcasting)\n",
     "* [Array indexing](#array-indexing)\n",
     "  * [Indexing multi-dimensional arrays](#indexing-multi-dimensional-arrays)\n",
     "  * [Boolean indexing](#boolean-indexing)\n",
     "  * [Coordinate array indexing](#coordinate-array-indexing)\n",
-    "* [Array operations and broadcasting](#array-operations-and-broadcasting)\n",
     "* [Generating random numbers](#generating-random-numbers)\n",
     "\n",
     "* [Appendix: Importing Numpy](#appendix-importing-numpy)\n",
@@ -282,10 +284,6 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We'll cover more advanced array operations\n",
-    "[below](#array-operations-and-broadcasting).\n",
-    "\n",
-    "\n",
     "<a class=\"anchor\" id=\"array-properties\"></a>\n",
     "### Array properties\n",
     "\n",
@@ -539,6 +537,120 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
+    "<a class=\"anchor\" id=\"multi-variate-operations\"></a>\n",
+    "## Multi-variate operations\n",
+    "\n",
+    "\n",
+    "Many operations in Numpy operate on an elementwise basis.
For example:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "a = np.random.randint(1, 10, (5))\n",
+    "b = np.random.randint(1, 10, (5))\n",
+    "\n",
+    "print('a:     ', a)\n",
+    "print('b:     ', b)\n",
+    "print('a + b: ', a + b)\n",
+    "print('a * b: ', a * b)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This also extends to higher dimensional arrays:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "a = np.random.randint(1, 10, (4, 4))\n",
+    "b = np.random.randint(1, 10, (4, 4))\n",
+    "\n",
+    "print('a:')\n",
+    "print(a)\n",
+    "print('b:')\n",
+    "print(b)\n",
+    "\n",
+    "print('a + b')\n",
+    "print(a + b)\n",
+    "print('a * b')\n",
+    "print(a * b)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Wait ... what's that you say? Oh, I couldn't understand because of all the\n",
+    "froth coming out of your mouth. I guess you're angry that `a * b` didn't give\n",
+    "you the matrix product, like it would have in Matlab. Well all I can say is\n",
+    "that Python is not Matlab. Get over it.
Take a calmative.\n",
+    "\n",
+    "\n",
+    "<a class=\"anchor\" id=\"matrix-multiplication\"></a>\n",
+    "### Matrix multiplication\n",
+    "\n",
+    "\n",
+    "When your heart rate has returned to its normal caffeine-induced state, you\n",
+    "can use the `dot` method, or the `@` operator, to perform matrix\n",
+    "multiplication:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "a = np.random.randint(1, 10, (4, 4))\n",
+    "b = np.random.randint(1, 10, (4, 4))\n",
+    "\n",
+    "print('a:')\n",
+    "print(a)\n",
+    "print('b:')\n",
+    "print(b)\n",
+    "\n",
+    "print('a @ b')\n",
+    "print(a @ b)\n",
+    "\n",
+    "print('a.dot(b)')\n",
+    "print(a.dot(b))\n",
+    "\n",
+    "print('b.dot(a)')\n",
+    "print(b.dot(a))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "> The `@` matrix multiplication operator is a relatively recent addition\n",
+    "> to Python and Numpy, so you might not see it all that often in existing\n",
+    "> code. But it's here to stay, so go ahead and use it!\n",
+    "\n",
+    "\n",
+    "<a class=\"anchor\" id=\"broadcasting\"></a>\n",
+    "### Broadcasting\n",
+    "\n",
+    "\n",
+    "One of the coolest (and possibly confusing) features of Numpy is its\n",
+    "[_broadcasting_\n",
+    "rules](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html).\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "<a class=\"anchor\" id=\"array-indexing\"></a>\n",
     "## Array indexing\n",
     "\n",
@@ -553,7 +665,9 @@
     "> indices (if specified) are exclusive.\n",
     "\n",
     "\n",
-    "Let's whet our appetites with some basic 1D array slicing:"
+    "Let's whet our appetites with some basic 1D array slicing.
Numpy supports the\n",
+    "standard Python __slice__ notation for indexing, where you can specify the\n",
+    "start and end indices, and the step size, via the `start:stop:step` syntax:"
    ]
   },
   {
@@ -594,7 +708,7 @@
     "every2nd = a[::2]\n",
     "print('every 2nd:', every2nd)\n",
     "every2nd += 10\n",
-    "print('a':, a)"
+    "print('a:', a)"
    ]
   },
   {
@@ -680,6 +794,39 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
+    "In contrast to the simple indexing we have already seen, boolean indexing will\n",
+    "return a _copy_ of the indexed data, __not__ a view. For example:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "a = np.random.randint(1, 10, 10)\n",
+    "b = a[a > 5]\n",
+    "print('a: ', a)\n",
+    "print('b: ', b)\n",
+    "print('Setting b[0] to 999')\n",
+    "b[0] = 999\n",
+    "print('a: ', a)\n",
+    "print('b: ', b)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "> In general, any 'simple' indexing operation on a Numpy array, where the\n",
+    "> indexing object comprises integers, slices (using the standard Python\n",
+    "> `start:stop:step` notation), colons (`:`) and/or ellipses (`...`), will\n",
+    "> result in a __view__ into the indexed array. Any 'advanced' indexing\n",
+    "> operation, where the indexing object contains anything else (e.g. boolean or\n",
+    "> integer arrays, or even Python lists), will result in a __copy__ of the\n",
+    "> data.\n",
+    "\n",
+    "\n",
     "Logical operators `~` (not), `&` (and) and `|` (or) can be used to manipulate\n",
     "and combine boolean Numpy arrays:"
    ]
@@ -708,18 +855,36 @@
    "metadata": {},
    "source": [
     "<a class=\"anchor\" id=\"coordinate-array-indexing\"></a>\n",
-    "### Coordinate array indexing"
+    "### Coordinate array indexing\n",
+    "\n",
+    "\n",
+    "You can index a Numpy array using another array containing coordinates into\n",
+    "the first array. As with boolean indexing, this will result in a copy of the\n",
+    "data.
Generally, you will need to have a separate array, or list, of\n",
+    "coordinates into each data axis:"
    ]
   },
   {
-   "cell_type": "markdown",
+   "cell_type": "code",
+   "execution_count": null,
    "metadata": {},
+   "outputs": [],
    "source": [
-    "<a class=\"anchor\" id=\"array-operations-and-broadcasting\"></a>\n",
-    "## Array operations and broadcasting\n",
-    "\n",
+    "a = np.random.randint(1, 10, (4, 4))\n",
+    "print(a)\n",
     "\n",
+    "rows    = [0, 2, 3]\n",
+    "cols    = [1, 0, 2]\n",
+    "indexed = a[rows, cols]\n",
     "\n",
+    "for r, c, v in zip(rows, cols, indexed):\n",
+    "    print('a[{}, {}] = {}'.format(r, c, v))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
     "<a class=\"anchor\" id=\"generating-random-numbers\"></a>\n",
     "## Generating random numbers\n",
     "\n",
diff --git a/getting_started/04_numpy.md b/getting_started/04_numpy.md
index 16bc676..69110de 100644
--- a/getting_started/04_numpy.md
+++ b/getting_started/04_numpy.md
@@ -23,11 +23,13 @@ alternative to Matlab as a scientific computing platform.
   * [Array properties](#array-properties)
   * [Descriptive statistics](#descriptive-statistics)
   * [Reshaping and rearranging arrays](#reshaping-and-rearranging-arrays)
+* [Multi-variate operations](#multi-variate-operations)
+  * [Matrix multiplication](#matrix-multiplication)
+  * [Broadcasting](#broadcasting)
 * [Array indexing](#array-indexing)
   * [Indexing multi-dimensional arrays](#indexing-multi-dimensional-arrays)
   * [Boolean indexing](#boolean-indexing)
   * [Coordinate array indexing](#coordinate-array-indexing)
-* [Array operations and broadcasting](#array-operations-and-broadcasting)
 * [Generating random numbers](#generating-random-numbers)
 
 * [Appendix: Importing Numpy](#appendix-importing-numpy)
@@ -212,10 +214,6 @@ print( a % 2)
 ```
 
 
-We'll cover more advanced array operations
-[below](#array-operations-and-broadcasting).
-
-
 <a class="anchor" id="array-properties"></a>
 ### Array properties
 
@@ -395,6 +393,96 @@ print( dstacked)
 ```
 
 
+<a class="anchor" id="multi-variate-operations"></a>
+## Multi-variate operations
+
+
+Many operations in Numpy operate on an elementwise basis. For example:
+
+
+```
+a = np.random.randint(1, 10, (5))
+b = np.random.randint(1, 10, (5))
+
+print('a:     ', a)
+print('b:     ', b)
+print('a + b: ', a + b)
+print('a * b: ', a * b)
+```
+
+
+This also extends to higher dimensional arrays:
+
+
+```
+a = np.random.randint(1, 10, (4, 4))
+b = np.random.randint(1, 10, (4, 4))
+
+print('a:')
+print(a)
+print('b:')
+print(b)
+
+print('a + b')
+print(a + b)
+print('a * b')
+print(a * b)
+```
+
+
+Wait ... what's that you say? Oh, I couldn't understand because of all the
+froth coming out of your mouth. I guess you're angry that `a * b` didn't give
+you the matrix product, like it would have in Matlab. Well all I can say is
+that Python is not Matlab. Get over it. Take a calmative.
+
+
+<a class="anchor" id="matrix-multiplication"></a>
+### Matrix multiplication
+
+
+When your heart rate has returned to its normal caffeine-induced state, you
+can use the `dot` method, or the `@` operator, to perform matrix
+multiplication:
+
+
+```
+a = np.random.randint(1, 10, (4, 4))
+b = np.random.randint(1, 10, (4, 4))
+
+print('a:')
+print(a)
+print('b:')
+print(b)
+
+print('a @ b')
+print(a @ b)
+
+print('a.dot(b)')
+print(a.dot(b))
+
+print('b.dot(a)')
+print(b.dot(a))
+```
+
+
+> The `@` matrix multiplication operator is a relatively recent addition
+> to Python and Numpy, so you might not see it all that often in existing
+> code. But it's here to stay, so go ahead and use it!
+
+
+<a class="anchor" id="broadcasting"></a>
+### Broadcasting
+
+
+One of the coolest (and possibly confusing) features of Numpy is its
+[_broadcasting_
+rules](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html).
+
+
+
+
+
+
 <a class="anchor" id="array-indexing"></a>
 ## Array indexing
 
@@ -409,7 +497,9 @@ reference](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html).
 > indices (if specified) are exclusive.
 
 
-Let's whet our appetites with some basic 1D array slicing:
+Let's whet our appetites with some basic 1D array slicing. Numpy supports the
+standard Python __slice__ notation for indexing, where you can specify the
+start and end indices, and the step size, via the `start:stop:step` syntax:
 
 
 ```
@@ -437,7 +527,7 @@ print('a:', a)
 every2nd = a[::2]
 print('every 2nd:', every2nd)
 every2nd += 10
-print('a':, a)
+print('a:', a)
 ```
 
 
@@ -496,9 +586,35 @@ print('elements in a that are > 5: ', a[a > 5])
 ```
 
 
+In contrast to the simple indexing we have already seen, boolean indexing will
+return a _copy_ of the indexed data, __not__ a view. For example:
+
+
+```
+a = np.random.randint(1, 10, 10)
+b = a[a > 5]
+print('a: ', a)
+print('b: ', b)
+print('Setting b[0] to 999')
+b[0] = 999
+print('a: ', a)
+print('b: ', b)
+```
+
+
+> In general, any 'simple' indexing operation on a Numpy array, where the
+> indexing object comprises integers, slices (using the standard Python
+> `start:stop:step` notation), colons (`:`) and/or ellipses (`...`), will
+> result in a __view__ into the indexed array. Any 'advanced' indexing
+> operation, where the indexing object contains anything else (e.g. boolean or
+> integer arrays, or even Python lists), will result in a __copy__ of the
+> data.
+
+
 Logical operators `~` (not), `&` (and) and `|` (or) can be used to manipulate
 and combine boolean Numpy arrays:
 
+
 ```
 a = np.random.randint(1, 10, 10)
 gt5 = a > 5
@@ -518,14 +634,23 @@ print('elements in a which are > 5 or odd: ', a[gt5 | ~even])
 ### Coordinate array indexing
 
 
-```
+You can index a Numpy array using another array containing coordinates into
+the first array. As with boolean indexing, this will result in a copy of the
+data.
Generally, you will need to have a separate array, or list, of
+coordinates into each data axis:
 
-```
 
+```
+a = np.random.randint(1, 10, (4, 4))
+print(a)
 
-<a class="anchor" id="array-operations-and-broadcasting"></a>
-## Array operations and broadcasting
+rows    = [0, 2, 3]
+cols    = [1, 0, 2]
+indexed = a[rows, cols]
 
+for r, c, v in zip(rows, cols, indexed):
+    print('a[{}, {}] = {}'.format(r, c, v))
+```
 
 
 <a class="anchor" id="generating-random-numbers"></a>
-- 
GitLab
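
The Broadcasting section introduced above stops at a link to the rules. As a rough sketch of what those rules mean in practice (this is not part of the commit, and the array names `a`, `row` and `col` are made up for illustration): two arrays can be combined elementwise when, comparing their shapes from the last axis backwards, each pair of dimensions is either equal or one of them is 1.

```python
import numpy as np

# Illustrative arrays, not taken from the patch: a 3x4 matrix and a length-4 row.
a = np.arange(12).reshape(3, 4)
row = np.array([10, 20, 30, 40])

# Shapes (3, 4) and (4,) are compatible: the row is applied to every row of a.
added = a + row

# A column of shape (3, 1) is stretched along the second axis instead.
col = np.array([[1], [2], [3]])
scaled = a * col

# Shapes (3, 4) and (3,) do not line up on the last axis, so this fails.
try:
    a + np.array([1, 2, 3])
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

print(added)
print(scaled)
print('mismatched shapes rejected:', broadcast_failed)
```

If the length-3 vector is meant to vary along the first axis, reshaping it to `(3, 1)`, as done for `col`, is one way to make the shapes compatible.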