diff --git a/getting_started/04_numpy.ipynb b/getting_started/04_numpy.ipynb index f0a4507d5d2301279f8e08cfee175a50a06eeeed..e877863b0d6105f54087da9a1759aab2a4cc8b31 100644 --- a/getting_started/04_numpy.ipynb +++ b/getting_started/04_numpy.ipynb @@ -14,18 +14,45 @@ "Numpy is not actually part of the standard Python library. But it is a\n", "fundamental part of the Python ecosystem - it forms the basis for many\n", "important Python libraries, and it (along with its partners\n", - "[`scipy`](https://www.scipy.org/) and [`matplotlib`](https://matplotlib.org/))\n", - "is what makes Python a viable alternative to Matlab as a scientific computing\n", - "platform.\n", + "[`scipy`](https://www.scipy.org/), [`matplotlib`](https://matplotlib.org/) and\n", + "[`pandas`](https://pandas.pydata.org/)) is what makes Python an attractive\n", + "alternative to Matlab as a scientific computing platform.\n", + "\n", + "\n", + "The `fslpython` environment currently includes [Numpy\n", + "1.11.1](https://docs.scipy.org/doc/numpy-1.11.0/index.html), which is a little\n", + "out of date, but we will update it for the next release of FSL.\n", "\n", "\n", "## Contents\n", "\n", "\n", "* [The Python list versus the Numpy array](#the-python-list-versus-the-numpy-array)\n", - "* [Importing Numpy](#importing-numpy)\n", "* [Numpy basics](#numpy-basics)\n", - "* [Indexing](#indexing)\n", + " * [Creating arrays](#creating-arrays)\n", + " * [Loading text files](#loading-text-files)\n", + " * [Array properties](#array-properties)\n", + " * [Descriptive statistics](#descriptive-statistics)\n", + " * [Reshaping and rearranging arrays](#reshaping-and-rearranging-arrays)\n", + "* [Operating on arrays](#operating-on-arrays)\n", + " * [Scalar operations](#scalar-operations)\n", + " * [Multi-variate operations](#multi-variate-operations)\n", + " * [Matrix multplication](#matrix-multiplication)\n", + " * [Broadcasting](#broadcasting)\n", + " * [Linear algebra](#linear-algebra)\n", + "* [Array indexing](#array-indexing)\n", + " * [Indexing multi-dimensional arrays](#indexing-multi-dimensional-arrays)\n", + " * [Boolean indexing](#boolean-indexing)\n", + " * [Coordinate array indexing](#coordinate-array-indexing)\n", + "* [Exercises](#exercises)\n", + " * [Load an array from a file and do stuff with it](#load-an-array-from-a-file-and-do-stuff-with-it)\n", + " * [Concatenate affine transforms](#concatenate-affine-transforms)\n", + "\n", + "* [Appendix A: Generating random numbers](#appendix-generating-random-numbers)\n", + "* [Appendix B: Importing Numpy](#appendix-importing-numpy)\n", + "* [Appendix C: Vectors in Numpy](#appendix-vectors-in-numpy)\n", + "\n", + "* [Useful references](#useful-references)\n", "\n", "\n", "<a class=\"anchor\" id=\"the-python-list-versus-the-numpy-array\"></a>\n", @@ -33,9 +60,12 @@ "\n", "\n", "Numpy adds a new data type to the Python language - the `array` (more\n", - "specifically, the `ndarray`). You have already been introduced to the Python\n", - "`list`, which you can easily use to store a handful of numbers (or anything\n", - "else):" + "specifically, the `ndarray`). A Numpy `array` is a N-dimensional array of\n", + "homogeneously-typed numerical data.\n", + "\n", + "\n", + "You have already been introduced to the Python `list`, which you can easily\n", + "use to store a handful of numbers (or anything else):" ] }, { @@ -60,7 +90,9 @@ "metadata": {}, "outputs": [], "source": [ - "xyz_coords = [[-11.4, 1.0, 22.6], [22.7, -32.8, 19.1], [62.8, -18.2, -34.5]]" + "xyz_coords = [[-11.4, 1.0, 22.6],\n", + " [ 22.7, -32.8, 19.1],\n", + " [ 62.8, -18.2, -34.5]]" ] }, { @@ -78,21 +110,20 @@ "computing!\n", "\n", "\n", - "This is a major source of confusion for those poor souls who have spent their\n", - "lives working in Matlab, but have finally seen the light and switched to\n", - "Python. It is very important to be able to distinguish between a Python list,\n", - "and a Numpy array.\n", + "This is a major source of confusion for people who are learning Python, and\n", + "are trying to write efficient code. It is _crucial_ to be able to distinguish\n", + "between a Python list and a Numpy array.\n", "\n", "\n", - "A list in python is akin to a cell array in Matlab - they can store anything,\n", - "but are extremely inefficient, and unwieldy when you have more than a couple\n", - "of dimensions.\n", + "___Python list == Matlab cell array:___ A list in Python is akin to a cell\n", + "array in Matlab - they can store anything, but are extremely inefficient, and\n", + "unwieldy when you have more than a couple of dimensions.\n", "\n", "\n", - "These are in contrast to the Numpy array and Matlab matrix, which are both\n", - "thin wrappers around a contiguous chunk of memory, and which provide\n", - "blazing-fast performance (because behind the scenes in both Numpy and Matlab,\n", - "it's C, C++ and FORTRAN all the way down).\n", + "___Numy array == Matlab matrix:___ These are in contrast to the Numpy array\n", + "and Matlab matrix, which are both thin wrappers around a contiguous chunk of\n", + "memory, and which provide blazing-fast performance (because behind the scenes\n", + "in both Numpy and Matlab, it's C, C++ and FORTRAN all the way down).\n", "\n", "\n", "So you should strongly consider turning those lists into Numpy arrays:" @@ -119,8 +150,7 @@ "source": [ "If you look carefully at the code above, you will notice that we are still\n", "actually using Python lists. We have declared our data sets in exactly the\n", - "same way that we did earlier, by denoting them with square brackets `[` and\n", - "`]`.\n", + "same way as we did earlier, by denoting them with square brackets `[` and `]`.\n", "\n", "\n", "The key difference here is that these lists immediately get converted into\n", @@ -151,16 +181,19 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "I'm emphasising this to help you understand the difference between Python\n", - "lists and Numpy arrays. Apologies if you've already got it.\n", + "Of course, in practice, we would never create a Numpy array in this way - we\n", + "will be loading our data from text or binary files directly into a Numpy\n", + "array, thus completely bypassing the use of Python lists and the costly\n", + "list-to-array conversion. I'm emphasising this to help you understand the\n", + "difference between Python lists and Numpy arrays. Apologies if you've already\n", + "got it, forgiveness please.\n", "\n", "\n", - "<a class=\"anchor\" id=\"importing-numpy\"></a>\n", - "## Importing numpy\n", + "<a class=\"anchor\" id=\"numpy-basics\"></a>\n", + "## Numpy basics\n", "\n", "\n", - "For interactive exploration/experimentation, you might want to import\n", - "Numpy like this:" + "Let's get started." ] }, { @@ -169,15 +202,19 @@ "metadata": {}, "outputs": [], "source": [ - "from numpy import *" + "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "This makes your Python session very similar to Matlab - you can call all\n", - "of the Numpy functions directly:" + "<a class=\"anchor\" id=\"creating-arrays\"></a>\n", + "### Creating arrays\n", + "\n", + "\n", + "Numpy has quite a few functions which behave similarly to their equivalents in\n", + "Matlab:" ] }, { @@ -186,21 +223,56 @@ "metadata": {}, "outputs": [], "source": [ - "e = array([1, 2, 3, 4, 5])\n", - "z = zeros((100, 100))\n", - "d = diag([2, 3, 4, 5])\n", + "print('np.zeros gives us zeros: ', np.zeros(5))\n", + "print('np.ones gives us ones: ', np.ones(5))\n", + "print('np.arange gives us a range: ', np.arange(5))\n", + "print('np.linspace gives us N linearly spaced numbers:', np.linspace(0, 1, 5))\n", + "print('np.random.random gives us random numbers [0-1]:', np.random.random(5))\n", + "print('np.random.randint gives us random integers: ', np.random.randint(1, 10, 5))\n", + "print('np.eye gives us an identity matrix:')\n", + "print(np.eye(4))\n", + "print('np.diag gives us a diagonal matrix:')\n", + "print(np.diag([1, 2, 3, 4]))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> See the [appendix](#appendix-generating-random-numbers) for more information\n", + "> on generating random numbers in Numpy.\n", "\n", - "print(e)\n", + "\n", + "The `zeros` and `ones` functions can also be used to generate N-dimensional\n", + "arrays:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "z = np.zeros((3, 4))\n", + "o = np.ones((2, 10))\n", "print(z)\n", - "print(d)" + "print(o)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "But if you are writing a script or application using Numpy, I implore you to\n", - "Numpy like this instead:" + "> Note that, in a 2D Numpy array, the first axis corresponds to rows, and the\n", + "> second to columns - just like in Matlab.\n", + "\n", + "\n", + "<a class=\"anchor\" id=\"loading-text-files\"></a>\n", + "### Loading text files\n", + "\n", + "\n", + "The `numpy.loadtxt` function is capable of loading numerical data from\n", + "plain-text files. By default it expects space-separated data:" ] }, { @@ -209,15 +281,16 @@ "metadata": {}, "outputs": [], "source": [ - "import numpy as np" + "data = np.loadtxt('04_numpy/space_separated.txt')\n", + "print('data in 04_numpy/space_separated.txt:')\n", + "print(data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "The downside to this is that you will have to prefix all Numpy functions with\n", - "`np.`, like so:" + "But you can also specify the delimiter to expect<sup>1</sup>:" ] }, { @@ -226,30 +299,84 @@ "metadata": {}, "outputs": [], "source": [ - "e = np.array([1, 2, 3, 4, 5])\n", - "z = np.zeros((100, 100))\n", - "d = np.diag([2, 3, 4, 5])\n", + "data = np.loadtxt('04_numpy/comma_separated.txt', delimiter=',')\n", + "print('data in 04_numpy/comma_separated.txt:')\n", + "print(data)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> <sup>1</sup> And many other things such as file headers, footers, comments,\n", + "> and newline characters - see the\n", + "> [docs](https://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html)\n", + "> for more information.\n", "\n", - "print(e)\n", + "\n", + " Of course you can also save data out to a text file just as easily, with\n", + "[`numpy.savetxt`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.savetxt.html):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "data = np.random.randint(1, 10, (10, 10))\n", + "np.savetxt('mydata.txt', data, delimiter=',', fmt='%i')\n", + "\n", + "with open('mydata.txt', 'rt') as f:\n", + " for line in f:\n", + " print(line.strip())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a class=\"anchor\" id=\"array-properties\"></a>\n", + "### Array properties\n", + "\n", + "\n", + "Numpy is a bit different than Matlab in the way that you interact with\n", + "arrays. In Matlab, you would typically pass an array to a built-in function,\n", + "e.g. `size(M)`, `ndims(M)`, etc. In contrast, a Numpy array is a Python\n", + "object which has _attributes_ that contain basic information about the array:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "z = np.zeros((2, 3, 4))\n", "print(z)\n", - "print(d)" + "print('Shape: ', z.shape)\n", + "print('Number of dimensions: ', z.ndim)\n", + "print('Number of elements: ', z.size)\n", + "print('Data type: ', z.dtype)\n", + "print('Number of bytes: ', z.nbytes)\n", + "print('Length of first dimension: ', len(z))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "There is a big upside, however, in that other people who have to read/use your\n", - "code will like you a lot more. This is because it will be easier for them to\n", - "figure out what the hell your code is doing. Namespaces are your friend - use\n", - "them!\n", + "> As depicted above, passing a Numpy array to the built-in `len` function will\n", + "> only give you the length of the first dimension, so you will typically want\n", + "> to avoid using it - use the `size` attribute instead.\n", "\n", "\n", - "<a class=\"anchor\" id=\"numpy-basics\"></a>\n", - "## Numpy basics\n", + "<a class=\"anchor\" id=\"descriptive-statistics\"></a>\n", + "### Descriptive statistics\n", "\n", "\n", - "Let's get started." + "Similarly, a Numpy array has a set of methods<sup>2</sup> which allow you to\n", + "calculate basic descriptive statistics on an array:" ] }, { @@ -258,15 +385,1039 @@ "metadata": {}, "outputs": [], "source": [ - "import numpy as np" + "a = np.random.random(10)\n", + "print('a: ', a)\n", + "print('min: ', a.min())\n", + "print('max: ', a.max())\n", + "print('index of min: ', a.argmin()) # remember that in Python, list indices\n", + "print('index of max: ', a.argmax()) # start from zero - Numpy is the same!\n", + "print('mean: ', a.mean())\n", + "print('variance: ', a.var())\n", + "print('stddev: ', a.std())\n", + "print('sum: ', a.sum())\n", + "print('prod: ', a.prod())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> <sup>2</sup> Python, being an object-oriented language, distinguishes\n", + "> between _functions_ and _methods_. Hopefully we all know what a function is\n", + "> - a _method_ is simply the term used to refer to a function that is\n", + "> associated with a specific object. Similarly, the term _attribute_ is used\n", + "> to refer to some piece of information that is attached to an object, such as\n", + "> `z.shape`, or `z.dtype`.\n", + "\n", + "\n", + "<a class=\"anchor\" id=\"reshaping-and-rearranging-arrays\"></a>\n", + "### Reshaping and rearranging arrays\n", + "\n", + "\n", + "A numpy array can be reshaped very easily, using the `reshape` method." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(16).reshape(4, 4)\n", + "b = a.reshape((2, 8))\n", + "print('a:')\n", + "print(a)\n", + "print('b:')\n", + "print(b)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note that this does not modify the underlying data in any way - the `reshape`\n", + "method returns a _view_ of the same array, just indexed differently:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a[3, 3] = 12345\n", + "b[0, 7] = 54321\n", + "print('a:')\n", + "print(a)\n", + "print('b:')\n", + "print(b)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If you need to create a reshaped copy of an array, use the `np.array`\n", + "function:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(16).reshape((4, 4))\n", + "b = np.array(a.reshape(2, 8))\n", + "a[3, 3] = 12345\n", + "b[0, 7] = 54321\n", + "print('a:')\n", + "print(a)\n", + "print('b:')\n", + "print(b)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `T` attribute is a shortcut to obtain the transpose of an array." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(16).reshape((4, 4))\n", + "print(a)\n", + "print(a.T)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `transpose` method allows you to obtain more complicated rearrangements\n", + "of an N-dimensional array's axes:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(24).reshape((2, 3, 4))\n", + "b = a.transpose((2, 0, 1))\n", + "print('a: ', a.shape)\n", + "print(a)\n", + "print('b:', b.shape)\n", + "print(b)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> Note again that the `T` attribute and `transpose` method return _views_ of\n", + "> your array.\n", + "\n", + "\n", + "Numpy has some useful functions which allow you to concatenate or stack\n", + "multiple arrays into one. The `concatenate` function does what it says on the\n", + "tin:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.zeros(3)\n", + "b = np.ones(3)\n", + "\n", + "print('1D concatenation:', np.concatenate((a, b)))\n", + "\n", + "a = np.zeros((3, 3))\n", + "b = np.ones((3, 3))\n", + "\n", + "print('2D column-wise concatenation:')\n", + "print(np.concatenate((a, b), axis=1))\n", + "\n", + "print('2D row-wise concatenation:')\n", + "\n", + "# The axis parameter defaults to 0,\n", + "# so it is not strictly necessary here.\n", + "print(np.concatenate((a, b), axis=0))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `hstack`, `vstack` and `dstack` functions allow you to concatenate vectors\n", + "or arrays along the first, second, or third dimension respectively:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.zeros(3)\n", + "b = np.ones(3)\n", + "\n", + "print('a: ', a)\n", + "print('b: ', b)\n", + "\n", + "hstacked = np.hstack((a, b))\n", + "vstacked = np.vstack((a, b))\n", + "dstacked = np.dstack((a, b))\n", + "\n", + "print('hstacked: (shape {}):'.format(hstacked.shape))\n", + "print( hstacked)\n", + "print('vstacked: (shape {}):'.format(vstacked.shape))\n", + "print( vstacked)\n", + "print('dstacked: (shape {}):'.format(dstacked.shape))\n", + "print( dstacked)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a class=\"anchor\" id=\"operating-on-arrays\"></a>\n", + "## Operating on arrays\n", + "\n", + "\n", + "If you are coming from Matlab, you should read this section as, while many\n", + "Numpy operations behave similarly to Matlab, there are a few important\n", + "behaviours which differ from what you might expect.\n", + "\n", + "\n", + "<a class=\"anchor\" id=\"scalar-operations\"></a>\n", + "### Scalar operations\n", + "\n", + "\n", + "All of the mathematical operators you know and love can be applied to Numpy\n", + "arrays:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(1, 10).reshape((3, 3))\n", + "print('a:')\n", + "print(a)\n", + "print('a + 2:')\n", + "print( a + 2)\n", + "print('a * 3:')\n", + "print( a * 3)\n", + "print('a % 2:')\n", + "print( a % 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "<a class=\"anchor\" id=\"indexing\"></a>\n", - "## Indexing" + "<a class=\"anchor\" id=\"multi-variate-operations\"></a>\n", + "### Multi-variate operations\n", + "\n", + "\n", + "Many operations in Numpy operate on an element-wise basis. For example:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.ones(5)\n", + "b = np.random.randint(1, 10, 5)\n", + "\n", + "print('a: ', a)\n", + "print('b: ', b)\n", + "print('a + b: ', a + b)\n", + "print('a * b: ', a * b)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This also extends to higher dimensional arrays:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.ones((4, 4))\n", + "b = np.arange(16).reshape((4, 4))\n", + "\n", + "print('a:')\n", + "print(a)\n", + "print('b:')\n", + "print(b)\n", + "\n", + "print('a + b')\n", + "print(a + b)\n", + "print('a * b')\n", + "print(a * b)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Wait ... what's that you say? Oh, I couldn't understand because of all the\n", + "froth coming out of your mouth. I guess you're angry that `a * b` didn't give\n", + "you the matrix product, like it would have in Matlab. Well all I can say is\n", + "that Numpy is not Matlab. Matlab operations are typically consistent with\n", + "linear algebra notation. This is not the case in Numpy. Get over it. Take a\n", + "calmative.\n", + "\n", + "\n", + "<a class=\"anchor\" id=\"matrix-multiplication\"></a>\n", + "### Matrix multiplication\n", + "\n", + "\n", + "When your heart rate has returned to its normal caffeine-induced state, you\n", + "can use the `@` operator or the `dot` method, to perform matrix\n", + "multiplication:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(1, 5).reshape((2, 2))\n", + "b = a.T\n", + "\n", + "print('a:')\n", + "print(a)\n", + "print('b:')\n", + "print(b)\n", + "\n", + "print('a @ b')\n", + "print(a @ b)\n", + "\n", + "print('a.dot(b)')\n", + "print(a.dot(b))\n", + "\n", + "print('b.dot(a)')\n", + "print(b.dot(a))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> The `@` matrix multiplication operator is a relatively recent addition to\n", + "> Python and Numpy, so you might not see it all that often in existing\n", + "> code. But it's here to stay, so if you don't need to worry about\n", + "> backwards-compatibility, go ahead and use it!\n", + "\n", + "\n", + "<a class=\"anchor\" id=\"broadcasting\"></a>\n", + "### Broadcasting\n", + "\n", + "\n", + "One of the coolest features of Numpy is _broadcasting_<sup>3</sup>.\n", + "Broadcasting allows you to perform element-wise operations on arrays which\n", + "have a different shape. For each axis in the two arrays, Numpy will implicitly\n", + "expand the shape of the smaller axis to match the shape of the larger one. You\n", + "never need to use `repmat` ever again!\n", + "\n", + "\n", + "> <sup>3</sup>Mathworks have shamelessly stolen Numpy's broadcasting behaviour\n", + "> and included it in Matlab versions from 2016b onwards, referring to it as\n", + "> _implicit expansion_.\n", + "\n", + "\n", + "Broadcasting allows you to, for example, add the elements of a 1D vector to\n", + "all of the rows or columns of a 2D array:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(9).reshape((3, 3))\n", + "b = np.arange(1, 4)\n", + "print('a:')\n", + "print(a)\n", + "print('b: ', b)\n", + "print('a * b (row-wise broadcasting):')\n", + "print(a * b)\n", + "print('a * b.T (column-wise broadcasting):')\n", + "print(a * b.reshape(-1, 1))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> Here we used a handy feature of the `reshape` method - if you pass `-1` for\n", + "> the size of one dimension, it will automatically determine the size to use\n", + "> for that dimension. Take a look at [the\n", + "> appendix](#appendix-vectors-in-numpy) for a discussion on vectors in Numpy.\n", + "\n", + "\n", + "Here is a more useful example, where we use broadcasting to de-mean the rows\n", + "or columns of an array:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(9).reshape((3, 3))\n", + "print('a:')\n", + "print(a)\n", + "\n", + "print('a (cols demeaned):')\n", + "print(a - a.mean(axis=0))\n", + "\n", + "print('a (rows demeaned):')\n", + "print(a - a.mean(axis=1).reshape(-1, 1))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> As demonstrated above, many functions in Numpy accept an `axis` parameter,\n", + "> allowing you to apply the function along a specific axis. Omitting the\n", + "> `axis` parameter will apply the function to the whole array.\n", + "\n", + "\n", + "Broadcasting can sometimes be confusing, but the rules which Numpy follows to\n", + "align arrays of different sizes, and hence determine how the broadcasting\n", + "should be applied, are pretty straightforward. If something is not working,\n", + "and you can't figure out why refer to the [official\n", + "documentation](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html).\n", + "\n", + "\n", + "<a class=\"anchor\" id=\"linear-algebra\"></a>\n", + "### Linear algebra\n", + "\n", + "\n", + "Numpy is first and foremost a library for general-purpose numerical computing.\n", + "But it does have a range of linear algebra functionality, hidden away in the\n", + "[`numpy.linalg`](https://docs.scipy.org/doc/numpy/reference/routines.linalg.html)\n", + "module. Here are a couple of quick examples:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy.linalg as npla\n", + "\n", + "a = np.array([[1, 2, 3, 4],\n", + " [0, 5, 6, 7],\n", + " [0, 0, 8, 9],\n", + " [0, 0, 0, 10]])\n", + "\n", + "print('a:')\n", + "print(a)\n", + "\n", + "print('inv(a)')\n", + "print(npla.inv(a))\n", + "\n", + "eigvals, eigvecs = npla.eig(a)\n", + "\n", + "print('eigenvalues and vectors of a:')\n", + "for val, vec in zip(eigvals, eigvecs):\n", + " print('{:2.0f} - {}'.format(val, vec))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a class=\"anchor\" id=\"array-indexing\"></a>\n", + "## Array indexing\n", + "\n", + "\n", + "Just like in Matlab, slicing up your arrays is a breeze in Numpy. If you are\n", + "after some light reading, you might want to check out the [comprehensive Numpy\n", + "Indexing\n", + "reference](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html).\n", + "\n", + "\n", + "> As with indexing regular Python lists, array indices start from 0, and end\n", + "> indices (if specified) are exclusive.\n", + "\n", + "\n", + "Let's whet our appetites with some basic 1D array slicing. Numpy supports the\n", + "standard Python\n", + "[__slice__](https://www.pythoncentral.io/how-to-slice-listsarrays-and-tuples-in-python/)\n", + "notation for indexing, where you can specify the start and end indices, and\n", + "the step size, via the `start:stop:step` syntax:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(10)\n", + "\n", + "print('a: ', a)\n", + "print('first element: ', a[0])\n", + "print('first two elements: ', a[:2])\n", + "print('last element: ', a[a.shape[0] - 1])\n", + "print('last element again: ', a[-1])\n", + "print('last two elements: ', a[-2:])\n", + "print('middle four elements: ', a[3:7])\n", + "print('Every second element: ', a[1::2])\n", + "print('Every second element, reversed: ', a[-1::-2])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note that slicing an array in this way returns a _view_, not a copy, into that\n", + "array:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(10)\n", + "print('a:', a)\n", + "every2nd = a[::2]\n", + "print('every 2nd:', every2nd)\n", + "every2nd += 10\n", + "print('a:', a)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a class=\"anchor\" id=\"indexing-multi-dimensional-arrays\"></a>\n", + "### Indexing multi-dimensional arrays\n", + "\n", + "\n", + "Multi-dimensional array indexing works in much the same way as one-dimensional\n", + "indexing but with, well, more dimensions. Use commas within the square\n", + "brackets to separate the slices for each dimension:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(25).reshape((5, 5))\n", + "print('a:')\n", + "print(a)\n", + "print(' First row: ', a[ 0, :])\n", + "print(' Last row: ', a[ -1, :])\n", + "print(' second column: ', a[ :, 1])\n", + "print(' Centre:')\n", + "print(a[1:4, 1:4])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For arrays with more than two dimensions, the ellipsis (`...`) is a handy\n", + "feature - it allows you to specify a slice comprising all elements along\n", + "more than one dimension:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(27).reshape((3, 3, 3))\n", + "print('a:')\n", + "print(a)\n", + "print('All elements at x=0:')\n", + "print(a[0, ...])\n", + "print('All elements at z=2:')\n", + "print(a[..., 2])\n", + "print('All elements at x=0, z=2:')\n", + "print(a[0, ..., 2])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a class=\"anchor\" id=\"boolean-indexing\"></a>\n", + "### Boolean indexing\n", + "\n", + "\n", + "A numpy array can be indexed with a boolean array of the same shape. For\n", + "example:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(10)\n", + "\n", + "print('a: ', a)\n", + "print('a > 5: ', a > 4)\n", + "print('elements in a that are > 5: ', a[a > 5])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In contrast to the simple indexing we have already seen, boolean indexing will\n", + "return a _copy_ of the indexed data, __not__ a view. For example:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(10)\n", + "b = a[a > 5]\n", + "print('a: ', a)\n", + "print('b: ', b)\n", + "print('Setting b[0] to 999')\n", + "b[0] = 999\n", + "print('a: ', a)\n", + "print('b: ', b)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> In general, any 'simple' indexing operation on a Numpy array, where the\n", + "> indexing object comprises integers, slices (using the standard Python\n", + "> `start:stop:step` notation), colons (`:`) and/or ellipses (`...`), will\n", + "> result in a __view__ into the indexed array. Any 'advanced' indexing\n", + "> operation, where the indexing object contains anything else (e.g. boolean or\n", + "> integer arrays, or even python lists), will result in a __copy__ of the\n", + "> data.\n", + "\n", + "\n", + "Logical operators `~` (not), `&` (and) and `|` (or) can be used to manipulate\n", + "and combine boolean Numpy arrays:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(10)\n", + "gt5 = a > 5\n", + "even = a % 2 == 0\n", + "\n", + "print('a: ', a)\n", + "print('elements in a which are > 5: ', a[ gt5])\n", + "print('elements in a which are <= 5: ', a[~gt5])\n", + "print('elements in a which are even: ', a[ even])\n", + "print('elements in a which are odd: ', a[~even])\n", + "print('elements in a which are > 5 and even: ', a[gt5 & even])\n", + "print('elements in a which are > 5 or odd: ', a[gt5 | ~even])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Numpy also has two handy functions, `all` and `any`, which respectively allow\n", + "you to perform boolean `and` and `or` operations along the axes of an array:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(9).reshape((3, 3))\n", + "\n", + "print('a:')\n", + "print(a)\n", + "print('rows with any element divisible by 3: ', np.any(a % 3 == 0, axis=1))\n", + "print('cols with any element divisible by 3: ', np.any(a % 3 == 0, axis=0))\n", + "print('rows with all elements divisible by 3:', np.all(a % 3 == 0, axis=1))\n", + "print('cols with all elements divisible by 3:', np.all(a % 3 == 0, axis=0))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a class=\"anchor\" id=\"coordinate-array-indexing\"></a>\n", + "### Coordinate array indexing\n", + "\n", + "\n", + "You can index a numpy array using another array containing coordinates into\n", + "the first array. As with boolean indexing, this will result in a copy of the\n", + "data. Generally, you will need to have a separate array, or list, of\n", + "coordinates for each axis of the array you wish to index:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(16).reshape((4, 4))\n", + "print('a:')\n", + "print(a)\n", + "\n", + "rows = [0, 2, 3]\n", + "cols = [1, 0, 2]\n", + "indexed = a[rows, cols]\n", + "\n", + "for r, c, v in zip(rows, cols, indexed):\n", + " print('a[{}, {}] = {}'.format(r, c, v))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `numpy.where` function can be combined with boolean arrays to easily\n", + "generate of coordinate arrays for values which meet some condition:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "a = np.arange(16).reshape((4, 4))\n", + "print('a:')\n", + "print(a)\n", + "\n", + "evenrows, evencols = np.where(a % 2 == 0)\n", + "\n", + "print('even row coordinates:', evenx)\n", + "print('even col coordinates:', eveny)\n", + "\n", + "print(a[evenrows, evencols])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a class=\"anchor\" id=\"exercises\"></a>\n", + "## Exercises\n", + "\n", + "\n", + "The challenge for each of these exercises is to complete them in as few lines\n", + "of code as possible!\n", + "\n", + "\n", + "> You can find example answers to the exercises in `04_numpy/.solutions`.\n", + "\n", + "\n", + "<a class=\"anchor\" id=\"load-an-array-from-a-file-and-do-stuff-with-it\"></a>\n", + "### Load an array from a file and do stuff with it\n", + "\n", + "\n", + "Load the file `04_numpy/2d_array.txt`, and calculate and print the mean for\n", + "each column. If your code doesn't work, you might want to __LOOK AT YOUR\n", + "DATA__, as you will have learnt if you have ever attended the FSL course.\n", + "\n", + "\n", + "> Bonus: Find the hidden message (hint:\n", + "> [chr](https://docs.python.org/3/library/functions.html#chr))\n", + "\n", + "\n", + "<a class=\"anchor\" id=\"concatenate-affine-transforms\"></a>\n", + "### Concatenate affine transforms\n", + "\n", + "\n", + "Given all of the files in `04_numpy/xfms/`, create a transformation matrix\n", + "which can transform coordinates from subject 1 functional space to subject 2\n", + "functional space<sup>4</sup>.\n", + "\n", + "Write some code to transform the following coordinates from subject 1\n", + "functional space to subject 2 functional space, to test that your matrix is\n", + "correct:\n", + "\n", + "\n", + "| __Subject 1 coordinates__ | __Subject 2 coordinates__ |\n", + "|:-------------------------:|:-------------------------:|\n", + "| `[ 0.0, 0.0, 0.0]` | `[ 3.21, 4.15, -9.89]` |\n", + "| `[-5.0, -20.0, 10.0]` | `[-0.87, -14.36, -1.13]` |\n", + "| `[20.0, 25.0, 60.0]` | `[29.58, 27.61, 53.37]` |\n", + "\n", + "\n", + "> <sup>4</sup> Even though these are FLIRT transforms, this is just a toy\n", + "> example. Look\n", + "> [here](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FLIRT/FAQ#What_is_the_format_of_the_matrix_used_by_FLIRT.2C_and_how_does_it_relate_to_the_transformation_parameters.3F)\n", + "> and\n", + "> [here](https://git.fmrib.ox.ac.uk/fsl/fslpy/blob/1.6.2/fsl/utils/transform.py#L537)\n", + "> if you actually need to work with FLIRT transforms.\n", + "\n", + "\n", + "\n", + "<a class=\"anchor\" id=\"appendix-generating-random-numbers\"></a>\n", + "## Appendix A: Generating random numbers\n", + "\n", + "\n", + "Numpy's\n", + "[`numpy.random`](https://docs.scipy.org/doc/numpy/reference/routines.random.html)\n", + "module is where you should go if you want to introduce a little randomness\n", + "into your code. You have already seen a couple of functions for generating\n", + "uniformly distributed real or integer data:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy.random as npr\n", + "\n", + "print('Random floats between 0.0 (inclusive) and 1.0 (exclusive):')\n", + "print(npr.random((3, 3)))\n", + "\n", + "print('Random integers in a specified range:')\n", + "print(npr.randint(1, 100, (3, 3)))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can also draw random data from other distributions - here are just a few\n", + "examples:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print('Gaussian (mean: 0, stddev: 1):')\n", + "print(npr.normal(0, 1, (3, 3)))\n", + "\n", + "print('Gamma (shape: 1, scale: 1):')\n", + "print(npr.normal(1, 1, (3, 3)))\n", + "\n", + "print('Chi-square (dof: 10):')\n", + "print(npr.chisquare(10, (3, 3)))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `numpy.random` module also has a couple of other handy functions for\n", + "random sampling of existing data:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "data = np.arange(5)\n", + "\n", + "print('data: ', data)\n", + "print('two random values: ', npr.choice(data, 2))\n", + "print('random permutation: ', npr.permutation(data))\n", + "\n", + "# The numpy.random.shuffle function\n", + "# will shuffle an array *in-place*.\n", + "npr.shuffle(data)\n", + "print('randomly shuffled: ', data)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a class=\"anchor\" id=\"appendix-importing-numpy\"></a>\n", + "## Appendix B: Importing Numpy\n", + "\n", + "\n", + "For interactive exploration/experimentation, you might want to import\n", + "Numpy like this:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from numpy import *" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This makes your Python session very similar to Matlab - you can call all\n", + "of the Numpy functions directly:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "e = array([1, 2, 3, 4, 5])\n", + "z = zeros((100, 100))\n", + "d = diag([2, 3, 4, 5])\n", + "\n", + "print(e)\n", + "print(z)\n", + "print(d)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "But if you are writing a script or application using Numpy, I implore you to\n", + "Numpy (and its commonly used sub-modules) like this instead:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "import numpy.random as npr\n", + "import numpy.linalg as npla" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The downside to this is that you will have to prefix all Numpy functions with\n", + "`np.`, like so:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "e = np.array([1, 2, 3, 4, 5])\n", + "z = np.zeros((100, 100))\n", + "d = np.diag([2, 3, 4, 5])\n", + "r = npr.random(5)\n", + "\n", + "print(e)\n", + "print(z)\n", + "print(d)\n", + "print(r)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "There is a big upside, however, in that other people who have to read/use your\n", + "code will like you a lot more. This is because it will be easier for them to\n", + "figure out what the hell your code is doing. Namespaces are your friend - use\n", + "them!\n", + "\n", + "\n", + "<a class=\"anchor\" id=\"appendix-vectors-in-numpy\"></a>\n", + "## Appendix C: Vectors in Numpy\n", + "\n", + "\n", + "One aspect of Numpy which might trip you up, and which can be quite\n", + "frustrating at times, is that Numpy has no understanding of row or column\n", + "vectors. __An array with only one dimension is neither a row, nor a column\n", + "vector - it is just a 1D array__. If you have a 1D array, and you want to use\n", + "it as a row vector, you need to reshape it to a shape of `(1, N)`. Similarly,\n", + "to use a 1D array as a column vector, you must reshape it to have shape\n", + "`(N, 1)`.\n", + "\n", + "\n", + "In general, when you are mixing 1D arrays with 2- or N-dimensional arrays, you\n", + "need to make sure that your arrays have the correct shape. For example:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "r = np.random.randint(1, 10, 3)\n", + "\n", + "print('r is a row: ', r)\n", + "print('r.T should be a column: ', r.T, ' ... huh?')\n", + "print('Ok, make n a 2D array with one row: ', r.reshape(1, -1))\n", + "print('We could also use the np.atleast_2d function:', np.atleast_2d(r))\n", + "print('Now we can transpose r to get a column:')\n", + "print(np.atleast_2d(r).T)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a class=\"anchor\" id=\"useful-references\"></a>\n", + "## Useful references\n", + "\n", + "\n", + "* [The Numpy manual](https://docs.scipy.org/doc/numpy/)\n", + "* [Linear algebra in `numpy.linalg`](https://docs.scipy.org/doc/numpy/reference/routines.linalg.html)\n", + "* [Broadcasting in Numpy](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)\n", + "* [Indexing in Numpy](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html)\n", + "* [Random sampling in `numpy.random`](https://docs.scipy.org/doc/numpy/reference/routines.random.html)\n", + "* [Python slicing](https://www.pythoncentral.io/how-to-slice-listsarrays-and-tuples-in-python/)" ] } ], diff --git a/getting_started/04_numpy.md b/getting_started/04_numpy.md index d395b6a9deafe0210951534428d7ada01cb80d23..dce3532bb6e37ba55f749cd3f7d25cf97878c7da 100644 --- a/getting_started/04_numpy.md +++ b/getting_started/04_numpy.md @@ -8,18 +8,45 @@ numerical computing library. Numpy is not actually part of the standard Python library. But it is a fundamental part of the Python ecosystem - it forms the basis for many important Python libraries, and it (along with its partners -[`scipy`](https://www.scipy.org/) and [`matplotlib`](https://matplotlib.org/)) -is what makes Python a viable alternative to Matlab as a scientific computing -platform. +[`scipy`](https://www.scipy.org/), [`matplotlib`](https://matplotlib.org/) and +[`pandas`](https://pandas.pydata.org/)) is what makes Python an attractive +alternative to Matlab as a scientific computing platform. + + +The `fslpython` environment currently includes [Numpy +1.11.1](https://docs.scipy.org/doc/numpy-1.11.0/index.html), which is a little +out of date, but we will update it for the next release of FSL. ## Contents * [The Python list versus the Numpy array](#the-python-list-versus-the-numpy-array) -* [Importing Numpy](#importing-numpy) * [Numpy basics](#numpy-basics) -* [Indexing](#indexing) + * [Creating arrays](#creating-arrays) + * [Loading text files](#loading-text-files) + * [Array properties](#array-properties) + * [Descriptive statistics](#descriptive-statistics) + * [Reshaping and rearranging arrays](#reshaping-and-rearranging-arrays) +* [Operating on arrays](#operating-on-arrays) + * [Scalar operations](#scalar-operations) + * [Multi-variate operations](#multi-variate-operations) + * [Matrix multplication](#matrix-multiplication) + * [Broadcasting](#broadcasting) + * [Linear algebra](#linear-algebra) +* [Array indexing](#array-indexing) + * [Indexing multi-dimensional arrays](#indexing-multi-dimensional-arrays) + * [Boolean indexing](#boolean-indexing) + * [Coordinate array indexing](#coordinate-array-indexing) +* [Exercises](#exercises) + * [Load an array from a file and do stuff with it](#load-an-array-from-a-file-and-do-stuff-with-it) + * [Concatenate affine transforms](#concatenate-affine-transforms) + +* [Appendix A: Generating random numbers](#appendix-generating-random-numbers) +* [Appendix B: Importing Numpy](#appendix-importing-numpy) +* [Appendix C: Vectors in Numpy](#appendix-vectors-in-numpy) + +* [Useful references](#useful-references) <a class="anchor" id="the-python-list-versus-the-numpy-array"></a> @@ -27,9 +54,12 @@ platform. Numpy adds a new data type to the Python language - the `array` (more -specifically, the `ndarray`). You have already been introduced to the Python -`list`, which you can easily use to store a handful of numbers (or anything -else): +specifically, the `ndarray`). A Numpy `array` is a N-dimensional array of +homogeneously-typed numerical data. + + +You have already been introduced to the Python `list`, which you can easily +use to store a handful of numbers (or anything else): ``` @@ -41,7 +71,9 @@ You could also emulate a 2D or ND matrix by using lists of lists, for example: ``` -xyz_coords = [[-11.4, 1.0, 22.6], [22.7, -32.8, 19.1], [62.8, -18.2, -34.5]] +xyz_coords = [[-11.4, 1.0, 22.6], + [ 22.7, -32.8, 19.1], + [ 62.8, -18.2, -34.5]] ``` @@ -56,21 +88,20 @@ But __BEWARE!__ A Python list is a terrible data structure for scientific computing! -This is a major source of confusion for those poor souls who have spent their -lives working in Matlab, but have finally seen the light and switched to -Python. It is very important to be able to distinguish between a Python list, -and a Numpy array. +This is a major source of confusion for people who are learning Python, and +are trying to write efficient code. It is _crucial_ to be able to distinguish +between a Python list and a Numpy array. -A list in python is akin to a cell array in Matlab - they can store anything, -but are extremely inefficient, and unwieldy when you have more than a couple -of dimensions. +___Python list == Matlab cell array:___ A list in Python is akin to a cell +array in Matlab - they can store anything, but are extremely inefficient, and +unwieldy when you have more than a couple of dimensions. -These are in contrast to the Numpy array and Matlab matrix, which are both -thin wrappers around a contiguous chunk of memory, and which provide -blazing-fast performance (because behind the scenes in both Numpy and Matlab, -it's C, C++ and FORTRAN all the way down). +___Numy array == Matlab matrix:___ These are in contrast to the Numpy array +and Matlab matrix, which are both thin wrappers around a contiguous chunk of +memory, and which provide blazing-fast performance (because behind the scenes +in both Numpy and Matlab, it's C, C++ and FORTRAN all the way down). So you should strongly consider turning those lists into Numpy arrays: @@ -89,8 +120,7 @@ xyz_coords = np.array([[-11.4, 1.0, 22.6], If you look carefully at the code above, you will notice that we are still actually using Python lists. We have declared our data sets in exactly the -same way that we did earlier, by denoting them with square brackets `[` and -`]`. +same way as we did earlier, by denoting them with square brackets `[` and `]`. The key difference here is that these lists immediately get converted into @@ -113,12 +143,826 @@ xyz_coords = np.array(xyz_coords) ``` -I'm emphasising this to help you understand the difference between Python -lists and Numpy arrays. Apologies if you've already got it. +Of course, in practice, we would never create a Numpy array in this way - we +will be loading our data from text or binary files directly into a Numpy +array, thus completely bypassing the use of Python lists and the costly +list-to-array conversion. I'm emphasising this to help you understand the +difference between Python lists and Numpy arrays. Apologies if you've already +got it, forgiveness please. + + +<a class="anchor" id="numpy-basics"></a> +## Numpy basics + + +Let's get started. + + +``` +import numpy as np +``` + + +<a class="anchor" id="creating-arrays"></a> +### Creating arrays + + +Numpy has quite a few functions which behave similarly to their equivalents in +Matlab: + + +``` +print('np.zeros gives us zeros: ', np.zeros(5)) +print('np.ones gives us ones: ', np.ones(5)) +print('np.arange gives us a range: ', np.arange(5)) +print('np.linspace gives us N linearly spaced numbers:', np.linspace(0, 1, 5)) +print('np.random.random gives us random numbers [0-1]:', np.random.random(5)) +print('np.random.randint gives us random integers: ', np.random.randint(1, 10, 5)) +print('np.eye gives us an identity matrix:') +print(np.eye(4)) +print('np.diag gives us a diagonal matrix:') +print(np.diag([1, 2, 3, 4])) +``` + + +> See the [appendix](#appendix-generating-random-numbers) for more information +> on generating random numbers in Numpy. + + +The `zeros` and `ones` functions can also be used to generate N-dimensional +arrays: + + +``` +z = np.zeros((3, 4)) +o = np.ones((2, 10)) +print(z) +print(o) +``` + + +> Note that, in a 2D Numpy array, the first axis corresponds to rows, and the +> second to columns - just like in Matlab. + + +<a class="anchor" id="loading-text-files"></a> +### Loading text files + + +The `numpy.loadtxt` function is capable of loading numerical data from +plain-text files. By default it expects space-separated data: + + +``` +data = np.loadtxt('04_numpy/space_separated.txt') +print('data in 04_numpy/space_separated.txt:') +print(data) +``` + + +But you can also specify the delimiter to expect<sup>1</sup>: + + +``` +data = np.loadtxt('04_numpy/comma_separated.txt', delimiter=',') +print('data in 04_numpy/comma_separated.txt:') +print(data) +``` + + +> <sup>1</sup> And many other things such as file headers, footers, comments, +> and newline characters - see the +> [docs](https://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html) +> for more information. + + + Of course you can also save data out to a text file just as easily, with +[`numpy.savetxt`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.savetxt.html): + + +``` +data = np.random.randint(1, 10, (10, 10)) +np.savetxt('mydata.txt', data, delimiter=',', fmt='%i') + +with open('mydata.txt', 'rt') as f: + for line in f: + print(line.strip()) +``` + + + +<a class="anchor" id="array-properties"></a> +### Array properties + + +Numpy is a bit different than Matlab in the way that you interact with +arrays. In Matlab, you would typically pass an array to a built-in function, +e.g. `size(M)`, `ndims(M)`, etc. In contrast, a Numpy array is a Python +object which has _attributes_ that contain basic information about the array: + + +``` +z = np.zeros((2, 3, 4)) +print(z) +print('Shape: ', z.shape) +print('Number of dimensions: ', z.ndim) +print('Number of elements: ', z.size) +print('Data type: ', z.dtype) +print('Number of bytes: ', z.nbytes) +print('Length of first dimension: ', len(z)) +``` + + +> As depicted above, passing a Numpy array to the built-in `len` function will +> only give you the length of the first dimension, so you will typically want +> to avoid using it - use the `size` attribute instead. + + +<a class="anchor" id="descriptive-statistics"></a> +### Descriptive statistics + + +Similarly, a Numpy array has a set of methods<sup>2</sup> which allow you to +calculate basic descriptive statistics on an array: + + +``` +a = np.random.random(10) +print('a: ', a) +print('min: ', a.min()) +print('max: ', a.max()) +print('index of min: ', a.argmin()) # remember that in Python, list indices +print('index of max: ', a.argmax()) # start from zero - Numpy is the same! +print('mean: ', a.mean()) +print('variance: ', a.var()) +print('stddev: ', a.std()) +print('sum: ', a.sum()) +print('prod: ', a.prod()) +``` + + +> <sup>2</sup> Python, being an object-oriented language, distinguishes +> between _functions_ and _methods_. Hopefully we all know what a function is +> - a _method_ is simply the term used to refer to a function that is +> associated with a specific object. Similarly, the term _attribute_ is used +> to refer to some piece of information that is attached to an object, such as +> `z.shape`, or `z.dtype`. + + +<a class="anchor" id="reshaping-and-rearranging-arrays"></a> +### Reshaping and rearranging arrays + + +A numpy array can be reshaped very easily, using the `reshape` method. + +``` +a = np.arange(16).reshape(4, 4) +b = a.reshape((2, 8)) +print('a:') +print(a) +print('b:') +print(b) +``` + + +Note that this does not modify the underlying data in any way - the `reshape` +method returns a _view_ of the same array, just indexed differently: + + +``` +a[3, 3] = 12345 +b[0, 7] = 54321 +print('a:') +print(a) +print('b:') +print(b) +``` + + +If you need to create a reshaped copy of an array, use the `np.array` +function: + + +``` +a = np.arange(16).reshape((4, 4)) +b = np.array(a.reshape(2, 8)) +a[3, 3] = 12345 +b[0, 7] = 54321 +print('a:') +print(a) +print('b:') +print(b) +``` + + +The `T` attribute is a shortcut to obtain the transpose of an array. + + +``` +a = np.arange(16).reshape((4, 4)) +print(a) +print(a.T) +``` + + +The `transpose` method allows you to obtain more complicated rearrangements +of an N-dimensional array's axes: + + +``` +a = np.arange(24).reshape((2, 3, 4)) +b = a.transpose((2, 0, 1)) +print('a: ', a.shape) +print(a) +print('b:', b.shape) +print(b) +``` + + +> Note again that the `T` attribute and `transpose` method return _views_ of +> your array. + + +Numpy has some useful functions which allow you to concatenate or stack +multiple arrays into one. The `concatenate` function does what it says on the +tin: + + +``` +a = np.zeros(3) +b = np.ones(3) + +print('1D concatenation:', np.concatenate((a, b))) + +a = np.zeros((3, 3)) +b = np.ones((3, 3)) + +print('2D column-wise concatenation:') +print(np.concatenate((a, b), axis=1)) + +print('2D row-wise concatenation:') + +# The axis parameter defaults to 0, +# so it is not strictly necessary here. +print(np.concatenate((a, b), axis=0)) +``` + + +The `hstack`, `vstack` and `dstack` functions allow you to concatenate vectors +or arrays along the first, second, or third dimension respectively: + +``` +a = np.zeros(3) +b = np.ones(3) + +print('a: ', a) +print('b: ', b) + +hstacked = np.hstack((a, b)) +vstacked = np.vstack((a, b)) +dstacked = np.dstack((a, b)) + +print('hstacked: (shape {}):'.format(hstacked.shape)) +print( hstacked) +print('vstacked: (shape {}):'.format(vstacked.shape)) +print( vstacked) +print('dstacked: (shape {}):'.format(dstacked.shape)) +print( dstacked) +``` + + +<a class="anchor" id="operating-on-arrays"></a> +## Operating on arrays + + +If you are coming from Matlab, you should read this section as, while many +Numpy operations behave similarly to Matlab, there are a few important +behaviours which differ from what you might expect. + + +<a class="anchor" id="scalar-operations"></a> +### Scalar operations + + +All of the mathematical operators you know and love can be applied to Numpy +arrays: + + +``` +a = np.arange(1, 10).reshape((3, 3)) +print('a:') +print(a) +print('a + 2:') +print( a + 2) +print('a * 3:') +print( a * 3) +print('a % 2:') +print( a % 2) +``` + + +<a class="anchor" id="multi-variate-operations"></a> +### Multi-variate operations + + +Many operations in Numpy operate on an element-wise basis. For example: + + +``` +a = np.ones(5) +b = np.random.randint(1, 10, 5) + +print('a: ', a) +print('b: ', b) +print('a + b: ', a + b) +print('a * b: ', a * b) +``` + + +This also extends to higher dimensional arrays: + + +``` +a = np.ones((4, 4)) +b = np.arange(16).reshape((4, 4)) + +print('a:') +print(a) +print('b:') +print(b) + +print('a + b') +print(a + b) +print('a * b') +print(a * b) +``` + + +Wait ... what's that you say? Oh, I couldn't understand because of all the +froth coming out of your mouth. I guess you're angry that `a * b` didn't give +you the matrix product, like it would have in Matlab. Well all I can say is +that Numpy is not Matlab. Matlab operations are typically consistent with +linear algebra notation. This is not the case in Numpy. Get over it. Take a +calmative. + + +<a class="anchor" id="matrix-multiplication"></a> +### Matrix multiplication + + +When your heart rate has returned to its normal caffeine-induced state, you +can use the `@` operator or the `dot` method, to perform matrix +multiplication: + + +``` +a = np.arange(1, 5).reshape((2, 2)) +b = a.T + +print('a:') +print(a) +print('b:') +print(b) + +print('a @ b') +print(a @ b) + +print('a.dot(b)') +print(a.dot(b)) + +print('b.dot(a)') +print(b.dot(a)) +``` + + +> The `@` matrix multiplication operator is a relatively recent addition to +> Python and Numpy, so you might not see it all that often in existing +> code. But it's here to stay, so if you don't need to worry about +> backwards-compatibility, go ahead and use it! + + +<a class="anchor" id="broadcasting"></a> +### Broadcasting + + +One of the coolest features of Numpy is _broadcasting_<sup>3</sup>. +Broadcasting allows you to perform element-wise operations on arrays which +have a different shape. For each axis in the two arrays, Numpy will implicitly +expand the shape of the smaller axis to match the shape of the larger one. You +never need to use `repmat` ever again! + + +> <sup>3</sup>Mathworks have shamelessly stolen Numpy's broadcasting behaviour +> and included it in Matlab versions from 2016b onwards, referring to it as +> _implicit expansion_. + + +Broadcasting allows you to, for example, add the elements of a 1D vector to +all of the rows or columns of a 2D array: + + +``` +a = np.arange(9).reshape((3, 3)) +b = np.arange(1, 4) +print('a:') +print(a) +print('b: ', b) +print('a * b (row-wise broadcasting):') +print(a * b) +print('a * b.T (column-wise broadcasting):') +print(a * b.reshape(-1, 1)) +``` + + +> Here we used a handy feature of the `reshape` method - if you pass `-1` for +> the size of one dimension, it will automatically determine the size to use +> for that dimension. Take a look at [the +> appendix](#appendix-vectors-in-numpy) for a discussion on vectors in Numpy. + + +Here is a more useful example, where we use broadcasting to de-mean the rows +or columns of an array: + + +``` +a = np.arange(9).reshape((3, 3)) +print('a:') +print(a) + +print('a (cols demeaned):') +print(a - a.mean(axis=0)) + +print('a (rows demeaned):') +print(a - a.mean(axis=1).reshape(-1, 1)) +``` + + +> As demonstrated above, many functions in Numpy accept an `axis` parameter, +> allowing you to apply the function along a specific axis. Omitting the +> `axis` parameter will apply the function to the whole array. + +Broadcasting can sometimes be confusing, but the rules which Numpy follows to +align arrays of different sizes, and hence determine how the broadcasting +should be applied, are pretty straightforward. If something is not working, +and you can't figure out why refer to the [official +documentation](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html). -<a class="anchor" id="importing-numpy"></a> -## Importing numpy + +<a class="anchor" id="linear-algebra"></a> +### Linear algebra + + +Numpy is first and foremost a library for general-purpose numerical computing. +But it does have a range of linear algebra functionality, hidden away in the +[`numpy.linalg`](https://docs.scipy.org/doc/numpy/reference/routines.linalg.html) +module. Here are a couple of quick examples: + + +``` +import numpy.linalg as npla + +a = np.array([[1, 2, 3, 4], + [0, 5, 6, 7], + [0, 0, 8, 9], + [0, 0, 0, 10]]) + +print('a:') +print(a) + +print('inv(a)') +print(npla.inv(a)) + +eigvals, eigvecs = npla.eig(a) + +print('eigenvalues and vectors of a:') +for val, vec in zip(eigvals, eigvecs): + print('{:2.0f} - {}'.format(val, vec)) +``` + + +<a class="anchor" id="array-indexing"></a> +## Array indexing + + +Just like in Matlab, slicing up your arrays is a breeze in Numpy. If you are +after some light reading, you might want to check out the [comprehensive Numpy +Indexing +reference](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html). + + +> As with indexing regular Python lists, array indices start from 0, and end +> indices (if specified) are exclusive. + + +Let's whet our appetites with some basic 1D array slicing. Numpy supports the +standard Python +[__slice__](https://www.pythoncentral.io/how-to-slice-listsarrays-and-tuples-in-python/) +notation for indexing, where you can specify the start and end indices, and +the step size, via the `start:stop:step` syntax: + + +``` +a = np.arange(10) + +print('a: ', a) +print('first element: ', a[0]) +print('first two elements: ', a[:2]) +print('last element: ', a[a.shape[0] - 1]) +print('last element again: ', a[-1]) +print('last two elements: ', a[-2:]) +print('middle four elements: ', a[3:7]) +print('Every second element: ', a[1::2]) +print('Every second element, reversed: ', a[-1::-2]) +``` + + +Note that slicing an array in this way returns a _view_, not a copy, into that +array: + + +``` +a = np.arange(10) +print('a:', a) +every2nd = a[::2] +print('every 2nd:', every2nd) +every2nd += 10 +print('a:', a) +``` + + +<a class="anchor" id="indexing-multi-dimensional-arrays"></a> +### Indexing multi-dimensional arrays + + +Multi-dimensional array indexing works in much the same way as one-dimensional +indexing but with, well, more dimensions. Use commas within the square +brackets to separate the slices for each dimension: + + +``` +a = np.arange(25).reshape((5, 5)) +print('a:') +print(a) +print(' First row: ', a[ 0, :]) +print(' Last row: ', a[ -1, :]) +print(' second column: ', a[ :, 1]) +print(' Centre:') +print(a[1:4, 1:4]) +``` + + +For arrays with more than two dimensions, the ellipsis (`...`) is a handy +feature - it allows you to specify a slice comprising all elements along +more than one dimension: + + +``` +a = np.arange(27).reshape((3, 3, 3)) +print('a:') +print(a) +print('All elements at x=0:') +print(a[0, ...]) +print('All elements at z=2:') +print(a[..., 2]) +print('All elements at x=0, z=2:') +print(a[0, ..., 2]) +``` + + +<a class="anchor" id="boolean-indexing"></a> +### Boolean indexing + + +A numpy array can be indexed with a boolean array of the same shape. For +example: + + +``` +a = np.arange(10) + +print('a: ', a) +print('a > 5: ', a > 4) +print('elements in a that are > 5: ', a[a > 5]) +``` + + +In contrast to the simple indexing we have already seen, boolean indexing will +return a _copy_ of the indexed data, __not__ a view. For example: + + +``` +a = np.arange(10) +b = a[a > 5] +print('a: ', a) +print('b: ', b) +print('Setting b[0] to 999') +b[0] = 999 +print('a: ', a) +print('b: ', b) +``` + + +> In general, any 'simple' indexing operation on a Numpy array, where the +> indexing object comprises integers, slices (using the standard Python +> `start:stop:step` notation), colons (`:`) and/or ellipses (`...`), will +> result in a __view__ into the indexed array. Any 'advanced' indexing +> operation, where the indexing object contains anything else (e.g. boolean or +> integer arrays, or even python lists), will result in a __copy__ of the +> data. + + +Logical operators `~` (not), `&` (and) and `|` (or) can be used to manipulate +and combine boolean Numpy arrays: + + +``` +a = np.arange(10) +gt5 = a > 5 +even = a % 2 == 0 + +print('a: ', a) +print('elements in a which are > 5: ', a[ gt5]) +print('elements in a which are <= 5: ', a[~gt5]) +print('elements in a which are even: ', a[ even]) +print('elements in a which are odd: ', a[~even]) +print('elements in a which are > 5 and even: ', a[gt5 & even]) +print('elements in a which are > 5 or odd: ', a[gt5 | ~even]) +``` + + +Numpy also has two handy functions, `all` and `any`, which respectively allow +you to perform boolean `and` and `or` operations along the axes of an array: + + +``` +a = np.arange(9).reshape((3, 3)) + +print('a:') +print(a) +print('rows with any element divisible by 3: ', np.any(a % 3 == 0, axis=1)) +print('cols with any element divisible by 3: ', np.any(a % 3 == 0, axis=0)) +print('rows with all elements divisible by 3:', np.all(a % 3 == 0, axis=1)) +print('cols with all elements divisible by 3:', np.all(a % 3 == 0, axis=0)) +``` + + + + +<a class="anchor" id="coordinate-array-indexing"></a> +### Coordinate array indexing + + +You can index a numpy array using another array containing coordinates into +the first array. As with boolean indexing, this will result in a copy of the +data. Generally, you will need to have a separate array, or list, of +coordinates for each axis of the array you wish to index: + + +``` +a = np.arange(16).reshape((4, 4)) +print('a:') +print(a) + +rows = [0, 2, 3] +cols = [1, 0, 2] +indexed = a[rows, cols] + +for r, c, v in zip(rows, cols, indexed): + print('a[{}, {}] = {}'.format(r, c, v)) +``` + + +The `numpy.where` function can be combined with boolean arrays to easily +generate of coordinate arrays for values which meet some condition: + + +``` +a = np.arange(16).reshape((4, 4)) +print('a:') +print(a) + +evenrows, evencols = np.where(a % 2 == 0) + +print('even row coordinates:', evenx) +print('even col coordinates:', eveny) + +print(a[evenrows, evencols]) +``` + + + +<a class="anchor" id="exercises"></a> +## Exercises + + +The challenge for each of these exercises is to complete them in as few lines +of code as possible! + + +> You can find example answers to the exercises in `04_numpy/.solutions`. + + +<a class="anchor" id="load-an-array-from-a-file-and-do-stuff-with-it"></a> +### Load an array from a file and do stuff with it + + +Load the file `04_numpy/2d_array.txt`, and calculate and print the mean for +each column. If your code doesn't work, you might want to __LOOK AT YOUR +DATA__, as you will have learnt if you have ever attended the FSL course. + + +> Bonus: Find the hidden message (hint: +> [chr](https://docs.python.org/3/library/functions.html#chr)) + + +<a class="anchor" id="concatenate-affine-transforms"></a> +### Concatenate affine transforms + + +Given all of the files in `04_numpy/xfms/`, create a transformation matrix +which can transform coordinates from subject 1 functional space to subject 2 +functional space<sup>4</sup>. + +Write some code to transform the following coordinates from subject 1 +functional space to subject 2 functional space, to test that your matrix is +correct: + + +| __Subject 1 coordinates__ | __Subject 2 coordinates__ | +|:-------------------------:|:-------------------------:| +| `[ 0.0, 0.0, 0.0]` | `[ 3.21, 4.15, -9.89]` | +| `[-5.0, -20.0, 10.0]` | `[-0.87, -14.36, -1.13]` | +| `[20.0, 25.0, 60.0]` | `[29.58, 27.61, 53.37]` | + + +> <sup>4</sup> Even though these are FLIRT transforms, this is just a toy +> example. Look +> [here](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FLIRT/FAQ#What_is_the_format_of_the_matrix_used_by_FLIRT.2C_and_how_does_it_relate_to_the_transformation_parameters.3F) +> and +> [here](https://git.fmrib.ox.ac.uk/fsl/fslpy/blob/1.6.2/fsl/utils/transform.py#L537) +> if you actually need to work with FLIRT transforms. + + + +<a class="anchor" id="appendix-generating-random-numbers"></a> +## Appendix A: Generating random numbers + + +Numpy's +[`numpy.random`](https://docs.scipy.org/doc/numpy/reference/routines.random.html) +module is where you should go if you want to introduce a little randomness +into your code. You have already seen a couple of functions for generating +uniformly distributed real or integer data: + + +``` +import numpy.random as npr + +print('Random floats between 0.0 (inclusive) and 1.0 (exclusive):') +print(npr.random((3, 3))) + +print('Random integers in a specified range:') +print(npr.randint(1, 100, (3, 3))) +``` + + +You can also draw random data from other distributions - here are just a few +examples: + + +``` +print('Gaussian (mean: 0, stddev: 1):') +print(npr.normal(0, 1, (3, 3))) + +print('Gamma (shape: 1, scale: 1):') +print(npr.normal(1, 1, (3, 3))) + +print('Chi-square (dof: 10):') +print(npr.chisquare(10, (3, 3))) +``` + + +The `numpy.random` module also has a couple of other handy functions for +random sampling of existing data: + + +``` +data = np.arange(5) + +print('data: ', data) +print('two random values: ', npr.choice(data, 2)) +print('random permutation: ', npr.permutation(data)) + +# The numpy.random.shuffle function +# will shuffle an array *in-place*. +npr.shuffle(data) +print('randomly shuffled: ', data) +``` + + +<a class="anchor" id="appendix-importing-numpy"></a> +## Appendix B: Importing Numpy For interactive exploration/experimentation, you might want to import @@ -146,11 +990,13 @@ print(d) But if you are writing a script or application using Numpy, I implore you to -Numpy like this instead: +Numpy (and its commonly used sub-modules) like this instead: ``` -import numpy as np +import numpy as np +import numpy.random as npr +import numpy.linalg as npla ``` @@ -162,10 +1008,12 @@ The downside to this is that you will have to prefix all Numpy functions with e = np.array([1, 2, 3, 4, 5]) z = np.zeros((100, 100)) d = np.diag([2, 3, 4, 5]) +r = npr.random(5) print(e) print(z) print(d) +print(r) ``` @@ -175,17 +1023,42 @@ figure out what the hell your code is doing. Namespaces are your friend - use them! -<a class="anchor" id="numpy-basics"></a> -## Numpy basics +<a class="anchor" id="appendix-vectors-in-numpy"></a> +## Appendix C: Vectors in Numpy -Let's get started. +One aspect of Numpy which might trip you up, and which can be quite +frustrating at times, is that Numpy has no understanding of row or column +vectors. __An array with only one dimension is neither a row, nor a column +vector - it is just a 1D array__. If you have a 1D array, and you want to use +it as a row vector, you need to reshape it to a shape of `(1, N)`. Similarly, +to use a 1D array as a column vector, you must reshape it to have shape +`(N, 1)`. + + +In general, when you are mixing 1D arrays with 2- or N-dimensional arrays, you +need to make sure that your arrays have the correct shape. For example: ``` -import numpy as np +r = np.random.randint(1, 10, 3) + +print('r is a row: ', r) +print('r.T should be a column: ', r.T, ' ... huh?') +print('Ok, make n a 2D array with one row: ', r.reshape(1, -1)) +print('We could also use the np.atleast_2d function:', np.atleast_2d(r)) +print('Now we can transpose r to get a column:') +print(np.atleast_2d(r).T) ``` -<a class="anchor" id="indexing"></a> -## Indexing +<a class="anchor" id="useful-references"></a> +## Useful references + + +* [The Numpy manual](https://docs.scipy.org/doc/numpy/) +* [Linear algebra in `numpy.linalg`](https://docs.scipy.org/doc/numpy/reference/routines.linalg.html) +* [Broadcasting in Numpy](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html) +* [Indexing in Numpy](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html) +* [Random sampling in `numpy.random`](https://docs.scipy.org/doc/numpy/reference/routines.random.html) +* [Python slicing](https://www.pythoncentral.io/how-to-slice-listsarrays-and-tuples-in-python/) \ No newline at end of file diff --git a/getting_started/04_numpy/.solutions/calc_column_mean.py b/getting_started/04_numpy/.solutions/calc_column_mean.py new file mode 100644 index 0000000000000000000000000000000000000000..91abf3cdc499ce62c319b95326a35255a4f2a653 --- /dev/null +++ b/getting_started/04_numpy/.solutions/calc_column_mean.py @@ -0,0 +1,10 @@ +#!/usr/bin/env python +import numpy as np + +data = np.loadtxt('04_numpy/2d_array.txt', comments='%') +colmeans = data.mean(axis=0) +msg = ''.join([chr(int(round(m))) for m in colmeans]) + +print('Column means') +print('\n'.join(['{}: {}'.format(i, m) for i, m in enumerate(colmeans)])) +print('Secret message:', msg) diff --git a/getting_started/04_numpy/.solutions/concat_affines.py b/getting_started/04_numpy/.solutions/concat_affines.py new file mode 100644 index 0000000000000000000000000000000000000000..a097e39545fb957f74766f63eaa5113f11cbabc3 --- /dev/null +++ b/getting_started/04_numpy/.solutions/concat_affines.py @@ -0,0 +1,44 @@ +#!/usr/bin/env python + +import numpy as np +import numpy.linalg as npla + +def concat(*xforms): + """Combines the given matrices (returns the dot product).""" + + result = xforms[0] + + for i in range(1, len(xforms)): + result = np.dot(result, xforms[i]) + + return result + + +def transform(xform, coord): + """Transform the given coordinates with the given affine. """ + return np.dot(xform[:3, :3], coord) + xform[:3, 3] + + +s1func2struc = np.loadtxt('04_numpy/xfms/subj1/example_func2highres.mat') +s1struc2std = np.loadtxt('04_numpy/xfms/subj1/highres2standard.mat') +s2func2struc = np.loadtxt('04_numpy/xfms/subj2/example_func2highres.mat') +s2struc2std = np.loadtxt('04_numpy/xfms/subj2/highres2standard.mat') + +s1func2std = concat(s1struc2std, s1func2struc) +s2func2std = concat(s2struc2std, s2func2struc) +s2std2func = npla.inv(s2func2std) +s1func2s2func = concat(s2std2func, s1func2std) + +print('Subject 1 functional -> Subject 2 functional affine:') +print(s1func2s2func) + +testcoords = np.array([[ 0, 0, 0], + [-5, -20, 10], + [20, 25, 60]], dtype=np.float32) + + +for c in testcoords: + xc = transform(s1func2s2func, c) + c = '{:6.2f} {:6.2f} {:6.2f}'.format(*c) + xc = '{:6.2f} {:6.2f} {:6.2f}'.format(*xc) + print('Transform: [{}] -> [{}])'.format(c, xc)) diff --git a/getting_started/04_numpy/04_numpy.ipynb b/getting_started/04_numpy/04_numpy.ipynb new file mode 100644 index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 diff --git a/getting_started/04_numpy/2d_array.txt b/getting_started/04_numpy/2d_array.txt new file mode 100644 index 0000000000000000000000000000000000000000..923f35ecad43a521cf1de5bbbe73e2fbbe265b4d --- /dev/null +++ b/getting_started/04_numpy/2d_array.txt @@ -0,0 +1,13 @@ +% The purpose of this comment is purely +% to make life difficult for you. Sorry +% about that. I'm not really sorry. +128.3 100.8 67.2 120.1 150.2 53.0 64.2 139.3 46.7 118.1 125.8 153.1 83.2 115.9 3.4 126.3 92.2 104.7 131.2 29.3 + 89.3 118.8 119.2 80.1 121.2 35.0 66.2 153.3 43.7 102.1 147.8 160.1 94.2 140.9 70.4 124.3 124.2 93.7 100.2 32.3 + 91.3 146.8 137.2 129.1 157.2 -22.0 75.2 90.3 33.7 86.1 74.8 90.1 114.2 81.9 72.4 135.3 85.2 102.7 58.2 55.3 + 75.3 87.8 84.2 112.1 131.2 57.0 150.2 73.3 41.7 110.1 78.8 102.1 116.2 134.9 34.4 74.3 150.2 52.7 89.2 21.3 + 79.3 83.8 78.2 103.1 96.2 26.0 151.2 122.3 46.7 113.1 143.8 80.1 61.2 136.9 8.4 94.3 76.2 123.7 150.2 10.3 + 93.3 157.8 154.2 79.1 155.2 8.0 63.2 120.3 12.7 130.1 105.8 111.1 66.2 99.9 54.4 120.3 113.2 128.7 110.2 46.3 +111.3 126.8 65.2 90.1 101.2 36.0 116.2 139.3 36.7 91.1 88.8 94.1 90.2 92.9 -0.6 46.3 80.2 105.7 88.2 -11.7 +149.3 119.8 134.2 124.1 89.2 21.0 133.2 113.3 28.7 139.1 136.8 82.1 141.2 90.9 11.4 97.3 109.2 137.7 120.2 32.3 +141.3 149.8 115.2 115.1 105.2 51.0 77.2 130.3 -4.3 124.1 141.8 162.1 95.2 130.9 -10.6 114.3 132.2 134.7 79.2 69.3 +141.3 77.8 135.2 167.1 103.2 55.0 153.2 68.3 33.7 136.1 125.8 85.1 148.2 114.9 76.4 57.3 147.2 125.7 153.2 45.3 diff --git a/getting_started/04_numpy/comma_separated.txt b/getting_started/04_numpy/comma_separated.txt new file mode 100644 index 0000000000000000000000000000000000000000..9aa1ff615e7cce6765f308a9e2ae05fcab1a2fd9 --- /dev/null +++ b/getting_started/04_numpy/comma_separated.txt @@ -0,0 +1,10 @@ +13,7,3,18,15,3,12,11,9,9 +13,1,7,17,16,13,18,9,18,6 +9,19,16,3,18,3,19,12,9,6 +2,11,6,12,2,11,15,9,3,9 +2,12,1,7,4,3,6,6,2,4 +10,8,14,1,17,19,8,19,2,9 +18,14,2,11,17,14,6,16,14,18 +6,8,13,16,11,17,16,5,16,15 +6,5,4,18,6,14,19,8,4,15 +15,17,12,10,17,12,5,9,18,6 diff --git a/getting_started/04_numpy/space_separated.txt b/getting_started/04_numpy/space_separated.txt new file mode 100644 index 0000000000000000000000000000000000000000..f5fce0dafc3a5ac00aaecf1544e197a2f863eb1f --- /dev/null +++ b/getting_started/04_numpy/space_separated.txt @@ -0,0 +1,20 @@ +1 3 +14 3 +4 7 +12 5 +15 5 +14 4 +15 19 +7 17 +7 15 +16 3 +15 14 +17 9 +7 8 +4 5 +15 4 +11 5 +17 10 +3 19 +14 4 +18 3 diff --git a/getting_started/04_numpy/xfms/subj1/example_func2highres.mat b/getting_started/04_numpy/xfms/subj1/example_func2highres.mat new file mode 100755 index 0000000000000000000000000000000000000000..c5f9b625d09d921e180c3da602350a4673ebd3ac --- /dev/null +++ b/getting_started/04_numpy/xfms/subj1/example_func2highres.mat @@ -0,0 +1,4 @@ +1.027343449 -0.01854307437 0.02702291658 -3.429567172 +0.01572791398 0.9487250481 -0.30721057 17.65058627 +-0.02892982284 0.2868726435 1.077910798 -3.247641029 +0 0 0 1 diff --git a/getting_started/04_numpy/xfms/subj1/highres2standard.mat b/getting_started/04_numpy/xfms/subj1/highres2standard.mat new file mode 100755 index 0000000000000000000000000000000000000000..e11d282590763d2f8d7bc9e6e402694410357dbd --- /dev/null +++ b/getting_started/04_numpy/xfms/subj1/highres2standard.mat @@ -0,0 +1,4 @@ +0.9998006214 -0.007506170473 -0.00206080048 -37.86830353 +0.01249141095 0.863260387 0.4957794812 -33.77969415 +0.03530937349 -0.4932384691 1.042606617 27.59239864 +0 0 0 1 diff --git a/getting_started/04_numpy/xfms/subj2/example_func2highres.mat b/getting_started/04_numpy/xfms/subj2/example_func2highres.mat new file mode 100755 index 0000000000000000000000000000000000000000..101bf79dbf8f759df4a2d1275040fcb97ce30de4 --- /dev/null +++ b/getting_started/04_numpy/xfms/subj2/example_func2highres.mat @@ -0,0 +1,4 @@ +1.0271367 -0.01315117882 -0.005127727479 -3.084298759 +0.002836833979 0.9879611849 -0.002646076731 -0.4778179727 +0.004268282598 -0.0002074879811 1.067805712 25.30182119 +0 0 0 1 diff --git a/getting_started/04_numpy/xfms/subj2/highres2standard.mat b/getting_started/04_numpy/xfms/subj2/highres2standard.mat new file mode 100755 index 0000000000000000000000000000000000000000..3f7c3cdc50290c55a747ac57c907132cd17f838a --- /dev/null +++ b/getting_started/04_numpy/xfms/subj2/highres2standard.mat @@ -0,0 +1,4 @@ +1.077742594 -0.04288767681 -0.08886849541 -40.1774847 +0.03613742149 1.017656652 0.2363464233 -27.41273922 +0.07295070007 -0.3129031707 1.14704109 -0.4162223149 +0 0 0 1