Commit deb8d990 authored by Paul McCarthy

Merge branch 'master' into 'master'

See merge request fsl/pytreat-2018-practicals!46
parents 5bf3ed18 51c0d5e9
%% Cell type:markdown id: tags:

# Decorators

Remember that in Python, everything is an object, including functions. This
means that we can do things like:

- Pass a function as an argument to another function.
- Create/define a function inside another function.
- Write a function which returns another function.

These abilities mean that we can do some neat things with functions in Python.

* [Overview](#overview)
* [Decorators on methods](#decorators-on-methods)
* [Example - memoization](#example-memoization)
* [Decorators with arguments](#decorators-with-arguments)
* [Chaining decorators](#chaining-decorators)
* [Decorator classes](#decorator-classes)
* [Appendix: Functions are not special](#appendix-functions-are-not-special)
* [Appendix: Closures](#appendix-closures)
* [Appendix: Decorators without arguments versus decorators with arguments](#appendix-decorators-without-arguments-versus-decorators-with-arguments)
* [Appendix: Per-instance decorators](#appendix-per-instance-decorators)
* [Appendix: Preserving function metadata](#appendix-preserving-function-metadata)
* [Appendix: Class decorators](#appendix-class-decorators)
* [Useful references](#useful-references)

<a class="anchor" id="overview"></a>
## Overview

Let's say that we want a way to calculate the execution time of any function
(this example might feel familiar to you if you have gone through the
practical on operator overloading).

Our first attempt at writing such a function might look like this:
%% Cell type:code id: tags:

```
import time

def timeFunc(func, *args, **kwargs):

    start  = time.time()
    retval = func(*args, **kwargs)
    end    = time.time()

    print('Ran {} in {:0.2f} seconds'.format(func.__name__, end - start))

    return retval
```
%% Cell type:markdown id: tags:

The `timeFunc` function accepts another function, `func`, as its first
argument. It calls `func`, passing it all of the other arguments, and then
prints the time taken for `func` to complete:

%% Cell type:code id: tags:

```
import numpy        as np
import numpy.linalg as npla

def inverse(a):
    return npla.inv(a)

data    = np.random.random((2000, 2000))
invdata = timeFunc(inverse, data)
```

%% Cell type:markdown id: tags:

But this means that whenever we want to time something, we have to call the
`timeFunc` function directly. Let's take advantage of the fact that we can
define a function inside another function. Look at the next block of code
carefully, and make sure you understand what our new `timeFunc` implementation
is doing.
%% Cell type:code id: tags:

```
import time

def timeFunc(func):

    def wrapperFunc(*args, **kwargs):

        start  = time.time()
        retval = func(*args, **kwargs)
        end    = time.time()

        print('Ran {} in {:0.2f} seconds'.format(func.__name__, end - start))

        return retval

    return wrapperFunc
```
%% Cell type:markdown id: tags:

This new `timeFunc` function is again passed a function `func`, but this time
as its sole argument. It then creates and returns a new function,
`wrapperFunc`. This `wrapperFunc` function calls and times the function that
was passed to `timeFunc`. But note that when `timeFunc` is called,
`wrapperFunc` is _not_ called - it is only created and returned.

Let's use our new `timeFunc` implementation:

%% Cell type:code id: tags:

```
import numpy        as np
import numpy.linalg as npla

def inverse(a):
    return npla.inv(a)

data    = np.random.random((2000, 2000))
inverse = timeFunc(inverse)
invdata = inverse(data)
```
%% Cell type:markdown id: tags:

Here, we did the following:

1. We defined a function called `inverse`:

   > ```
   > def inverse(a):
   >     return npla.inv(a)
   > ```

2. We passed the `inverse` function to the `timeFunc` function, and
   re-assigned the return value of `timeFunc` back to `inverse`:

   > ```
   > inverse = timeFunc(inverse)
   > ```

3. We called the new `inverse` function:

   > ```
   > invdata = inverse(data)
   > ```

So now the `inverse` variable refers to an instantiation of `wrapperFunc`,
which holds a reference to the original definition of `inverse`.

> If this is not clear, take a break now and read through the appendix on how
> [functions are not special](#appendix-functions-are-not-special).
Guess what? We have just created a __decorator__. A decorator is simply a
function which accepts a function as its input, and returns another function
as its output. In the example above, we have _decorated_ the `inverse`
function with the `timeFunc` decorator.

Python provides an alternative syntax for decorating one function with
another, using the `@` character. The approach that we used to decorate
`inverse` above:

%% Cell type:code id: tags:

```
def inverse(a):
    return npla.inv(a)

inverse = timeFunc(inverse)
invdata = inverse(data)
```

%% Cell type:markdown id: tags:

is semantically equivalent to this:

%% Cell type:code id: tags:

```
@timeFunc
def inverse(a):
    return npla.inv(a)

invdata = inverse(data)
```
%% Cell type:markdown id: tags:

<a class="anchor" id="decorators-on-methods"></a>
## Decorators on methods

Applying a decorator to the methods of a class works in the same way:

%% Cell type:code id: tags:

```
import numpy.linalg as npla

class MiscMaths(object):

    @timeFunc
    def inverse(self, a):
        return npla.inv(a)
```

%% Cell type:markdown id: tags:

Now, the `inverse` method of all `MiscMaths` instances will be timed:

%% Cell type:code id: tags:

```
mm1 = MiscMaths()
mm2 = MiscMaths()

i1 = mm1.inverse(np.random.random((1000, 1000)))
i2 = mm2.inverse(np.random.random((1500, 1500)))
```
%% Cell type:markdown id: tags:

Note that only one `timeFunc` decorator was created here - the `timeFunc`
function was only called once - when the `MiscMaths` class was defined. This
might be clearer if we re-write the above code in the following (equivalent)
manner:

%% Cell type:code id: tags:

```
class MiscMaths(object):
    def inverse(self, a):
        return npla.inv(a)

MiscMaths.inverse = timeFunc(MiscMaths.inverse)
```

%% Cell type:markdown id: tags:

So only one `wrapperFunc` function exists, and this function is _shared_ by
all instances of the `MiscMaths` class (such as the `mm1` and `mm2`
instances in the example above). In many cases this is not a problem, but
there can be situations where you need each instance of your class to have its
own unique decorator.

> If you are interested in solutions to this problem, take a look at the
> appendix on [per-instance decorators](#appendix-per-instance-decorators).
<a class="anchor" id="example-memoization"></a>
## Example - memoization

Let's move onto another example.
[Meowmoization](https://en.wikipedia.org/wiki/Memoization) is a common
performance optimisation technique used in cats. I mean software. Essentially,
memoization refers to the process of maintaining a cache for a function which
performs some expensive calculation. When the function is executed with a set
of inputs, the calculation is performed, and then a copy of the inputs and the
result are cached. If the function is called again with the same inputs, the
cached result can be returned.

This is a perfect problem to tackle with decorators:
%% Cell type:code id: tags:

```
def memoize(func):

    cache = {}

    def wrapper(*args):

        # is there a value in the cache
        # for this set of inputs?
        cached = cache.get(args, None)

        # If not, call the function,
        # and cache the result.
        if cached is None:
            cached      = func(*args)
            cache[args] = cached
        else:
            print('Cached {}({}): {}'.format(func.__name__, args, cached))

        return cached

    return wrapper
```
%% Cell type:markdown id: tags:

We can now use our `memoize` decorator to add a memoization cache to any
function. Let's memoize a function which generates the $n^{th}$ number in the
[Fibonacci series](https://en.wikipedia.org/wiki/Fibonacci_number):

%% Cell type:code id: tags:

```
@memoize
def fib(n):

    if n in (0, 1):
        print('fib({}) = {}'.format(n, n))
        return n

    twoback = 1
    oneback = 1
    val     = 1

    for _ in range(2, n):
        val     = oneback + twoback
        twoback = oneback
        oneback = val

    print('fib({}) = {}'.format(n, val))

    return val
```
%% Cell type:markdown id: tags:

For a given input, when `fib` is called the first time, it will calculate the
$n^{th}$ Fibonacci number:

%% Cell type:code id: tags:

```
for i in range(10):
    fib(i)
```

%% Cell type:markdown id: tags:

However, on repeated calls with the same input, the calculation is skipped,
and instead the result is retrieved from the memoization cache:

%% Cell type:code id: tags:

```
for i in range(10):
    fib(i)
```
%% Cell type:markdown id: tags:

> If you are wondering how the `wrapper` function is able to access the
> `cache` variable, refer to the [appendix on closures](#appendix-closures).
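In the meantime, here is a minimal closure sketch (not part of the original
practical): an inner function keeps access to variables in its enclosing
scope even after the outer function has returned, which is exactly how
`wrapper` keeps hold of `cache`:

%% Cell type:code id: tags:

```
def makeCounter():

    count = 0

    def counter():
        # "count" lives in the enclosing scope
        # of makeCounter, and persists between
        # calls to the returned function
        nonlocal count
        count += 1
        return count

    return counter

c = makeCounter()
print(c(), c(), c())
```

%% Cell type:markdown id: tags: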
<a class="anchor" id="decorators-with-arguments"></a>
## Decorators with arguments

Continuing with our memoization example, let's say that we want to place a
limit on the maximum size that our cache can grow to. For example, the output
of our function might have large memory requirements, so we can only afford to
store a handful of pre-calculated results. It would be nice to be able to
specify the maximum cache size when we define our function to be memoized,
like so:

> ```
> # cache at most 10 results
> @limitedMemoize(10)
> def fib(n):
>     ...
> ```

In order to support this, our `memoize` decorator function needs to be
modified - it is currently written to accept a function as its sole argument,
but we need it to accept a cache size limit.
%% Cell type:code id: tags:

```
from collections import OrderedDict

def limitedMemoize(maxSize):

    cache = OrderedDict()

    def decorator(func):
        def wrapper(*args):

            # is there a value in the cache
            # for this set of inputs?
            cached = cache.get(args, None)

            # If not, call the function,
            # and cache the result.
            if cached is None:

                cached = func(*args)

                # If the cache has grown too big,
                # remove the oldest item. In practice
                # it would make more sense to remove
                # the item with the oldest access
                # time (or remove the least recently
                # used item, as the built-in
                # @functools.lru_cache does), but this
                # is good enough for now!
                if len(cache) >= maxSize:
                    cache.popitem(last=False)

                cache[args] = cached
            else:
                print('Cached {}({}): {}'.format(func.__name__, args, cached))

            return cached
        return wrapper
    return decorator
```
%% Cell type:markdown id: tags:

> We used the handy
> [`collections.OrderedDict`](https://docs.python.org/3.5/library/collections.html#collections.OrderedDict)
> class here which preserves the insertion order of key-value pairs.
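As a quick sketch of the behaviour we are relying on, `popitem(last=False)`
removes the oldest (first-inserted) key-value pair:

%% Cell type:code id: tags:

```
from collections import OrderedDict

d = OrderedDict()
d['a'] = 1
d['b'] = 2
d['c'] = 3

# popitem(last=False) removes items in
# first-in, first-out (insertion) order
print(d.popitem(last=False))
print(list(d.keys()))
```

%% Cell type:markdown id: tags: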
This is starting to look a little complicated - we now have _three_ layers of
functions. This is necessary when you wish to write a decorator which accepts
arguments (refer to the
[appendix](#appendix-decorators-without-arguments-versus-decorators-with-arguments)
for more details).

But this `limitedMemoize` decorator is used in essentially the same way as our
earlier `memoize` decorator:
%% Cell type:code id: tags:

```
@limitedMemoize(5)
def fib(n):

    if n in (0, 1):
        print('fib({}) = {}'.format(n, n))
        return n

    twoback = 1
    oneback = 1
    val     = 1

    for _ in range(2, n):
        val     = oneback + twoback
        twoback = oneback
        oneback = val

    print('fib({}) = {}'.format(n, val))

    return val
```
%% Cell type:markdown id: tags:

Except that now, the `fib` function will only cache up to 5 values.

%% Cell type:code id: tags:

```
fib(10)
fib(11)
fib(12)
fib(13)
fib(14)

print('The result for 10 should come from the cache')
fib(10)
fib(15)

print('The result for 10 should no longer be cached')
fib(10)
```
%% Cell type:markdown id: tags:

<a class="anchor" id="chaining-decorators"></a>
## Chaining decorators

Decorators can easily be chained, or nested:

%% Cell type:code id: tags:

```
import time

@timeFunc
@memoize
def expensiveFunc(n):
    time.sleep(n)
    return n
```

%% Cell type:markdown id: tags:

> Remember that this is semantically equivalent to the following:
>
> ```
> def expensiveFunc(n):
>     time.sleep(n)
>     return n
>
> expensiveFunc = timeFunc(memoize(expensiveFunc))
> ```
Now we can see the effect of our memoization layer on performance:

%% Cell type:code id: tags:

```
expensiveFunc(0.5)
expensiveFunc(1)
expensiveFunc(1)
```

%% Cell type:markdown id: tags:

> Note that in Python 3.2 and newer you can use the
> [`functools.lru_cache`](https://docs.python.org/3/library/functools.html#functools.lru_cache)
> decorator to memoize your functions.
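For example, a memoized Fibonacci function written with
`functools.lru_cache` might look like this (a sketch - the `maxsize`
argument bounds the cache, much like our `maxSize` argument above):

%% Cell type:code id: tags:

```
import functools

@functools.lru_cache(maxsize=5)
def fib2(n):
    if n in (0, 1):
        return n
    # repeated sub-calls are answered
    # from the cache, not recomputed
    return fib2(n - 1) + fib2(n - 2)

print(fib2(10))          # 55
print(fib2.cache_info())
```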
<a class="anchor" id="decorator-classes"></a>
## Decorator classes

By now, you will have gained the impression that a decorator is a function
which _decorates_ another function. But if you went through the practical on
operator overloading, you might remember the special `__call__` method, which
allows an object to be called as if it were a function.

This feature allows us to write our decorators as classes, instead of
functions. This can be handy if you are writing a decorator that has
complicated behaviour, and/or needs to maintain some sort of state which
cannot be easily or elegantly written using nested functions.

As an example, let's say we are writing a framework for unit testing. We want
to be able to "mark" our test functions like so, so they can be easily
identified and executed:

> ```
> @unitTest
> def testblerk():
>     """tests the blerk algorithm."""
>     ...
> ```

With a decorator like this, we wouldn't need to worry about where our tests
are located - they will all be detected because we have marked them as test
functions. What does this `unitTest` decorator look like?
%% Cell type:code id: tags:

```
class TestRegistry(object):

    def __init__(self):
        self.testFuncs = []

    def __call__(self, func):
        self.testFuncs.append(func)
        # Return the function unchanged, so
        # that it can still be called directly.
        return func

    def listTests(self):
        print('All registered tests:')
        for test in self.testFuncs:
            print(' ', test.__name__)

    def runTests(self):
        for test in self.testFuncs:
            print('Running test {:10s} ... '.format(test.__name__), end='')
            try:
                test()
                print('passed!')
            except Exception:
                print('failed!')

# Create our test registry
registry = TestRegistry()

# Alias our registry to "unitTest"
# so that we can register tests
# with a "@unitTest" decorator.
unitTest = registry
```
%% Cell type:markdown id: tags:

So we've defined a class, `TestRegistry`, and created an instance of it,
`registry`, which will manage all of our unit tests. Now, in order to "mark"
any function as being a unit test, we just need to use the `unitTest`
decorator (which is simply a reference to our `TestRegistry` instance):

%% Cell type:code id: tags:

```
@unitTest
def testFoo():
    assert 'a' in 'bcde'

@unitTest
def testBar():
    assert 1 > 0

@unitTest
def testBlerk():
    assert 9 % 2 == 0
```
%% Cell type:markdown id: tags:

Now that these functions have been registered with our `TestRegistry`
instance, we can run them all:

%% Cell type:code id: tags:

```
registry.listTests()
registry.runTests()
```
%% Cell type:markdown id: tags:

> Unit testing is something which you must do! This is __especially__
> important in an interpreted language such as Python, where there is no
> compiler to catch all of your mistakes.
>
> Python has a built-in
> [`unittest`](https://docs.python.org/3.5/library/unittest.html) module,
> however the third-party [`pytest`](https://docs.pytest.org/en/latest/) and
> [`nose2`](http://nose2.readthedocs.io/en/latest/) frameworks are popular.
> It is also wise to combine your unit tests with
> [`coverage`](https://coverage.readthedocs.io/en/coverage-4.5.1/), which
> tells you how much of your code was executed, or _covered_, when your
> tests were run.
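As a small taste of the built-in `unittest` module (a minimal sketch,
separate from our `TestRegistry` example above):

%% Cell type:code id: tags:

```
import unittest

class TestBlerk(unittest.TestCase):
    def test_modulo(self):
        self.assertEqual(9 % 2, 1)

# Load and run the test case programmatically
# (normally a test runner does this for you).
suite  = unittest.defaultTestLoader.loadTestsFromTestCase(TestBlerk)
result = unittest.TextTestRunner(verbosity=0).run(suite)

print('passed!' if result.wasSuccessful() else 'failed!')
```

%% Cell type:markdown id: tags: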
<a class="anchor" id="appendix-functions-are-not-special"></a>
## Appendix: Functions are not special

When we write a statement like this:

%% Cell type:code id: tags:

```
a = [1, 2, 3]
```

%% Cell type:markdown id: tags:

the variable `a` is a reference to a `list`. We can create a new reference to
the same list, and delete `a`:

%% Cell type:code id: tags:

```
b = a
del a
```

%% Cell type:markdown id: tags:

Deleting `a` doesn't affect the list at all - the list still exists, and is
now referred to by a variable called `b`.

%% Cell type:code id: tags:

```
print('b: ', b)
```
%% Cell type:markdown id: tags:

`a` has, however, been deleted:

%% Cell type:code id: tags:

```
print('a: ', a)
```

%% Cell type:markdown id: tags:

The variables `a` and `b` are just references to a list that is sitting in
memory somewhere - renaming or removing a reference does not have any effect
upon the list<sup>2</sup>.

If you are familiar with C or C++, you can think of a variable in Python as
like a `void *` pointer - it is just a pointer of an unspecified type, which
is pointing to some item in memory (which does have a specific type). Deleting
the pointer does not have any effect upon the item to which it was pointing.

> <sup>2</sup> Until no more references to the list exist, at which point it
> will be
> [garbage-collected](https://www.quora.com/How-does-garbage-collection-in-Python-work-What-are-the-pros-and-cons).
Now, functions in Python work in _exactly_ the same way as variables. When we Now, functions in Python work in _exactly_ the same way as variables. When we
define a function like this: define a function like this:
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` ```
def inverse(a): def inverse(a):
return npla.inv(a) return npla.inv(a)
print(inverse) print(inverse)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
there is nothing special about the name `inverse` - `inverse` is just a there is nothing special about the name `inverse` - `inverse` is just a
reference to a function that resides somewhere in memory. We can create a new reference to a function that resides somewhere in memory. We can create a new
reference to this function: reference to this function:
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` ```
inv2 = inverse inv2 = inverse
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
And delete the old reference: And delete the old reference:
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` ```
del inverse del inverse
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
But the function still exists, and is still callable, via our second But the function still exists, and is still callable, via our second
reference: reference:
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` ```
print(inv2) print(inv2)
data = np.random.random((10, 10)) data = np.random.random((10, 10))
invdata = inv2(data) invdata = inv2(data)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
So there is nothing special about functions in Python - they are just items So there is nothing special about functions in Python - they are just items
that reside somewhere in memory, and to which we can create as many references that reside somewhere in memory, and to which we can create as many references
as we like. as we like.
> If it bothers you that `print(inv2)` resulted in > If it bothers you that `print(inv2)` resulted in
> `<function inverse at ...>`, and not `<function inv2 at ...>`, then refer to > `<function inverse at ...>`, and not `<function inv2 at ...>`, then refer to
> the appendix on > the appendix on
> [preserving function metdata](#appendix-preserving-function-metadata). > [preserving function metdata](#appendix-preserving-function-metadata).
<a class="anchor" id="appendix-closures"></a>
## Appendix: Closures
Whenever we define or use a decorator, we are taking advantage of a concept
called a [_closure_][wiki-closure]. Take a second to re-familiarise yourself
with our `memoize` decorator function from earlier - when `memoize` is called,
it creates and returns a function called `wrapper`:
[wiki-closure]: https://en.wikipedia.org/wiki/Closure_(computer_programming)
%% Cell type:code id: tags:
```
def memoize(func):

    cache = {}

    def wrapper(*args):

        # is there a value in the cache
        # for this set of inputs?
        cached = cache.get(args, None)

        # If not, call the function,
        # and cache the result.
        if cached is None:
            cached      = func(*args)
            cache[args] = cached
        else:
            print('Cached {}({}): {}'.format(func.__name__, args, cached))

        return cached

    return wrapper
```
%% Cell type:markdown id: tags:
Then `wrapper` is executed at some arbitrary point in the future. But how does
it have access to `cache`, defined within the scope of the `memoize` function,
after the execution of `memoize` has ended?
%% Cell type:code id: tags:
```
def nby2(n):
    return n * 2

# wrapper function is created here (and
# assigned back to the nby2 reference)
nby2 = memoize(nby2)

# wrapper function is executed here
print('nby2(2): ', nby2(2))
print('nby2(2): ', nby2(2))
```
%% Cell type:markdown id: tags:
The trick is that whenever a nested function is defined in Python, the scope
in which it is defined is preserved for that function's lifetime. So `wrapper`
has access to all of the variables within the `memoize` function's scope that
were defined at the time that `wrapper` was created (which was when we called
`memoize`). This is why `wrapper` is able to access `cache`, even though at
the time that `wrapper` is called, the execution of `memoize` has long since
finished.
This is what is known as a
[_closure_](https://www.geeksforgeeks.org/python-closures/). Closures are a
fundamental, and extremely powerful, aspect of Python and other high level
languages. So there's your answer,
[fishbulb](https://www.youtube.com/watch?v=CiAaEPcnlOg).
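You can actually see the closure machinery at work by inspecting a function's attributes - the closed-over variables live in `cell` objects attached to `__closure__`. Here is a quick sketch using a small, self-contained `counter` function (not part of the practical) rather than `memoize`:
%% Cell type:code id: tags:
```
def counter():

    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment

tick = counter()
tick()
tick()

# The names of the closed-over variables ...
print(tick.__code__.co_freevars)

# ... and their current values, stored in cell objects
print(tick.__closure__[0].cell_contents)
```
%% Cell type:markdown id: tags:
The `count` variable is kept alive for as long as `tick` exists, even though the call to `counter` returned long ago.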
<a class="anchor" id="appendix-decorators-without-arguments-versus-decorators-with-arguments"></a>
## Appendix: Decorators without arguments versus decorators with arguments
There are three ways to invoke a decorator with the `@` notation:
1. Naming it, e.g. `@mydecorator`
2. Calling it, e.g. `@mydecorator()`
3. Calling it, and passing it arguments, e.g. `@mydecorator(1, 2, 3)`
Python expects a decorator function to behave differently in the second and
third scenarios, when compared to the first:
%% Cell type:code id: tags:
```
def decorator(*args):

    print('  decorator({})'.format(args))

    def wrapper(*args):
        print('  wrapper({})'.format(args))

    return wrapper

print('Scenario #1: @decorator')
@decorator
def noop():
    pass

print('\nScenario #2: @decorator()')
@decorator()
def noop():
    pass

print('\nScenario #3: @decorator(1, 2, 3)')
@decorator(1, 2, 3)
def noop():
    pass
```
%% Cell type:markdown id: tags:
So if a decorator is "named" (scenario 1), only the decorator function
(`decorator` in the example above) is called, and is passed the decorated
function.
But if a decorator function is "called" (scenarios 2 or 3), both the decorator
function (`decorator`), __and its return value__ (`wrapper`) are called - the
decorator function is passed the arguments that were provided, and its return
value is passed the decorated function.
This is why, if you are writing a decorator function which expects arguments,
you must use three layers of functions, like so:
%% Cell type:code id: tags:
```
def decorator(*args):
    def realDecorator(func):
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)
        return wrapper
    return realDecorator
```
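%% Cell type:markdown id: tags:
To make the three layers concrete, here is a hypothetical `repeat` decorator (not from the practical) - the outer function receives the decorator arguments, the middle function receives the decorated function, and the inner function receives the arguments of each call:
%% Cell type:code id: tags:
```
def repeat(times):
    def realDecorator(func):
        def wrapper(*args, **kwargs):
            result = None
            # Call the decorated function 'times' times
            for _ in range(times):
                result = func(*args, **kwargs)
            return result
        return wrapper
    return realDecorator

@repeat(3)
def greet(name):
    print('Hello, {}!'.format(name))

greet('PyTreat')
```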
%% Cell type:markdown id: tags:
> The author of this practical is angry about this, as he does not understand
> why the Python language designers couldn't allow a decorator function to be
> passed both the decorated function, and any arguments that were passed when
> the decorator was invoked, like so:
>
> ```
> def decorator(func, *args, **kwargs): # args/kwargs here contain
>                                       # whatever is passed to the
>                                       # decorator
>
>     def wrapper(*args, **kwargs):     # args/kwargs here contain
>                                       # whatever is passed to the
>                                       # decorated function
>         return func(*args, **kwargs)
>
>     return wrapper
> ```
<a class="anchor" id="appendix-per-instance-decorators"></a>
## Appendix: Per-instance decorators
In the section on [decorating methods](#decorators-on-methods), you saw
that when a decorator is applied to a method of a class, that decorator
is invoked just once, and shared by all instances of the class. Consider this
example:
%% Cell type:code id: tags:
```
def decorator(func):

    print('Decorating {} function'.format(func.__name__))

    def wrapper(*args, **kwargs):
        print('Calling decorated function {}'.format(func.__name__))
        return func(*args, **kwargs)

    return wrapper

class MiscMaths(object):

    @decorator
    def add(self, a, b):
        return a + b
```
%% Cell type:markdown id: tags:
Note that `decorator` was called at the time that the `MiscMaths` class was
defined. Now, all `MiscMaths` instances share the same `wrapper` function:
%% Cell type:code id: tags:
```
mm1 = MiscMaths()
mm2 = MiscMaths()

print('1 + 2 =', mm1.add(1, 2))
print('3 + 4 =', mm2.add(3, 4))
```
%% Cell type:markdown id: tags:
This is not an issue in many cases, but it can be problematic in some. Imagine
if we have a decorator called `ensureNumeric`, which makes sure that arguments
passed to a function are numbers:
%% Cell type:code id: tags:
```
def ensureNumeric(func):
    def wrapper(*args):
        args = tuple([float(a) for a in args])
        return func(*args)
    return wrapper
```
%% Cell type:markdown id: tags:
This all looks well and good - we can use it to decorate a numeric function,
allowing strings to be passed in as well:
%% Cell type:code id: tags:
```
@ensureNumeric
def mul(a, b):
    return a * b

print(mul( 2,   3))
print(mul('5', '10'))
```
%% Cell type:markdown id: tags:
But what will happen when we try to decorate a method of a class?
%% Cell type:code id: tags:
```
class MiscMaths(object):

    @ensureNumeric
    def add(self, a, b):
        return a + b

mm = MiscMaths()
print(mm.add('5', 10))
```
%% Cell type:markdown id: tags:
What happened here?? Remember that the first argument passed to any instance
method is the instance itself (the `self` argument). Well, the `MiscMaths`
instance was passed to the `wrapper` function, which then tried to convert it
into a `float`. So we can't actually apply the `ensureNumeric` function as a
decorator on a method in this way.
There are a few potential solutions here. We could modify the `ensureNumeric`
function, so that the `wrapper` ignores the first argument. But this would
mean that we couldn't use `ensureNumeric` with standalone functions.
But we _can_ manually apply the `ensureNumeric` decorator to `MiscMaths`
instances when they are initialised. We can't use the nice `@ensureNumeric`
syntax to apply our decorators, but this is a viable approach:
%% Cell type:code id: tags:
```
class MiscMaths(object):

    def __init__(self):
        self.add = ensureNumeric(self.add)

    def add(self, a, b):
        return a + b

mm = MiscMaths()
print(mm.add('5', 10))
```
%% Cell type:markdown id: tags:
Another approach is to use a second decorator, which dynamically creates the
real decorator when it is accessed on an instance. This requires the use of an
advanced Python technique called
[_descriptors_](https://docs.python.org/3.5/howto/descriptor.html), which is
beyond the scope of this practical. But if you are interested, you can see an
implementation of this approach
[here](https://git.fmrib.ox.ac.uk/fsl/fslpy/blob/1.6.8/fsl/utils/memoize.py#L249).
<a class="anchor" id="appendix-preserving-function-metadata"></a>
## Appendix: Preserving function metadata
You may have noticed that when we decorate a function, some of its properties
are lost. Consider this function:
%% Cell type:code id: tags:
```
def add2(a, b):
    """Adds two numbers together."""
    return a + b
```
%% Cell type:markdown id: tags:
The `add2` function is an object which has some attributes, e.g.:
%% Cell type:code id: tags:
```
print('Name: ', add2.__name__)
print('Help: ', add2.__doc__)
```
%% Cell type:markdown id: tags:
However, when we apply a decorator to `add2`:
%% Cell type:code id: tags:
```
def decorator(func):
    def wrapper(*args, **kwargs):
        """Internal wrapper function for decorator."""
        print('Calling decorated function {}'.format(func.__name__))
        return func(*args, **kwargs)
    return wrapper

@decorator
def add2(a, b):
    """Adds two numbers together."""
    return a + b
```
%% Cell type:markdown id: tags:
Those attributes are lost, and instead we get the attributes of the `wrapper`
function:
%% Cell type:code id: tags:
```
print('Name: ', add2.__name__)
print('Help: ', add2.__doc__)
```
%% Cell type:markdown id: tags:
While this may be inconsequential in most situations, it can be quite annoying
in some, such as when we are automatically [generating
documentation](http://www.sphinx-doc.org/) for our code.
Fortunately, there is a workaround, available in the built-in
[`functools`](https://docs.python.org/3.5/library/functools.html#functools.wraps)
module:
%% Cell type:code id: tags:
```
import functools

def decorator(func):

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        """Internal wrapper function for decorator."""
        print('Calling decorated function {}'.format(func.__name__))
        return func(*args, **kwargs)

    return wrapper

@decorator
def add2(a, b):
    """Adds two numbers together."""
    return a + b
```
%% Cell type:markdown id: tags:
We have applied the `@functools.wraps` decorator to our internal `wrapper`
function - this will essentially replace the `wrapper` function metadata with
the metadata from our `func` function. So our `add2` name and documentation are
now preserved:
%% Cell type:code id: tags:
```
print('Name: ', add2.__name__)
print('Help: ', add2.__doc__)
```
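%% Cell type:markdown id: tags:
As a side note, `functools.wraps` also stores a reference to the original, undecorated function on the wrapper, in an attribute called `__wrapped__`, so the unwrapped version is still reachable if you ever need it. A self-contained sketch:
%% Cell type:code id: tags:
```
import functools

def decorator(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@decorator
def add2(a, b):
    """Adds two numbers together."""
    return a + b

# functools.wraps saves the original function
# on the wrapper, as the __wrapped__ attribute,
# so it can be called directly, bypassing the wrapper
print(add2.__wrapped__)
print(add2.__wrapped__(1, 2))
```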
%% Cell type:markdown id: tags:
<a class="anchor" id="appendix-class-decorators"></a>
## Appendix: Class decorators
> Not to be confused with [_decorator classes_](#decorator-classes)!
In this practical, we have shown how decorators can be applied to functions
and methods. But decorators can in fact also be applied to _classes_. This is
a fairly niche feature that you are probably not likely to need, so we will
only cover it briefly.
Imagine that we want all objects in our application to have a globally unique
(within the application) identifier. We could use a decorator which contains
the logic for generating unique IDs, and defines the interface that we can
use on an instance to obtain its ID:
%% Cell type:code id: tags:
```
import random

allIds = set()

def uniqueID(cls):

    class subclass(cls):

        def getUniqueID(self):

            uid = getattr(self, '_uid', None)

            if uid is not None:
                return uid

            # Generate an ID which has not
            # been taken by another instance
            while uid is None or uid in allIds:
                uid = random.randint(1, 100)

            allIds.add(uid)
            self._uid = uid

            return uid

    return subclass
```
%% Cell type:markdown id: tags:
Now we can use the `@uniqueID` decorator on any class that we need to
have a unique ID:
%% Cell type:code id: tags:
```
@uniqueID
class Foo(object):
    pass

@uniqueID
class Bar(object):
    pass
```
%% Cell type:markdown id: tags:
All instances of these classes will have a `getUniqueID` method:
%% Cell type:code id: tags:
```
f1 = Foo()
f2 = Foo()
b1 = Bar()
b2 = Bar()

print('f1: ', f1.getUniqueID())
print('f2: ', f2.getUniqueID())
print('b1: ', b1.getUniqueID())
print('b2: ', b2.getUniqueID())
```
%% Cell type:markdown id: tags:
<a class="anchor" id="useful-references"></a>
## Useful references
* [Understanding decorators in 12 easy steps](http://simeonfranklin.com/blog/2012/jul/1/python-decorators-in-12-steps/)
* [The decorators they won't tell you about](https://github.com/hchasestevens/hchasestevens.github.io/blob/master/notebooks/the-decorators-they-wont-tell-you-about.ipynb)
* [Closures - Wikipedia][wiki-closure]
* [Closures in Python](https://www.geeksforgeeks.org/python-closures/)
* [Garbage collection in Python](https://www.quora.com/How-does-garbage-collection-in-Python-work-What-are-the-pros-and-cons)
[wiki-closure]: https://en.wikipedia.org/wiki/Closure_(computer_programming)
%% Cell type:markdown id: tags:
![Running in parallel](parallel.png)
# Multiprocessing and multithreading in Python
## Why use multiprocessing?
%% Cell type:code id: tags:
``` python
import multiprocessing
multiprocessing.cpu_count()
```
%% Output
4
%% Cell type:markdown id: tags:
*Almost all CPUs these days are multi-core.*
CPU-intensive programs will not be efficient unless they take advantage of this!
%% Cell type:markdown id: tags:
# General plan
Walk through a basic application of multiprocessing, hopefully relevant to the kind of work you might want to do.
Not a comprehensive guide to Python multithreading/multiprocessing.
%% Cell type:markdown id: tags:
![voxel](voxel.png)
## Sample application
Assume we are doing some voxelwise image processing - i.e. running a computationally intensive calculation *independently* on each voxel in a (possibly large) image.
*(Such problems are sometimes called 'embarrassingly parallel')*
This is in a Python module called `my_analysis`. Here we simulate this by just calculating a large number of exponentials for each voxel.
%% Cell type:code id: tags:
``` python
# my_analysis.py
import math
import numpy

def calculate_voxel(val):
    # 'Slow' voxelwise calculation
    for i in range(30000):
        b = math.exp(val)
    return b
```
%% Cell type:markdown id: tags:
We're going to run this on a Numpy array. `numpy.vectorize` is a convenient function to apply a function to every element of the array, but it is *not* doing anything clever - it is no different than looping over the x, y, and z co-ordinates.
We're also giving the data an ID - this will be used later when we have multiple threads.
%% Cell type:code id: tags:
``` python
def calculate_data(data, id=0):
    # Run 'calculate_voxel' on each voxel in data
    print("Id: %i: Processing %i voxels" % (id, data.size))
    vectorized = numpy.vectorize(calculate_voxel)
    vectorized(data)
    print("Id: %i: Done" % id)
    return data
```
%% Cell type:markdown id: tags:
Here's some Python code to run our analysis on a random Numpy array, and time how long it takes
%% Cell type:code id: tags:
``` python
import numpy
import timeit
import my_analysis

def run():
    data = numpy.random.rand(16, 16, 16)
    my_analysis.calculate_data(data)

t = timeit.timeit(run, number=1)
print("Data processing took %.2f seconds" % t)
```
%% Output
Id: 0: Processing 4096 voxels
Id: 0: Done
Data processing took 26.44 seconds
%% Cell type:markdown id: tags:
So, it took a little while.
%% Cell type:markdown id: tags:
If we watch what's going on while this runs, we can see the program is not using all of our CPU. It's only working on one core.
![Running in serial](onecore.png)
%% Cell type:markdown id: tags:
## What we want
It would be nice to split the data up into chunks and give one to each core. Then we could get through the processing several times faster - ideally by a factor of the number of cores (4, on this machine).
![Running in parallel](multicore.png)
%% Cell type:markdown id: tags:
# Multithreading attempt
*Threads* are a way for a program to run more than one task at a time. Let's try using this on our application, using the Python `threading` module.
%% Cell type:markdown id: tags:
## Splitting the data up
We're going to need to split the data up into chunks. Numpy has a handy function `numpy.split` which slices the array up into equal portions along a specified axis:
chunks = numpy.split(full_data, num_chunks, axis)
*The data must divide equally along this axis! We will solve this problem later.*
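%% Cell type:markdown id: tags:
For example, a quick sketch of how `numpy.split` behaves on a small array:
%% Cell type:code id: tags:
``` python
import numpy

data = numpy.zeros((8, 4, 4))

# 4 equal chunks along axis 0, each of shape (2, 4, 4)
chunks = numpy.split(data, 4, axis=0)
print(len(chunks), chunks[0].shape)

# The axis length must be divisible by the number of chunks,
# otherwise numpy.split raises ValueError
try:
    numpy.split(data, 3, axis=0)
except ValueError:
    print("Uneven split fails")
```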
%% Cell type:markdown id: tags:
## Creating a new thread for each chunk
    def function_to_call(arg1, arg2, arg3):
        # ...do something
        ...
import threading
thread = threading.Thread(target=function_to_call,
args=[arg1, arg2, arg3])
%% Cell type:markdown id: tags:
## Waiting for the threads to complete
thread.join()
- This waits until `thread` has completed
- So, if we have more than one thread we need to keep a list and wait for them all to finish:
for thread in threads:
thread.join()
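%% Cell type:markdown id: tags:
Putting those pieces together, a minimal runnable sketch - the `worker` function here is just a stand-in for our analysis:
%% Cell type:code id: tags:
``` python
import threading

results = []

# Stand-in worker function - just records the sum of its arguments.
# list.append is atomic under the GIL, so this is safe here.
def worker(a, b):
    results.append(a + b)

threads = [threading.Thread(target=worker, args=[i, i]) for i in range(4)]
for t in threads:
    t.start()
# Wait for all of the threads to finish
for t in threads:
    t.join()
print(sorted(results))
```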
%% Cell type:markdown id: tags:
## Example code
The example code is below - let's see how it does!
%% Cell type:code id: tags:
``` python
import threading
def multithread_process(data):
n_workers = 4
# Split the data into chunks along axis 0
# We are assuming this axis is divisible by the number of workers!
chunks = numpy.split(data, n_workers, axis=0)
# Start a worker for each chunk
workers = []
for idx, chunk in enumerate(chunks):
print("Starting worker for part %i" % idx)
w = threading.Thread(target=my_analysis.calculate_data, args=[chunk, idx])
workers.append(w)
w.start()
# Wait for each worker to finish
for w in workers:
w.join()
def run_with_threads():
data = numpy.random.rand(16, 16, 16)
multithread_process(data)
t = timeit.timeit(run_with_threads, number=1)
print("Data processing took %.2f seconds" % t)
```
%% Output
Starting worker for part 0
Starting worker for part 1
Id: 0: Processing 1024 voxels
Id: 1: Processing 1024 voxels
Starting worker for part 2Id: 2: Processing 1024 voxels
Starting worker for part 3Id: 3: Processing 1024 voxels
Id: 1: DoneId: 0: Done
Id: 2: Done
Id: 3: Done
Data processing took 132.90 seconds
%% Cell type:markdown id: tags:
# The Big Problem with Python threads
%% Cell type:markdown id: tags:
**Only one thread can execute Python code at a time**
![Python multithreading](thread_gil.png)
This is what's really going on.
%% Cell type:markdown id: tags:
The reason is something called the **Global Interpreter Lock (GIL)**. Only one thread can have it, and you can only execute Python code when you have the GIL.
%% Cell type:markdown id: tags:
## So, does that mean Python threads are useless?
No, not completely. They're useful for:
- Making a user interface continue to respond while a calculation takes place in the background
- A web server handling multiple requests.
- *The GIL is not required while waiting for network connections*
- Doing calculations in parallel which are running in native (C/C++) code
- *The GIL is not required while running native code*
%% Cell type:markdown id: tags:
### But for doing CPU-intensive Python calculations in parallel, yes Python threads are essentially useless
%% Cell type:markdown id: tags:
## Can multiprocessing help?
%% Cell type:markdown id: tags:
### Differences between threads and processes
- Threads are quicker to start up and generally require fewer resources
- Threads share memory with the main process
- Don't need to copy your data to pass it to a thread
- Don't need to copy the output data back to the main program
- Processes have their own memory space
- Data needs to be copied from the main program to the process
- Any output needs to be copied back
- However, importantly for Python, *Each process has its own GIL so they can run at the same time as others*
%% Cell type:markdown id: tags:
## Multiprocessing attempt
Multiprocessing is normally more work than multithreading.
However Python tries *very hard* to make multiprocessing as easy as multithreading.
- `import multiprocessing` instead of `import threading`
- `multiprocessing.Process()` instead of `threading.Thread()`
%% Cell type:code id: tags:
``` python
import multiprocessing
def multiprocess_process(data):
n_workers = 4
# Split the data into chunks along axis 0
# We are assuming this axis is divisible by the number of workers!
chunks = numpy.split(data, n_workers, axis=0)
workers = []
for idx, chunk in enumerate(chunks):
print("Starting worker for chunk %i" % idx)
w = multiprocessing.Process(target=my_analysis.calculate_data, args=[chunk, idx])
workers.append(w)
w.start()
# Wait for workers to complete
for w in workers:
w.join()
def run_with_processes():
data = numpy.random.rand(16, 16, 16)
multiprocess_process(data)
if __name__ == "__main__":
t = timeit.timeit(run_with_processes, number=1)
print("Data processing took %.2f seconds" % t)
```
%% Output
Starting worker for chunk 0
Starting worker for chunk 1
Starting worker for chunk 2
Starting worker for chunk 3
Data processing took 9.74 seconds
%% Cell type:markdown id: tags:
# Multiprocessing works!
%% Cell type:markdown id: tags:
## BUT
# Caveats and gotchas
Before we just run off and replace all our threads with processes, there are a few things we need to bear in mind:
## Data copying
- Python *copied* each chunk of data to each worker. If the data was very large this could be a significant overhead
- Python needs to know *how* to copy all the data we pass to the process.
    - This is fine for normal data types (strings, lists, dictionaries, etc.) and Numpy arrays
    - You can get into trouble if you try to pass complex objects to your function
    - Anything you pass to the worker needs to be supported by the `pickle` module
## The global variable problem
- Can't rely on global variables being copied
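%% Cell type:markdown id: tags:
A quick sketch of the pickling requirement - plain data pickles fine, but something like a lambda cannot be sent to a worker:
%% Cell type:code id: tags:
``` python
import pickle

# Normal data types (and Numpy arrays) pickle without any trouble
pickle.dumps({"id": 0, "chunk": [1, 2, 3]})

# ...but e.g. a lambda cannot be pickled, so it could not be
# passed to a worker process (the exact exception type varies
# between Python versions, so we catch broadly here)
try:
    pickle.dumps(lambda x: x + 1)
except Exception as e:
    print("Cannot pickle:", e)
```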
%% Cell type:markdown id: tags:
## The output problem
If you change data in a subprocess, your main program will not see it.
## Example:
%% Cell type:code id: tags:
``` python
# my_analysis.py
def add_one(data):
data += 1
```
%% Cell type:code id: tags:
``` python
data = numpy.zeros([2, 2])
print("Starting with zeros")
print(data)
#my_analysis.add_one(data)
#w = threading.Thread(target=my_analysis.add_one, args=[data,])
w = multiprocessing.Process(target=my_analysis.add_one, args=[data,])
w.start()
w.join()
print("I think my worker just added one")
print(data)
```
%% Output
Starting with zeros
[[ 0. 0.]
[ 0. 0.]]
I think my worker just added one
[[ 0. 0.]
[ 0. 0.]]
%% Cell type:markdown id: tags:
# Making multiprocessing work better
%% Cell type:markdown id: tags:
## Problems to solve:
- Dividing data amongst processes
- Returning data from process
- Status updates from process
%% Cell type:markdown id: tags:
## Worker pools
A *Pool* is a fixed number of worker processes which can be given tasks to do.
We can give a pool as many tasks as we like - once a worker finishes a task it will start another, until they're all done.
We can create a pool using:
multiprocessing.Pool(num_workers)
- If the number of workers in the pool is equal to the number of cores, we should be able to keep our CPU busy.
- Pools are good for load balancing if some slices are more work than others
%% Cell type:markdown id: tags:
![Worker pool](pool.png)
%% Cell type:markdown id: tags:
## Splitting our data into chunks for the pool
- Now we can split our data up into as many chunks as we like
- Easiest solution is to use 1-voxel slices along one of the axes `split_axis`:
- `numpy.split(data, data.shape[split_axis], axis=split_axis)`
%% Cell type:markdown id: tags:
## Giving the tasks to the pool
- Easiest way is to use:
Pool.map(function, task_args)
- `task_args` is a *sequence of sequences*
- Each element in `task_args` is a sequence of arguments for one task
- The length of `task_args` is the number of tasks
#### Example `task_args` for 5 tasks, each being passed an ID and a chunk of data
    [
        [0, chunk_1],   # task 1
        [1, chunk_2],   # task 2
        [2, chunk_3],   # task 3
        [3, chunk_4],   # task 4
        [4, chunk_5],   # task 5
    ]
- If we have a list of chunks we can generate this with `enumerate(chunks)`
- **Arguments are passed to the task in a slightly different way compared to `multiprocessing.Process()`**
%% Cell type:code id: tags:
``` python
# my_analysis.py
def do_task(args):
# Pool.map passes all our arguments as a single tuple, so unpack it
# and pass the arguments to the real calculate function.
id, data = args
return calculate_data(data, id)
```
%% Cell type:markdown id: tags:
- If you're using Python 3, look into `Pool.starmap`
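%% Cell type:markdown id: tags:
A sketch of what that looks like in Python 3 - `Pool.starmap` unpacks each argument tuple for us, so no unpacking wrapper like `do_task` is needed. `calculate` here is a toy stand-in for `calculate_data`:
%% Cell type:code id: tags:
``` python
import multiprocessing

# Toy stand-in for calculate_data(data, id) - must be defined at
# module level so it can be pickled for the worker processes
def calculate(x, task_id):
    return x * x

if __name__ == "__main__":
    with multiprocessing.Pool(2) as pool:
        # Each tuple is unpacked into (x, task_id) automatically
        results = pool.starmap(calculate, [(1, 0), (2, 1), (3, 2)])
    print(results)
```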
%% Cell type:markdown id: tags:
## Getting the output and putting it back together
- `Pool.map()` captures the return value of your worker function
- It returns a list of all of the return values for each task
- for us this is a list of Numpy arrays, one for each slice
- `numpy.concatenate(list_of_slices, split_axis)` will combine them back into a single data item for us
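%% Cell type:markdown id: tags:
Because `Pool.map` returns results in task order, splitting and then concatenating is an exact round trip - a quick sketch:
%% Cell type:code id: tags:
``` python
import numpy

SPLIT_AXIS = 0
data = numpy.random.rand(23, 13, 11)

# One 1-voxel slice per index along the split axis -
# no divisibility requirement this time
slices = numpy.split(data, data.shape[SPLIT_AXIS], axis=SPLIT_AXIS)
print(len(slices), slices[0].shape)

# Concatenating the slices in order rebuilds the original array
rebuilt = numpy.concatenate(slices, SPLIT_AXIS)
print(numpy.array_equal(rebuilt, data))
```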
%% Cell type:markdown id: tags:
## The full example
%% Cell type:code id: tags:
``` python
# Split our data along the x-axis
SPLIT_AXIS = 0
def pool_process(data):
n_workers = 4
# Split the data into 1-voxel slices along axis 0
parts = numpy.split(data, data.shape[SPLIT_AXIS], axis=SPLIT_AXIS)
print("Input data shape=%s" % str(data.shape))
print("Processing %i parts with %i workers" % (len(parts), n_workers))
# Create a pool - normally this would be 1 worker per CPU core
pool = multiprocessing.Pool(n_workers)
# Send the tasks to the workers
list_of_slices = pool.map(my_analysis.do_task, enumerate(parts))
# Combine the return data back into a single array
processed_array = numpy.concatenate(list_of_slices, SPLIT_AXIS)
print("Processed data, output shape=%s" % str(processed_array.shape))
def run_with_pool():
data = numpy.random.rand(23, 13, 11)
pool_process(data)
t = timeit.timeit(run_with_pool, number=1)
print("Data processing took %.2f seconds" % t)
```
%% Output
Input data shape=(23L, 13L, 11L)
Processing 23 parts with 4 workers
Processed data, output shape=(23L, 13L, 11L)
Data processing took 8.08 seconds
%% Cell type:markdown id: tags:
# Communication / Status updates?
- Would be nice if workers could communicate their progress as they work. One way to do this is using a `Queue`.
%% Cell type:markdown id: tags:
## Queues
A Queue is often used to send status updates from the process to the main program.
- Shared between the main program and the subprocesses
- Create it with `multiprocessing.Manager().Queue()`
- Pass it to the worker thread like any other argument
- Worker calls `queue.put()` to send some data to the queue
- Main program calls `queue.get()` to get data off the queue
- Queue is FIFO (First In First Out)
- Need to have a thread running which checks the queue for updates every so often
- This is a good use for threads!
![Queue](queue_put.png)
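%% Cell type:markdown id: tags:
A minimal sketch of the queue mechanics, with a toy `report` worker standing in for our analysis:
%% Cell type:code id: tags:
``` python
import multiprocessing

# Toy worker - reports (id, voxels done) via the shared queue
def report(task_id, n_voxels, queue):
    queue.put((task_id, n_voxels))

if __name__ == "__main__":
    # The Manager queue can be passed to a worker like any other argument
    queue = multiprocessing.Manager().Queue()
    w = multiprocessing.Process(target=report, args=[0, 1024, queue])
    w.start()
    w.join()
    # FIFO: get() returns items in the order they were put()
    print(queue.get())
```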
%% Cell type:markdown id: tags:
## Modify our example to report progress
%% Cell type:code id: tags:
``` python
# my_analysis.py
def calculate_data_and_report(args):
id, data, queue = args
# Run 'calculate_voxel' on each voxel in data
vectorized = numpy.vectorize(calculate_voxel)
vectorized(data)
# Report our ID and how many voxels we have done to the queue
queue.put((id, data.size))
    # Return the processed chunk so Pool.map can collect it
    return data
```
%% Cell type:markdown id: tags:
## Create a thread to monitor the queue for updates
I've done this as a class, because that's the easiest way.
%% Cell type:code id: tags:
``` python
class QueueChecker():
def __init__(self, queue, num_voxels, interval_seconds=1):
self._queue = queue
self._num_voxels = num_voxels
self._interval_seconds = interval_seconds
self._voxels_done = 0
self._cancel = False
self._restart_timer()
def cancel(self):
self._cancel = True
def _restart_timer(self):
self._timer = threading.Timer(self._interval_seconds, self._check_queue)
self._timer.start()
def _check_queue(self):
while not self._queue.empty():
id, voxels_done = self._queue.get()
self._voxels_done += voxels_done
percent = int(100*float(self._voxels_done)/self._num_voxels)
print("%i%% complete" % percent)
if not self._cancel:
self._restart_timer()
```
%% Cell type:markdown id: tags:
## Modify our main program to pass the queue to each of our workers
We need to create the queue and the `QueueChecker`, and make sure the queue is included in each task's arguments
%% Cell type:code id: tags:
``` python
# Split our data along the x-axis
SPLIT_AXIS = 0
reload(my_analysis)
def pool_process(data):
n_workers = 4
# Split the data into 1-voxel slices along axis 0
parts = numpy.split(data, data.shape[SPLIT_AXIS], axis=SPLIT_AXIS)
print("Input data shape=%s" % str(data.shape))
print("We are processing %i parts with %i workers" % (len(parts), n_workers))
pool = multiprocessing.Pool(n_workers)
# Create the queue
queue = multiprocessing.Manager().Queue()
checker = QueueChecker(queue, data.size)
# Note that we need to pass the queue as an argument to the worker
args = [(id, part, queue) for id, part in enumerate(parts)]
list_of_slices = pool.map(my_analysis.calculate_data_and_report, args)
checker.cancel()
# Join processed data back together again
processed_array = numpy.concatenate(list_of_slices, SPLIT_AXIS)
print("Processed data, output shape=%s" % str(processed_array.shape))
def run_with_pool():
data = numpy.random.rand(23, 19, 17)
pool_process(data)
t = timeit.timeit(run_with_pool, number=1)
print("Data processing took %.2f seconds" % t)
```
%% Output
Input data shape=(23L, 19L, 17L)
We are processing 23 parts with 4 workers
0% complete
0% complete
13% complete
17% complete
17% complete
34% complete
34% complete
47% complete
52% complete
56% complete
69% complete
69% complete
78% complete
86% complete
95% complete
Processed data, output shape=(23L, 19L, 17L)
Data processing took 17.39 seconds
100% complete
%% Cell type:markdown id: tags:
# Summary
%% Cell type:markdown id: tags:
## What we've covered
- Limitations of threading for parallel processing in Python
- How to split up a simple voxel-processing task into separate chunks
- `numpy.split()`
- How to run each chunk in parallel using multiprocessing
- `multiprocessing.Process`
- How to separate the number of tasks from the number of workers
- `multiprocessing.Pool()`
- `Pool.map()`
- How to get output back from the workers and join it back together again
- `numpy.concatenate()`
- How to pass back progress information from our worker processes
    - `multiprocessing.Manager().Queue()`
    - Using a `threading.Timer` object to monitor the queue and display updates
%% Cell type:markdown id: tags:
## Things I haven't covered
Loads of stuff!
%% Cell type:markdown id: tags:
### Threading
- Locking of shared data (so only one thread can use it at a time)
- Thread-local storage (see `threading.local()`)
- See Paul's tutorial on the PyTreat GIT for more information
- Or see the `threading` Python documentation for full details
%% Cell type:markdown id: tags:
### Multiprocessing
- Passing data *between* workers
- Can use `Queue` for one-way traffic
- Use `Pipe` for two-way communication between one worker and another
    - May be required when your problem is not 'embarrassingly parallel'
- Sharing memory
- Way to avoid copying large amounts of data
- Look at `multiprocessing.Array`
- Need to convert Numpy array into a ctypes array
- Shared memory has pitfalls
    - *Don't go here unless you have already determined that data copying is a bottleneck*
- Running workers asynchronously
- So main program doesn't have to wait for them to finish
- Instead, a function is called every time a task is finished
    - see `Pool.apply_async()` for more information
- Error handling
- Needs a bit of care - very easy to 'lose' errors
- Workers should catch all exceptions
- And should return a value to signal when a task has failed
- Main program decides how to deal with it
%% Cell type:markdown id: tags:
## Always remember
**Python is not the best tool for every job!**
If you are really after performance, consider implementing your algorithm in multi-threaded C/C++ and then create a Python interface.
%% Cell type:code id: tags:
``` python
```