"# Multiprocessing and multithreading in Python\n",
"\n",
"## Why use multiprocessing?\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import multiprocessing\n",
"multiprocessing.cpu_count()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Almost all CPUs these days are multi-core.*\n",
"\n",
"CPU-intensive programs will not be efficient unless they take advantage of this!\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# General plan\n",
"\n",
"Walk through a basic application of multiprocessing, hopefully relevant to the kind of work you might want to do.\n",
"\n",
"Not a comprehensive guide to Python multithreading/multiprocessing.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"## Sample application\n",
"\n",
"Assume we are doing some voxelwise image processing - i.e. running a computationally intensive calculation *independently* on each voxel in a (possibly large) image. \n",
"\n",
"*(Such problems are sometimes called 'embarrassingly parallel')*\n",
"\n",
"This is in a Python module called my_analysis. Here we simulate this by just calculating a large number of exponentials for each voxel.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# my_analysis.py\n",
"\n",
"import math\n",
"import numpy\n",
"\n",
"def calculate_voxel(val):\n",
" # 'Slow' voxelwise calculation\n",
" for i in range(30000):\n",
" b = math.exp(val)\n",
" return b\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We're going to run this on a Numpy array. `numpy.vectorize` is a convenient function to apply a function to every element of the array, but it is *not* doing anything clever - it is no different than looping over the x, y, and z co-ordinates.\n",
"\n",
"We're also giving the data an ID - this will be used later when we have multiple threads."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def calculate_data(data, id=0):\n",
" # Run 'calculate_voxel' on each voxel in data\n",
"Here's some Python code to run our analysis on a random Numpy array, and time how long it takes"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Id: 0: Processing 4096 voxels\n",
"Id: 0: Done\n",
"Data processing took 26.44 seconds\n"
]
}
],
"source": [
"import numpy\n",
"import timeit\n",
"\n",
"import my_analysis\n",
"\n",
"def run():\n",
" data = numpy.random.rand(16, 16, 16)\n",
" my_analysis.calculate_data(data)\n",
" \n",
"t = timeit.timeit(run, number=1)\n",
"print(\"Data processing took %.2f seconds\" % t)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So, it took a little while."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we watch what's going on while this runs, we can see the program is not using all of our CPU. It's only working on one core.\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## What we want\n",
"\n",
"It would be nice to split the data up into chunks and give one to each core. Then we could get through the processing 8 times as fast. \n",
"\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Multithreading attempt\n",
"\n",
"*Threads* are a way for a program to run more than one task at a time. Let's try using this on our application, using the Python `threading` module. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Splitting the data up\n",
"\n",
"We're going to need to split the data up into chunks. Numpy has a handy function `numpy.split` which slices the array up into equal portions along a specified axis:\n",
"print(\"Data processing took %.2f seconds\" % t)"
]
},
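{
"cell_type": "markdown",
"metadata": {},
"source": [
"The threading code itself has not survived in this notebook, so below is a minimal sketch of how it might have looked, assuming the `my_analysis` module and `calculate_data` function above. One `threading.Thread` is started per chunk, and `join()` waits for them all to finish."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch of the multithreading attempt (a reconstruction, not the original code)\n",
"import threading\n",
"\n",
"def multithread_process(data):\n",
"    n_workers = 4\n",
"    # Split the data into chunks along axis 0 (must divide equally!)\n",
"    chunks = numpy.split(data, n_workers, axis=0)\n",
"    workers = []\n",
"    for idx, chunk in enumerate(chunks):\n",
"        print(\"Starting worker for chunk %i\" % idx)\n",
"        t = threading.Thread(target=my_analysis.calculate_data, args=(chunk, idx))\n",
"        t.start()\n",
"        workers.append(t)\n",
"    # Wait for all the worker threads to complete\n",
"    for t in workers:\n",
"        t.join()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When this was run it was no faster than the single-threaded version - in fact it was considerably *slower* (132.90 seconds against 26.44), because the threads spend their time fighting over the interpreter. Why?"
]
},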
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# The Big Problem with Python threads"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Only one thread can execute Python code at a time**\n",
"\n",
"When our 'parallel' threads appear to run together, the interpreter is really just switching rapidly between them - only one is ever executing Python code at any given moment.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The reason is something called the **Global Interpreter Lock (GIL)**. Only one thread can have it, and you can only execute Python code when you have the GIL."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## So, does that mean Python threads are useless?\n",
"\n",
"No, not completely. They're useful for:\n",
"\n",
"- Making a user interface continue to respond while a calculation takes place in the background\n",
"- A web server handling multiple requests.\n",
" - *The GIL is not required while waiting for network connections*\n",
"- Doing calculations in parallel which are running in native (C/C++) code\n",
" - *The GIL is not required while running native code*\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### But for doing CPU-intensive Python calculations in parallel, yes Python threads are essentially useless\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Can multiprocessing help?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Differences between threads and processes\n",
"\n",
"- Threads are quicker to start up and generally require fewer resources\n",
"\n",
"- Threads share memory with the main process \n",
" - Don't need to copy your data to pass it to a thread\n",
" - Don't need to copy the output data back to the main program\n",
" \n",
"- Processes have their own memory space \n",
" - Data needs to be copied from the main program to the process\n",
" - Any output needs to be copied back\n",
" \n",
"- However, importantly for Python, *Each process has its own GIL so they can run at the same time as others*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Multiprocessing attempt\n",
"\n",
"Multiprocessing is normally more work than multithreading.\n",
"\n",
"However Python tries *very hard* to make multiprocessing as easy as multithreading.\n",
"\n",
"- `import multiprocessing` instead of `import threading`\n",
"- `multiprocessing.Process()` instead of `threading.Thread()`"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Starting worker for chunk 0\n",
"Starting worker for chunk 1\n",
"Starting worker for chunk 2\n",
"Starting worker for chunk 3\n",
"Data processing took 9.74 seconds\n"
]
}
],
"source": [
"import multiprocessing\n",
" \n",
"def multiprocess_process(data):\n",
" n_workers = 4\n",
" \n",
" # Split the data into chunks along axis 0\n",
" # We are assuming this axis is divisible by the number of workers!\n",
"print(\"Data processing took %.2f seconds\" % t)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Summary"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## What we've covered\n",
"\n",
" - Limitations of threading for parallel processing in Python\n",
" - How to split up a simple voxel-processing task into separate chunks\n",
" - `numpy.split()`\n",
" - How to run each chunk in parallel using multiprocessing\n",
" - `multiprocessing.Process`\n",
" - How to separate the number of tasks from the number of workers \n",
" - `multiprocessing.Pool()`\n",
" - `Pool.map()`\n",
" - How to get output back from the workers and join it back together again\n",
" - `numpy.concatenate()`\n",
" - How to pass back progress information from our worker processes\n",
" - `multiprocessing.manager.Queue()`\n",
" - Using a threading.Timer object to monitor the queue and display updates"
]
},
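{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a reminder, the Pool-based approach from the summary can be sketched roughly as follows (a minimal illustration assuming the `my_analysis` module above, not the exact code used earlier):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: Pool.map() hands chunks out to a fixed pool of workers and\n",
"# collects the return values in order\n",
"def pool_process(data, n_workers=4, n_chunks=8):\n",
"    chunks = numpy.split(data, n_chunks, axis=0)\n",
"    pool = multiprocessing.Pool(n_workers)\n",
"    # map() blocks until every chunk has been processed\n",
"    results = pool.map(my_analysis.calculate_data, chunks)\n",
"    # Join the processed chunks back together into a single array\n",
"    return numpy.concatenate(results)\n"
]
},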
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Things I haven't covered\n",
"\n",
"Loads of stuff!\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Threading\n",
"\n",
"- Locking of shared data (so only one thread can use it at a time)\n",
"- Thread-local storage (see `threading.local()`)\n",
"- See Paul's tutorial on the PyTreat GIT for more information\n",
"- Or see the `threading` Python documentation for full details"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Multiprocessing\n",
"\n",
"- Passing data *between* workers\n",
" - Can use `Queue` for one-way traffic\n",
" - Use `Pipe` for two-way communication between one worker and another\n",
" - May be required when your problem is not 'embarrasingly parallel'\n",
"- Sharing memory\n",
" - Way to avoid copying large amounts of data\n",
" - Look at `multiprocessing.Array`\n",
" - Need to convert Numpy array into a ctypes array\n",
" - Shared memory has pitfalls\n",
" - *Don't go here unless you have aready determined that data copying is a bottleneck*\n",
"- Running workers asynchronously\n",
" - So main program doesn't have to wait for them to finish\n",
" - Instead, a function is called every time a task is finished\n",
" - see `multiprocessing.apply_async()` for more information\n",
"- Error handling\n",
" - Needs a bit of care - very easy to 'lose' errors\n",
" - Workers should catch all exceptions\n",
" - And should return a value to signal when a task has failed\n",
" - Main program decides how to deal with it\n"
]
},
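{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, a shared-memory array might be set up along these lines (an untested sketch to show the idea - the names are illustrative):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: wrap a multiprocessing.Array in a Numpy array so workers can\n",
"# read/write the same memory without copying\n",
"shared = multiprocessing.Array('d', 16*16*16)  # 'd' = C double\n",
"shared_np = numpy.frombuffer(shared.get_obj()).reshape(16, 16, 16)\n"
]
},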
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"## Always remember\n",
"\n",
"**Python is not the best tool for every job!**\n",
"\n",
"If you are really after performance, consider implementing your algorithm in multi-threaded C/C++ and then create a Python interface.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.14"
}
},
"nbformat": 4,
"nbformat_minor": 2
}