"In this section we will explore how to write and/or retrieve our data from text files.\n",
"\n",
"Most of the functionality for reading/writing files and manipulating strings is available without any imports. However, you can find some additional functionality in the [`string`](https://docs.python.org/3.6/library/string.html) module.\n",
"\n",
"Most of the string functions are available as methods on string objects. This means that you can use the ipython autocomplete to check for them."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"empty_string = ''"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"empty_string. # after running the code block above, put your cursor behind the dot and press tab to get a list of methods"
"Single-line strings can be created in python using either single or double quotes"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"a_string = 'To be or not to be'\n",
"same_string = \"To be or not to be\"\n",
"print(a_string == same_string)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The main reason for choosing between single and double quotes is whether the string itself will contain any quotes. You can include a single quote in a string surrounded by single quotes by escaping it with the `\\` character:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"a_string = \"That's the question\"\n",
"same_string = 'That\\'s the question'\n",
"print(a_string == same_string)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"New-lines (`\\n`), tabs (`\\t`) and many other special characters are supported"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"a_string = \"This is the first line.\\nAnd here is the second.\\n\\tThe third starts with a tab.\"\n",
"print(a_string)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"However, the easiest way to create multi-line strings is to use triple quotes (again, either single or double quotes can be used):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"multi_line_string = \"\"\"This is the first line.\n",
"And here is the second.\n",
"\\tThird line starts with a tab.\"\"\"\n",
"print(multi_line_string)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you don't want python to interpret the `\\n`, `\\t`, etc. in your strings, you can prefix the opening quote with an `r`. Python will then treat the string as raw text."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"single_line_string = r\"This string is not multiline.\\nEven though it contains the \\n character\"\n",
"To encourage the spread of python around the world, python 3 switched to using unicode as the default for strings and code (which is one of the main reasons for the incompatibility between python 2 and 3). This means that any unicode characters can be used in strings (or in our code):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"Δ = \"café\"\n",
"print(Δ)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Python 3 uses UTF-8 encoding by default, although you can change this in any file (see [python documentation on encoding](https://docs.python.org/3/howto/unicode.html) for more details)\n",
"\n",
"In python 2 the string object was a simple array of bytes. You can create such a byte array from your unicode string in python 3 using the encode method"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"delta = \"Δ\"\n",
"print(delta, ' in python 2 would be represented as ', delta.encode())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"These byte arrays can be created directly by prefixing the opening quote with a `b`, which tells python 3 to interpret the following as a byte array:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"a_byte_array = b'\\xce\\xa9'\n",
"print('The bytes ', a_byte_array, ' become ', a_byte_array.decode(), ' with UTF-8 encoding')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Especially in code dealing with strings (e.g., reading/writing of files), many of the errors that arise when running python 2 code in python 3 come from mixing unicode strings with byte arrays. Decoding and/or encoding some of these objects can often fix these issues.\n",
"There are two functions to convert python objects into strings, `repr()` and `str()`.\n",
"All other functions that rely on string-representations of python objects will use one of these two (for example the `print()` function will call `str()` on the object).\n",
"\n",
"The goal of the `str()` function is to be readable, while the goal of `repr()` is to be unambiguous. For example"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"print(str(\"3\"))\n",
"print(str(3))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"While the output of both `str()` calls is very clear, we cannot tell whether the input was a string or an actual integer."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"print(repr(\"3\"))\n",
"print(repr(3))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that the output of the `repr()` function can be passed directly back to the python interpreter to recreate our string/integer.\n",
"Note that the string on which the `join()` method is called (`', '` in this case) is used to glue the different strings together. If you just want to concatenate the strings you can call `join()` on the empty string:"
"Using the techniques in [Combining strings](#combining-strings) we can build simple strings. For longer strings it is often useful to first write a template string with placeholders, into which variables are inserted later. Python currently has 4 built-in ways of doing this (with many packages providing similar capabilities):\n",
"* the recommended [new-style formatting](https://docs.python.org/3.6/library/string.html#format-string-syntax).\n",
"Here we provide a single example using the first three methods, so you can recognize them in the future.\n",
"\n",
"First the old print-f style. Note that this style is invoked by using the modulo (`%`) operator on the string. Every placeholder (starting with the `%`) is then replaced by one of the values provided."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"a = 3\n",
"b = 1 / 3\n",
"\n",
"print('%.3f = %i + %.3f' % (a + b, a, b))\n",
"print('%(total).3f = %(a)i + %(b).3f' % {'a': a, 'b': b, 'total': a + b})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then the recommended new style formatting (You can find a nice tutorial [here](https://www.digitalocean.com/community/tutorials/how-to-use-string-formatters-in-python-3)). Note that this style is invoked by calling the `format()` method on the string and the placeholders are marked by the curly braces `{}`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"a = 3\n",
"b = 1 / 3\n",
"\n",
"print('{:.3f} = {} + {:.3f}'.format(a + b, a, b))\n",
"Finally the new, fancy formatted string literals (only available in python 3.6+). This new format is very similar to the recommended style, except that all placeholders are automatically evaluated in the local environment at the time the template is defined. This means that we do not have to explicitly provide the parameters (and we can evaluate the sum inside the string!), although it does mean we also can not re-use the template."
"The simplest way to extract a sub-string is to use slicing"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"a_string = 'abcdefghijklmnopqrstuvwxyz'\n",
"print(a_string[10]) # the character at index 10 (indexing starts at 0, so this is the 11th character)\n",
"print(a_string[20:]) # the sub-string from index 20 onward\n",
"print(a_string[::-1]) # creating the reverse string"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you are not sure where to cut into a string, you can use the `find()` method to find the first occurrence of a sub-string, or `re.findall()` from the `re` module to find all occurrences."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"a_string = 'abcdefghijklmnopqrstuvwxyz'\n",
"index = a_string.find('fgh')\n",
"print(a_string[:index]) # extracts the sub-string up to the first occurrence of 'fgh'\n",
"print('index for non-existent sub-string', a_string.find('cats')) # note that find returns -1 when it can not find the sub-string rather than raising an error."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Regular expressions\n",
"Regular expressions are used for looking for specific patterns in a longer string. This can be used to extract specific information from a well-formatted string or to modify a string. In python regular expressions are available in the [re](https://docs.python.org/3/library/re.html#re-syntax) module.\n",
"\n",
"A full discussion of regular expressions goes far beyond this tutorial. If you are interested, have a look at the [Regular Expression HOWTO](https://docs.python.org/3/howto/regex.html).\n",
"\n",
"## Exercises\n",
"### Joining/splitting strings\n",
"go from 2 column file to 2 rows\n",
"### String formatting and regular expressions\n",
"Given a template for MRI files:\n",
"s<subject_id>/<modality>_<res>mm.nii.gz\n",
"where <subject_id> is a 6-digit subject-id, <modality> is one of T1w, T2w, or PD, and <res> is the resolution of the image (with at most one digit after the decimal point, e.g. 1.5)\n",
"Write a function that takes the subject_id (as an integer), the modality (as a string), and the resolution (as a float) and returns the complete filename (Hint: use one of the formatting techniques mentioned in [String formatting](#string-formatting))."
"For a more difficult exercise, write a function that extracts the subject id, modality, and resolution from a filename (using a regular expression or by using `find` and `split` to access relevant parts of the string)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def get_parameters(filename):\n",
" ...\n",
" return subject_id, modality, resolution"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.2"
},
"toc": {
"colors": {
"hover_highlight": "#DAA520",
"running_highlight": "#FF0000",
"selected_highlight": "#FFD700"
},
"moveMenuLeft": true,
"nav_menu": {
"height": "287px",
"width": "252px"
},
"navigate_menu": true,
"number_sections": true,
"sideBar": true,
"threshold": 4,
"toc_cell": false,
"toc_section_display": "block",
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 2
}
%% Cell type:markdown id: tags:
# Text input/output
In this section we will explore how to write and/or retrieve our data from text files.
Most of the functionality for reading/writing files and manipulating strings is available without any imports. However, you can find some additional functionality in the [`string`](https://docs.python.org/3.6/library/string.html) module.
Most of the string functions are available as methods on string objects. This means that you can use the ipython autocomplete to check for them.
%% Cell type:code id: tags:
``` python
empty_string = ''
```
%% Cell type:code id: tags:
``` python
empty_string. # after running the code block above, put your cursor behind the dot and press tab to get a list of methods
```
%% Cell type:markdown id: tags:
<a class="anchor" id="creating-new-strings"></a>
## Creating new strings
<a class="anchor" id="string-syntax"></a>
### String syntax
Single-line strings can be created in python using either single or double quotes
%% Cell type:code id: tags:
``` python
a_string = 'To be or not to be'
same_string = "To be or not to be"
print(a_string == same_string)
```
%% Cell type:markdown id: tags:
The main reason for choosing between single and double quotes is whether the string itself will contain any quotes. You can include a single quote in a string surrounded by single quotes by escaping it with the `\` character:
%% Cell type:code id: tags:
``` python
a_string = "That's the question"
same_string = 'That\'s the question'
print(a_string == same_string)
```
%% Cell type:markdown id: tags:
New-lines (`\n`), tabs (`\t`) and many other special characters are supported
%% Cell type:code id: tags:
``` python
a_string = "This is the first line.\nAnd here is the second.\n\tThe third starts with a tab."
print(a_string)
```
%% Cell type:markdown id: tags:
However, the easiest way to create multi-line strings is to use triple quotes (again, either single or double quotes can be used):
%% Cell type:code id: tags:
``` python
multi_line_string = """This is the first line.
And here is the second.
\tThird line starts with a tab."""
print(multi_line_string)
```
%% Cell type:markdown id: tags:
If you don't want python to interpret the `\n`, `\t`, etc. in your strings, you can prefix the opening quote with an `r`. Python will then treat the string as raw text.
%% Cell type:code id: tags:
``` python
single_line_string = r"This string is not multiline.\nEven though it contains the \n character"
print(single_line_string)
```
%% Cell type:markdown id: tags:
<a class="anchor" id="unicode-versus-bytes"></a>
#### unicode versus bytes
To encourage the spread of python around the world, python 3 switched to using unicode as the default for strings and code (which is one of the main reasons for the incompatibility between python 2 and 3). This means that any unicode characters can be used in strings (or in our code):
%% Cell type:code id: tags:
``` python
Δ = "café"
print(Δ)
```
%% Cell type:markdown id: tags:
Python 3 uses UTF-8 encoding by default, although you can change this in any file (see [python documentation on encoding](https://docs.python.org/3/howto/unicode.html) for more details)
In python 2 the string object was a simple array of bytes. You can create such a byte array from your unicode string in python 3 using the encode method
%% Cell type:code id: tags:
``` python
delta = "Δ"
print(delta, ' in python 2 would be represented as ', delta.encode())
```
%% Cell type:markdown id: tags:
These byte arrays can be created directly by prefixing the opening quote with a `b`, which tells python 3 to interpret the following as a byte array:
%% Cell type:code id: tags:
``` python
a_byte_array = b'\xce\xa9'
print('The bytes ', a_byte_array, ' become ', a_byte_array.decode(), ' with UTF-8 encoding')
```
%% Cell type:markdown id: tags:
Especially in code dealing with strings (e.g., reading/writing of files), many of the errors that arise when running python 2 code in python 3 come from mixing unicode strings with byte arrays. Decoding and/or encoding some of these objects can often fix these issues.
There are two functions to convert python objects into strings, `repr()` and `str()`.
All other functions that rely on string-representations of python objects will use one of these two (for example the `print()` function will call `str()` on the object).
The goal of the `str()` function is to be readable, while the goal of `repr()` is to be unambiguous. For example
%% Cell type:code id: tags:
``` python
print(str("3"))
print(str(3))
```
%% Cell type:markdown id: tags:
While the output of both `str()` calls is very clear, we cannot tell whether the input was a string or an actual integer.
%% Cell type:code id: tags:
``` python
print(repr("3"))
print(repr(3))
```
%% Cell type:markdown id: tags:
Note that the output of the `repr()` function can be passed directly back to the python interpreter to recreate our string/integer.
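As a minimal sketch of this round trip, the example below feeds the output of `repr()` back through `ast.literal_eval()`. That function is not mentioned in this tutorial; it is a standard-library helper that evaluates literals only, used here as a stand-in for typing the text into the interpreter. The round trip is only expected to work for simple built-in objects such as the ones shown.

```python
import ast

# A few simple built-in objects (chosen only for illustration)
for original in ("3", 3, 1.5, [1, "two"]):
    text = repr(original)             # unambiguous string representation
    rebuilt = ast.literal_eval(text)  # parse the representation back into an object
    print(text, type(rebuilt).__name__, rebuilt == original)
```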
<a class="anchor" id="combining-strings"></a>
### Combining strings
The simplest way to concatenate strings is to simply add them together:
%% Cell type:code id: tags:
``` python
a_string = "Part 1"
other_string = "Part 2"
full_string = a_string + ", " + other_string
print(full_string)
```
%% Cell type:markdown id: tags:
Given a whole sequence of strings, you can concatenate them together using the `join()` method:
Note that the string on which the `join()` method is called (`', '` in this case) is used to glue the different strings together. If you just want to concatenate the strings you can call `join()` on the empty string:
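A minimal sketch of both uses of `join()` described above; the list contents and variable names are made up for illustration:

```python
words = ["alpha", "beta", "gamma"]  # hypothetical example data

glued = ", ".join(words)        # ', ' is placed between the elements
concatenated = "".join(words)   # the empty string simply concatenates them

print(glued)
print(concatenated)
```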
Using the techniques in [Combining strings](#combining-strings) we can build simple strings. For longer strings it is often useful to first write a template string with placeholders, into which variables are inserted later. Python currently has 4 built-in ways of doing this (with many packages providing similar capabilities):
* the recommended [new-style formatting](https://docs.python.org/3.6/library/string.html#format-string-syntax).
Here we provide a single example using the first three methods, so you can recognize them in the future.
First the old print-f style. Note that this style is invoked by using the modulo (`%`) operator on the string. Every placeholder (starting with the `%`) is then replaced by one of the values provided.
%% Cell type:code id: tags:
``` python
a = 3
b = 1 / 3
print('%.3f = %i + %.3f' % (a + b, a, b))
print('%(total).3f = %(a)i + %(b).3f' % {'a': a, 'b': b, 'total': a + b})
```
%% Cell type:markdown id: tags:
Then the recommended new style formatting (You can find a nice tutorial [here](https://www.digitalocean.com/community/tutorials/how-to-use-string-formatters-in-python-3)). Note that this style is invoked by calling the `format()` method on the string and the placeholders are marked by the curly braces `{}`.
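A short sketch of this style, using the same values as the example above. The variable names `template` and `named` are only for illustration; the second form fills the placeholders by keyword rather than by position.

```python
a = 3
b = 1 / 3

template = '{:.3f} = {} + {:.3f}'        # placeholders are filled in order
print(template.format(a + b, a, b))

named = '{total:.3f} = {a} + {b:.3f}'    # placeholders are filled by name
print(named.format(a=a, b=b, total=a + b))
```

Because the template is an ordinary string, it can be stored and filled in again later with different values.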
Finally the new, fancy formatted string literals (only available in python 3.6+). This new format is very similar to the recommended style, except that all placeholders are automatically evaluated in the local environment at the time the template is defined. This means that we do not have to explicitly provide the parameters (and we can evaluate the sum inside the string!), although it does mean we also can not re-use the template.
%% Cell type:code id: tags:
``` python
a = 3
b = 1/3
print(f'{a + b:.3f} = {a} + {b:.3f} = {a + b}')
```
%% Cell type:markdown id: tags:
<a class="anchor" id="reading-writing-files"></a>
## Reading/writing files
## Extracting sub-strings from strings
### Splitting strings
The simplest way to extract a sub-string is to use slicing
%% Cell type:code id: tags:
``` python
a_string = 'abcdefghijklmnopqrstuvwxyz'
print(a_string[10]) # the character at index 10 (indexing starts at 0, so this is the 11th character)
print(a_string[20:]) # the sub-string from index 20 onward
print(a_string[::-1]) # creating the reverse string
```
%% Cell type:markdown id: tags:
If you are not sure where to cut into a string, you can use the `find()` method to find the first occurrence of a sub-string, or `re.findall()` from the `re` module to find all occurrences.
%% Cell type:code id: tags:
``` python
a_string = 'abcdefghijklmnopqrstuvwxyz'
index = a_string.find('fgh')
print(a_string[:index]) # extracts the sub-string up to the first occurrence of 'fgh'
print('index for non-existent sub-string', a_string.find('cats')) # note that find returns -1 when it can not find the sub-string rather than raising an error.
```
%% Cell type:markdown id: tags:
### Regular expressions
Regular expressions are used for looking for specific patterns in a longer string. This can be used to extract specific information from a well-formatted string or to modify a string. In python regular expressions are available in the [re](https://docs.python.org/3/library/re.html#re-syntax) module.
A full discussion of regular expressions goes far beyond this tutorial. If you are interested, have a look at the [Regular Expression HOWTO](https://docs.python.org/3/howto/regex.html).
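To give a flavour of what the `re` module looks like in use, here is a small sketch on a made-up string (the text and patterns are assumptions chosen for illustration, not part of the exercises below):

```python
import re

text = "sub-01 scanned at 1.5mm, sub-02 scanned at 2mm"  # hypothetical example string

# findall() returns every non-overlapping match; with one group it returns the group contents
subjects = re.findall(r"sub-(\d+)", text)
print(subjects)

# search() returns the first match (or None); group(1) is the part captured by the parentheses
match = re.search(r"at (\d+(?:\.\d+)?)mm", text)
if match is not None:
    resolution = float(match.group(1))
    print(resolution)
```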
## Exercises
### Joining/splitting strings
go from 2 column file to 2 rows
### String formatting and regular expressions
Given a template for MRI files:
s<subject_id>/<modality>_<res>mm.nii.gz
where <subject_id> is a 6-digit subject-id, <modality> is one of T1w, T2w, or PD, and <res> is the resolution of the image (with at most one digit after the decimal point, e.g. 1.5)
Write a function that takes the subject_id (as an integer), the modality (as a string), and the resolution (as a float) and returns the complete filename (Hint: use one of the formatting techniques mentioned in [String formatting](#string-formatting)).
For a more difficult exercise, write a function that extracts the subject id, modality, and resolution from a filename (using a regular expression or by using `find` and `split` to access relevant parts of the string)
To encourage the spread of python around the world, python 3 switched to using unicode as the default for strings and code (which is one of the main reasons for the incompatibility between python 2 and 3). This means that any unicode characters can be used in strings (or in our code):
To encourage the spread of python around the world, python 3 switched to using unicode as the default for strings and code (which is one of the main reasons for the incompatibility between python 2 and 3).
This means that each element in a string is a unicode character (using [UTF-8 encoding](https://docs.python.org/3/howto/unicode.html)), which can consist of one or more bytes.
The advantage is that any unicode characters can now be used in strings or in the code itself:
```
Δ = "café"
print(Δ)
```
Python 3 uses UTF-8 encoding by default, although you can change this in any file (see [python documentation on encoding](https://docs.python.org/3/howto/unicode.html) for more details)
In python 2 the string object was a simple array of bytes. You can create such a byte array from your unicode string in python 3 using the encode method
In python 2 each element in a string was a single byte rather than a potentially multi-byte character. You can create such a byte array from your unicode string in python 3 using the `encode()` method and convert it back using the `decode()` method.
```
delta = "Δ"
print(delta, ' in python 2 would be represented as ', delta.encode())
...
...
@@ -71,7 +72,7 @@ print(delta, ' in python 2 would be represented as ', delta.encode())
These byte arrays can be created directly by prefixing the opening quote with a `b`, which tells python 3 to interpret the following as a byte array:
```
a_byte_array = b'\xce\xa9'
print('The bytes ', a_byte_array, ' become ', a_byte_array.decode(), ' with UTF-8 encoding')
print('The two bytes ', a_byte_array, ' become single unicode character (', a_byte_array.decode(), ') with UTF-8 encoding')
```
Especially in code dealing with strings (e.g., reading/writing of files), many of the errors that arise when running python 2 code in python 3 come from mixing unicode strings with byte arrays. Decoding and/or encoding some of these objects can often fix these issues.
"\u001b[0;32m<ipython-input-2-f7378930c369>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 13\u001b[0m \u001b[0mspobj\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0msp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrun\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mfsldir\u001b[0m\u001b[0;34m+\u001b[0m\u001b[0;34m'/bin/fslstats'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0moutfile\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'-V'\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mstdout\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0msp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mPIPE\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 14\u001b[0m \u001b[0msout\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mspobj\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mstdout\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdecode\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'utf-8'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 15\u001b[0;31m \u001b[0mvol_vox\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mfloat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msout\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msplit\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 16\u001b[0m \u001b[0mvol_mm\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mfloat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msout\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msplit\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 17\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Volumes are: '\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mvol_vox\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m' in voxels and '\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mvol_mm\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m' in 
mm'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mIndexError\u001b[0m: list index out of range"
],
"output_type": "error"
}
],
"source": [
"#!/usr/bin/env fslpython\n",
"import os, sys\n",
...
...
@@ -235,9 +272,55 @@
"vol_mm = float(sout.split()[1])\n",
"print('Volumes are: ', vol_vox, ' in voxels and ', vol_mm, ' in mm')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {},
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.2"
},
"toc": {
"colors": {
"hover_highlight": "#DAA520",
"running_highlight": "#FF0000",
"selected_highlight": "#FFD700"
},
"moveMenuLeft": true,
"nav_menu": {
"height": "105px",
"width": "252px"
},
"navigate_menu": true,
"number_sections": true,
"sideBar": true,
"threshold": 4.0,
"toc_cell": false,
"toc_section_display": "block",
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 2
}
%% Cell type:markdown id: tags:
# Callable scripts in python
In this tutorial we will cover how to write simple stand-alone scripts in python that can be used as alternatives to bash scripts.
There are some code blocks within this webpage, but we recommend that you write the code in an IDE or editor instead and then run the scripts from a terminal.
## Basic script
The first line of a python script is usually:
%% Cell type:code id: tags:
```python
#!/usr/bin/env python
```
%% Cell type:markdown id: tags:
which invokes whichever version of python can be found by `/usr/bin/env` since python can be located in many different places.
For FSL scripts we use an alternative, to ensure that we pick up the version of python (and associated packages) that we ship with FSL. To do this we use the line:
%% Cell type:code id: tags:
```python
#!/usr/bin/env fslpython
```
%% Cell type:markdown id: tags:
After this line the rest of the file just uses regular python syntax, as in the other tutorials. Make sure you make the file executable - just like a bash script.
## Calling other executables
The most essential call that you need to use to replicate the way a bash script calls executables is `subprocess.run()`. A simple call looks like this:
%% Cell type:code id: tags:
```python
import subprocess as sp
sp.run(['ls', '-la'])
```
%% Cell type:markdown id: tags:
To suppress the output do this:
%% Cell type:code id: tags:
```python
spobj = sp.run(['ls'], stdout=sp.PIPE)
```
%% Cell type:markdown id: tags:
To store the output do this:
%% Cell type:code id: tags:
```python
spobj = sp.run('ls -la'.split(), stdout=sp.PIPE)
sout = spobj.stdout.decode('utf-8')
print(sout)
```
%% Cell type:markdown id: tags:
> Note that the `decode` call in the middle line converts the string from a byte string to a normal string. In Python 3 there is a distinction between strings (sequences of characters, possibly using multiple bytes to store each character) and bytes (sequences of bytes). The world has moved on from ASCII, so in this day and age, this distinction is absolutely necessary, and Python does a fairly good job of it.
If the output is numerical then this can be extracted like this:
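A minimal sketch of this, assuming the command prints whitespace-separated numbers. To keep it self-contained, a Python one-liner stands in for a real tool here (that stand-in command and the variable names `fields`, `first` and `second` are assumptions for illustration):

```python
import subprocess as sp
import sys

# Stand-in command (an assumption): a Python one-liner that prints two numbers
cmd = [sys.executable, '-c', 'print("120 960.5")']
spobj = sp.run(cmd, stdout=sp.PIPE)
sout = spobj.stdout.decode('utf-8')

fields = sout.split()    # split the output on whitespace
if len(fields) >= 2:     # guard against empty or unexpected output
    first = float(fields[0])
    second = float(fields[1])
    print(first, second)
else:
    print('Unexpected output:', repr(sout))
```

Checking the number of fields before indexing avoids an `IndexError` when the command fails and prints nothing.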
if cmd:  # avoids empty strings getting passed to sp.run()
    print('Running command: ', cmd)
    spobj = sp.run(cmd.split(), stdout=sp.PIPE)
    sout.append(spobj.stdout.decode('utf-8'))
```
%% Cell type:markdown id: tags:
## Command line arguments
The simplest way of dealing with command line arguments is to use the `sys` module, which gives access to an `argv` list:
%% Cell type:code id: tags:
```python
import sys
print(len(sys.argv))
print(sys.argv[0])
```
%% Cell type:markdown id: tags:
For more sophisticated argument parsing you can use `argparse` - good documentation and examples of this can be found on the web.
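As a rough sketch of what `argparse` usage can look like: all argument names, defaults and the description below are hypothetical, chosen only for illustration.

```python
import argparse

# All argument names below are hypothetical, chosen only for illustration
parser = argparse.ArgumentParser(description='Mask an image and report its volume.')
parser.add_argument('input', help='input image filename')
parser.add_argument('-t', '--threshold', type=float, default=0.5,
                    help='masking threshold (default: 0.5)')
parser.add_argument('-v', '--verbose', action='store_true',
                    help='print extra information')

# Passing a list parses it instead of sys.argv[1:], which is handy for experimenting
args = parser.parse_args(['image.nii.gz', '--threshold', '0.3'])
print(args.input, args.threshold, args.verbose)
```

In a real script you would call `parser.parse_args()` without arguments, so that the values are taken from the command line.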
## Example script
Here is a simple bash script (it masks an image and calculates volumes - just as a random example). DO NOT execute the code blocks here within the notebook/webpage: