Commit 05c0d1c6 authored by Paul McCarthy

Merge branch 'mnt/structuring' into 'main'

Update "structuring projects" practical to recommend `pyproject.toml`

See merge request !39
parents 98078b3c 06d0f37d
%% Cell type:markdown id: tags:
%% Cell type:markdown id:2ad24151 tags:
# Structuring a Python project
If you are writing code that you are sure will never be seen or used by
anybody else, then you can structure your project however you want, and you
can stop reading now.
If you are writing code that you are sure will never be seen or used by anybody else, then you can structure your project however you want, and you can stop reading now.
However, if you are intending to make your code available for others to use,
either as end users, or as a dependency of their own code, you will make their
lives much easier if you spend a little time organising your project
directory.
However, if you are intending to make your code available for others to use, either as end users, or as a dependency of their own code, you will make their lives much easier if you spend a little time organising your project directory.
* [Recommended project structure](#recommended-project-structure)
* [The `mypackage/` directory](#the-mypackage-directory)
* [`README`](#readme)
* [`LICENSE`](#license)
* [`requirements.txt`](#requirements-txt)
* [`setup.py`](#setup-py)
* [`pyproject.toml`](#pyproject-toml)
* [Appendix: Tests](#appendix-tests)
* [Appendix: Versioning](#appendix-versioning)
* [Include the version in your code](#include-the-version-in-your-code)
* [Deprecate, don't remove!](#deprecate-dont-remove)
* [Appendix: Cookiecutter](#appendix-cookiecutter)
* [Appendix: Project management tools](#appendix-project-management)
* [Hatch](#hatch)
* [Poetry](#poetry)
* [Cookiecutter](#cookiecutter)
Official documentation:
https://packaging.python.org/tutorials/distributing-packages/
Official documentation: https://packaging.python.org/en/latest/
<a class="anchor" id="recommended-project-structure"></a>
## Recommended project structure
A Python project directory should, at the very least, have a structure that
resembles the following:
A Python project directory should, at the very least, have a structure that resembles the following:
> ```
> myproject/
>     mypackage/
>         __init__.py
>         mymodule.py
>     README
>     LICENSE
>     requirements.txt
>     setup.py
> myproject/
>     mypackage/
>         __init__.py
>         mymodule.py
>     README
>     LICENSE
>     requirements.txt
>     pyproject.toml
> ```
> This example structure is in the `structuring_projects/example_project/` sub-directory - have a look through it if you like.
Another popular option is to store your source code within a `src/` directory, like so:
> ```
> myproject/
>     src/
>         mypackage/
>             __init__.py
>             mymodule.py
>     README
>     LICENSE
>     pyproject.toml
> ```
This structure can be useful if you would like to keep your source code separated from other files in your project (e.g. tests, utility scripts, third-party libraries, etc.).
This example structure is in the `structuring_projects/example_project/`
sub-directory - have a look through it if you like.
> It is important to note that the `myproject/` and `src/` directories are **not** a part of your Python package namespace - see [below](#the-mypackage-directory) for more details.
<a class="anchor" id="the-mypackage-directory"></a>
### The `mypackage/` directory
The first thing you should do is make sure that all of your python code is
organised into a sensibly-named
[*package*](https://docs.python.org/3/tutorial/modules.html#packages). This
is important, because it greatly reduces the possibility of naming collisions
when people install your library alongside other libraries. Hands up those of
you who have ever written a file called `utils.[py|m|c|cpp]`!
The first thing you should do is make sure that all of your python code is organised into a sensibly-named [*package*](https://docs.python.org/3/tutorial/modules.html#packages). This is important, because it greatly reduces the possibility of naming collisions when people install your library alongside other libraries. Hands up those of you who have ever written a file called `utils.[py|m|c|cpp]`!
Check out the `advanced_programming/modules_and_packages.ipynb` practical for
more details on packages in Python.
When a Python package is installed into an end user's system, the package files are installed into a directory alongside all of the other packages that the user has installed. If two packages have used the same package or file names, it won't be possible to install both of those packages into the same environment. So it is important that you organise your python files into a package with a unique name, which is unlikely to collide with any other Python packages.
Check out the `advanced_programming/modules_and_packages.ipynb` practical for more details on packages in Python.
<a class="anchor" id="readme"></a>
### `README`
Every project should have a README file. This is simply a plain text file
which describes your project and how to use it. It is common and acceptable
for a README file to be written in plain text,
[reStructuredText](http://www.sphinx-doc.org/en/stable/rest.html)
(`README.rst`), or
[markdown](https://guides.github.com/features/mastering-markdown/)
(`README.md`).
Every project should have a `README` file. This is simply a plain text file which describes your project and how to use it. It is common and acceptable for a README file to be written in plain text, [reStructuredText](http://www.sphinx-doc.org/en/stable/rest.html) (`README.rst`), or [markdown](https://guides.github.com/features/mastering-markdown/) (`README.md`).
<a class="anchor" id="license"></a>
### `LICENSE`
Having a LICENSE file makes it easy for people to understand the constraints
under which your code can be used.
<a class="anchor" id="requirements-txt"></a>
### `requirements.txt`
Having a `LICENSE` file makes it easy for people to understand the constraints under which your code can be used.
This file is not strictly necessary, but is very common in Python projects.
It contains a list of the Python-based dependencies of your project, in a
standardised syntax. You can specify the exact version, or range of versions,
that your project requires. For example:
<a class="anchor" id="pyproject-toml"></a>
### `pyproject.toml`
This is the most important file (apart from your code, of course). The `pyproject.toml` file contains all of the information that is needed to turn your Python project into a package that can be published. This file is where you put information such as the package name, version number, and dependencies, along with information about how your package should be built.
> ```
> six==1.*
> numpy==1.*
> scipy>=0.18
> nibabel==2.*
> ```
> Python packages used to be built using a file called `setup.py`. `setup.py`-based packages are still very common, and it is sometimes necessary to include a `setup.py` file. But `pyproject.toml` should be sufficient for most projects.
If your project has optional dependencies, i.e. libraries which are not
critical but, if present, will allow your project to offer some extra
features, you can list them in a separate requirements file called, for
example, `requirements-extra.txt`.
`pyproject.toml` is a text file written in a format called [TOML](https://toml.io/en/), and which contains metadata describing your project. There are many ways of writing a `pyproject.toml` file - here we are just going to present one method (you can find the complete file in `structuring_projects/example_project/pyproject.toml`).
Having all your dependencies listed in a file in this way makes it easy for
others to install the dependencies needed by your project, simply by running:
A `pyproject.toml` file contains a range of different sections. The `[project]` section contains basic information about your project. This section should be fairly self-explanatory, with the exception of `dynamic = ["version"]`, which is explained below.
> ```
> pip install -r requirements.txt
> [project]
> name = "example-project"
> dynamic = ["version"]
> description = "Example Python project for PyTreat"
> readme = {file = "README.md", content-type="text/markdown"}
> license = {text = "Apache License Version 2.0"}
> requires-python = ">=3.8"
> authors = [{name = "Paul McCarthy", email = "pauldmccarthy@gmail.com"}]
> dependencies = [
> "numpy",
> "nibabel",
> "scipy"
> ]
> ```
<a class="anchor" id="setup-py"></a>
### `setup.py`
If your project provides scripts that you would like to have installed as command-line tools, you can list them inside `[project.scripts]`:
This is the most important file (apart from your code, of course). Python
projects are installed using
[`setuptools`](https://setuptools.readthedocs.io/en/latest/), which is used
internally during both the creation of, and installation of Python libraries.
> ```
> [project.scripts]
> myscript = "mypackage.mymodule:main"
> ```
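To make the `module:function` syntax concrete, here is a rough sketch of how such an entry point string can be resolved to a callable. This is only an illustration, not the code that installers actually generate. `load_entry_point` is a hypothetical helper, and the standard-library target `json:dumps` is used in place of `mypackage.mymodule:main` so that the sketch runs without the example project being installed.

```python
import importlib


def load_entry_point(spec):
    """Resolve a 'package.module:function' string to the named callable.

    Hypothetical helper, for illustration only.
    """
    module_name, _, function_name = spec.partition(':')
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# 'json:dumps' stands in for 'mypackage.mymodule:main'.
dumps = load_entry_point('json:dumps')
print(dumps([1, 2]))
```

Once the project is installed, the `myscript` command behaves roughly as if it had looked up `main` in this way and then called it.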
The `setup.py` file in a Python project is akin to a `Makefile` in a C/C++
project. But `setup.py` is also the location where you can define project
metadata (e.g. name, author, URL, etc) in a standardised format and, if
necessary, customise aspects of the build process for your library.
Python allows you to use different "build systems" to build your package. For most packages, `setuptools` is the recommended approach, and you can specify this in the `[build-system]` section:
You generally don't need to worry about, or interact with `setuptools` at all.
With one exception - `setup.py` is a Python script, and its main job is to
call the `setuptools.setup` function, passing it information about your
project.
> ```
> [build-system]
> requires = ["setuptools"]
> build-backend = "setuptools.build_meta"
> ```
The `setup.py` for our example project might look like this:
You need to tell `setuptools` which Python files to include in the package - this can usually be accomplished with just two lines:
> ```
> #!/usr/bin/env python
>
> from setuptools import setup
> from setuptools import find_packages
>
> # Import version number from
> # the project package (see
> # the section on versioning).
> from mypackage import __version__
>
> # Read in requirements from
> # the requirements.txt file.
> with open('requirements.txt', 'rt') as f:
>     requirements = [l.strip() for l in f.readlines()]
>
> # Generate a list of all of the
> # packages that are in your project.
> packages = find_packages()
>
> setup(
>
> name='Example project',
> description='Example Python project for PyTreat',
> url='https://git.fmrib.ox.ac.uk/fsl/win-pytreat/',
> author='Paul McCarthy',
> author_email='pauldmccarthy@gmail.com',
> license='Apache License Version 2.0',
>
> packages=packages,
>
> version=__version__,
>
> install_requires=requirements,
>
> classifiers=[
> 'Development Status :: 3 - Alpha',
> 'Intended Audience :: Developers',
> 'License :: OSI Approved :: Apache Software License',
> 'Programming Language :: Python :: 2.7',
> 'Programming Language :: Python :: 3.4',
> 'Programming Language :: Python :: 3.5',
> 'Programming Language :: Python :: 3.6',
> 'Topic :: Software Development :: Libraries :: Python Modules'],
> )
> [tool.setuptools.packages.find]
> include = ["mypackage*"]
> ```
Finally, you can tell `setuptools` that your project version number is defined as an attribute called `__version__`, inside the `mypackage/__init__.py` file.
> ```
> [tool.setuptools.dynamic]
> version = {attr = "mypackage.__version__"}
> ```
The `setup` function gets passed all of your project's metadata, including its
version number, dependencies, and licensing information. The `classifiers`
argument should contain a list of
[classifiers](https://pypi.python.org/pypi?%3Aaction=list_classifiers) which
are applicable to your project. Classifiers are purely for descriptive
purposes - they can be used to aid people in finding your project on
[`PyPI`](https://pypi.python.org/pypi), if you release it there.
You also have the option of defining your version number directly in the `pyproject.toml` file - this is as simple as adding `version = "<current-version>"` to the `[project]` section. See [below](#appendix-versioning) for more information on managing version numbers.
See
[here](https://packaging.python.org/tutorials/distributing-packages/#setup-args)
for more details on `setup.py` and the `setup` function.
Refer to the [Python packaging guide](https://packaging.python.org/en/latest/guides/writing-pyproject-toml/) for more details on writing your `pyproject.toml` file.
<a class="anchor" id="appendix-tests"></a>
## Appendix: Tests
There are no strict rules for where to put your tests (you have tests,
right?). There are two main conventions:
There are no strict rules for where to put your tests (you have tests, right?). There are two main conventions:
You can store your test files *inside* your package directory:
> ```
> myproject/
>     mypackage/
>         __init__.py
>         mymodule.py
>         tests/
>             __init__.py
>             test_mymodule.py
> ```
Or, you can store your test files *alongside* your package directory:
> ```
> myproject/
>     mypackage/
>         __init__.py
>         mymodule.py
>     tests/
>         test_mymodule.py
> ```
If you want your test code to be completely independent of your project's
code, then go with the second option. However, if you would like your test
code to be distributed as part of your project (e.g. so that end users can run
them), then the first option is probably the best.
If you want your test code to be completely independent of your project's code, then go with the second option. However, if you would like your test code to be distributed as part of your project (e.g. so that end users can run them), then the first option is probably the best.
But in the end, the standard Python unit testing frameworks
([`pytest`](https://docs.pytest.org/en/latest/) and
[`unittest`](https://docs.python.org/3/library/unittest.html)) are pretty good
at finding your test functions no matter where you've hidden them, so the
choice is really up to you.
But in the end, the standard Python unit testing frameworks ([`pytest`](https://docs.pytest.org/en/latest/) and [`unittest`](https://docs.python.org/3/library/unittest.html)) are pretty good at finding your test functions no matter where you've hidden them, so the choice is really up to you.
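As a minimal sketch of what a `test_mymodule.py` file might contain, here is a `pytest`-style test for the `myfunction` function from the example project. In a real project the function would be imported (e.g. `from mypackage.mymodule import myfunction`); it is redefined here only so that the sketch runs on its own, and the specific test values are illustrative.

```python
# Stand-in for "from mypackage.mymodule import myfunction",
# so that this sketch is self-contained.
def myfunction(a, b):
    return a * b


# pytest collects functions whose names start with "test_"
# and reports any failing assert statements.
def test_myfunction():
    assert myfunction(2, 3) == 6
    assert myfunction(0.5, 4) == 2.0
    assert myfunction(-1, 1) == -1
```

Running `pytest` from the project directory should find and run this test, whichever of the two layouts you choose.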
<a class="anchor" id="appendix-versioning"></a>
## Appendix: Versioning
If you are intending to make your project available for public use (e.g. on
[PyPI](https://pypi.python.org/pypi) and/or
[conda](https://anaconda.org/anaconda/repo)), it is **very important** to
manage the version number of your project. If somebody decides to build their
software on top of your project, they are not going to be very happy with you
if you make substantial, API-breaking changes without changing your version
number in an appropriate manner.
If you are intending to make your project available for public use (e.g. on [PyPI](https://pypi.python.org/pypi) and/or [conda-forge](https://anaconda.org/conda-forge/)), it is **very important** to manage the version number of your project. If somebody decides to build their software on top of your project, they are not going to be very happy with you if you make substantial, API-breaking changes without changing your version number in an appropriate manner.
Python has [official standards](https://www.python.org/dev/peps/pep-0440/) on
what constitutes a valid version number. These standards can be quite
complicated but, in the vast majority of cases, a simple three-number
versioning scheme comprising *major*, *minor*, and *patch* release
numbers should suffice. Such a version number has the form:
Python has [official standards](https://www.python.org/dev/peps/pep-0440/) on what constitutes a valid version number. These standards can be quite complicated but, in the vast majority of cases, a simple three-number versioning scheme comprising *major*, *minor*, and *patch* release numbers should suffice. Such a version number has the form:
> ```
> major.minor.patch
> ```
For example, a version number of `1.3.2` has a _major_ release of 1, _minor_
release of 3, and a _patch_ release of 2.
For example, a version number of `1.3.2` has a _major_ release of 1, _minor_ release of 3, and a _patch_ release of 2.
If you follow some simple and rational guidelines for versioning
`your_project`, then people who use your project can, for instance, specify
that they depend on `your_project==1.*`, and be sure that their code will work
for *any* version of `your_project` with a major release of 1. Following these
simple guidelines greatly improves software interoperability, and makes
everybody (i.e. developers of other projects, and end users) much happier!
If you follow some simple and rational guidelines for versioning `your_project`, then people who use your project can, for instance, specify that they depend on `your_project==1.*`, and be sure that their code will work for *any* version of `your_project` with a major release of 1. Following these simple guidelines greatly improves software interoperability, and makes everybody (i.e. developers of other projects, and end users) much happier!
Many modern Python projects use some form of [*semantic
versioning*](https://semver.org/). Semantic versioning is simply a set of
guidelines on how to manage your version number:
Many modern Python projects use some form of [*semantic versioning*](https://semver.org/). Semantic versioning is simply a set of guidelines on how to manage your version number:
- The *major* release number should be incremented whenever you introduce any
backwards-incompatible changes. In other words, if you change your code
such that some other code which uses your code would break, you should
increment the major release number.
- The *major* release number should be incremented whenever you introduce any backwards-incompatible changes. In other words, if you change your code such that some other code which uses your code would break, you should increment the major release number.
- The *minor* release number should be incremented whenever you add any new
(backwards-compatible) features to your project.
- The *minor* release number should be incremented whenever you add any new (backwards-compatible) features to your project.
- The *patch* release number should be incremented for backwards-compatible
bug-fixes and other minor changes.
- The *patch* release number should be incremented for backwards-compatible bug-fixes and other minor changes.
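The three rules above can be sketched as a small function. This is a simplified illustration which assumes a plain `major.minor.patch` version string (no pre-release or build suffixes); `bump_version` is a hypothetical name, not part of any of the tools mentioned here.

```python
def bump_version(version, part):
    """Return a new 'major.minor.patch' string with one part incremented.

    Hypothetical helper - the lower-order numbers are reset to zero,
    as is the convention in semantic versioning.
    """
    major, minor, patch = (int(n) for n in version.split('.'))
    if part == 'major':
        return f'{major + 1}.0.0'
    if part == 'minor':
        return f'{major}.{minor + 1}.0'
    if part == 'patch':
        return f'{major}.{minor}.{patch + 1}'
    raise ValueError(f'Unknown version part: {part}')


print(bump_version('1.3.2', 'major'))  # backwards-incompatible change
print(bump_version('1.3.2', 'minor'))  # new backwards-compatible feature
print(bump_version('1.3.2', 'patch'))  # backwards-compatible bug fix
```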
If you like to automate things,
[`bumpversion`](https://github.com/peritus/bumpversion) is a simple tool that
you can use to help manage your version number.
If you like to automate things, you may want to check out [`bumpversion`](https://github.com/peritus/bumpversion) and [`versioneer`](https://github.com/python-versioneer/python-versioneer), both of which are simple tools that you can use to help manage your version number.
<a class="anchor" id="include-the-version-in-your-code"></a>
### Include the version in your code
While the version of a library is ultimately defined in `setup.py`, it is
standard practice for a Python library to contain a version string called
`__version__` in the `__init__.py` file of the top-level package. For example,
our `example_project/mypackage/__init__.py` file contains this line:
While the version of a library is ultimately defined in `pyproject.toml`, it is common for a Python library to contain a version string called `__version__` in the `__init__.py` file of the top-level package. For example, our `example_project/mypackage/__init__.py` file contains this line:
> ```
> __version__ = '0.1.0'
> ```
This makes a library's version number programmatically accessible and
queryable.
This makes a library's version number programmatically accessible and queryable.
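For instance, code which uses a library can inspect `__version__` at run time. The sketch below uses a stand-in module object in place of `import mypackage`, so that it runs without the example project being installed; `get_version` is a hypothetical helper.

```python
import types


def get_version(package):
    """Return the package's __version__ string, or None if it has none."""
    return getattr(package, '__version__', None)


# Stand-in for "import mypackage", so that this sketch is self-contained.
mypackage = types.ModuleType('mypackage')
mypackage.__version__ = '0.1.0'

# Split the version string into its major, minor and patch numbers.
major, minor, patch = (int(n) for n in get_version(mypackage).split('.'))
print(major, minor, patch)
```

For packages which have been installed, the standard library function `importlib.metadata.version` is another way to look up the version number.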
<a class="anchor" id="deprecate-dont-remove"></a>
### Deprecate, don't remove!
If you really want to change your API, but can't bring yourself to increment
your major release number, consider
[*deprecating*](https://en.wikipedia.org/wiki/Deprecation#Software_deprecation)
the old API, and postponing its removal until you are ready for a major
release. This will allow you to change your API, but retain
backwards-compatibility with the old API until it can safely be removed at the
next major release.
If you really want to change your API, but can't bring yourself to increment your major release number, consider [*deprecating*](https://en.wikipedia.org/wiki/Deprecation#Software_deprecation) the old API, and postponing its removal until you are ready for a major release. This will allow you to change your API, but retain backwards-compatibility with the old API until it can safely be removed at the next major release.
You can use the built-in [`warnings`](https://docs.python.org/3.5/library/exceptions.html#DeprecationWarning) module to warn about uses of deprecated items. There are also some [third-party libraries](https://github.com/briancurtin/deprecation) which make it easy to mark a function, method or class as being deprecated.
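Here is a minimal sketch of the first approach, using only the standard library. The `deprecated` decorator and the `multiply` function are hypothetical names chosen for illustration; `myfunction` plays the role of the old API, which is kept working until the next major release.

```python
import functools
import warnings


def deprecated(message):
    """Decorator which emits a DeprecationWarning when the function is called."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 makes the warning point at the
            # calling code, rather than at this wrapper.
            warnings.warn(f'{func.__name__} is deprecated: {message}',
                          category=DeprecationWarning,
                          stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorator


def multiply(a, b):
    """The new API."""
    return a * b


@deprecated('use multiply instead')
def myfunction(a, b):
    """The old API, retained for backwards-compatibility."""
    return multiply(a, b)
```

Note that Python hides `DeprecationWarning` in many situations by default, so you may need to run with `python -W default` (or run your code through a test framework such as `pytest`) to see these warnings.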
<a class="anchor" id="appendix-project-management"></a>
## Appendix: Python project management tools
There are many tools available that can help you to manage your Python projects - this section mentions just a few that you might find useful. There is no single method of organising or managing your project - each of these tools does things in a slightly different way, so you may want to try each of them and choose the one that you like the best.
You can use the built-in
[`warnings`](https://docs.python.org/3.5/library/exceptions.html#DeprecationWarning)
module to warn about uses of deprecated items. There are also some
[third-party libraries](https://github.com/briancurtin/deprecation) which make
it easy to mark a function, method or class as being deprecated.
<a class="anchor" id="hatch"></a>
### Hatch
<a class="anchor" id="appendix-cookiecutter"></a>
## Appendix: Cookiecutter
https://hatch.pypa.io/
It is worth mentioning
[Cookiecutter](https://github.com/audreyr/cookiecutter), a little utility
program which you can use to generate a skeleton file/directory structure for
a new Python project.
Hatch is a "Python project manager", which you can use to manage local development of your Python projects, and also to manage Python installations, virtual environments and dependencies.
Hatch has a sub-command called `new`, which will generate a project template for you:
You need to give it a template (there are many available templates, including
for projects in languages other than Python) - a couple of useful templates
are the [minimal Python package
template](https://github.com/kragniz/cookiecutter-pypackage-minimal), and the
[full Python package
template](https://github.com/audreyr/cookiecutter-pypackage) (although the
latter is probably overkill for most).
> ```
> hatch new "My project"
> ```
This command will create a project template with the following structure:
> ```
> my-project
> ├── src
> │   └── my_project
> │       ├── __about__.py
> │       └── __init__.py
> ├── tests
> │   └── __init__.py
> ├── LICENSE.txt
> ├── README.md
> └── pyproject.toml
> ```
<a class="anchor" id="poetry"></a>
### Poetry
https://python-poetry.org/
Poetry is an alternative to Hatch, which has commands for managing Python installations, virtual environments and dependencies. Poetry also has a `new` command which will create a project template for you, e.g.:
> ```
> poetry new my-project
> ```
will create the following template:
> ```
> my-project
> ├── pyproject.toml
> ├── README.md
> ├── my_project
> │   └── __init__.py
> └── tests
>     └── __init__.py
> ```
<a class="anchor" id="cookiecutter"></a>
### Cookiecutter
Both Hatch and Poetry are sophisticated tools which can do much more than creating project templates. They are both also quite opinionated in how they expect you to work, which may not suit your needs. Therefore, it is worth mentioning [Cookiecutter](https://github.com/cookiecutter/cookiecutter), which is a simple program that you can use to generate a skeleton file/directory structure for a new Python project.
You need to give it a template (there are many available templates, including for projects in languages other than Python) - a couple of useful templates are the [minimal Python package template](https://github.com/florian-huber/minimal-python-template), and the [full Python package template](https://github.com/audreyr/cookiecutter-pypackage) (although the latter is probably overkill for most).
Here is how to create a skeleton project directory based off the minimal
Python package template:
Here is how to create a skeleton project directory based off the minimal Python package template:
> ```
> cookiecutter gh:florian-huber/minimal-python-template
> ```
Cookiecutter will then prompt you for basic information (e.g. project name, author name/email):
> ```
> [1/11] directory_name (my-python-project): example-project
> [2/11] package_name (my_python_package): mypackage
> [3/11] package_short_description (Short description of package): Example project
> [4/11] line_length (120):
> [5/11] version (0.1.0):
> [6/11] github_organization (<my-github-organization>): pauldmccarthy
> [7/11] Select license
> 1 - Apache Software License 2.0
> 2 - MIT license
> 3 - BSD license
> 4 - ISC license
> 5 - GNU General Public License v3 or later
> 6 - Not open source
> Choose from [1/2/3/4/5/6] (1): 1
> [8/11] full_name (Alice Bob): Paul McCarthy
> [9/11] email (yourname@hs-duesseldorf.de): pauldmccarthy@gmail.com
> [10/11] copyright_holder (you or others or ZDD Duesseldorf?):
> [11/11] code_of_conduct_email (pauldmccarthy@gmail.com):
> ```
And will then create a new directory containing the project skeleton:
> ```
> pip install cookiecutter
>
> # tell cookiecutter to create a directory
> # from the pypackage-minimal template
> cookiecutter https://github.com/kragniz/cookiecutter-pypackage-minimal.git
>
> # cookiecutter will then prompt you for
> # basic information (e.g. project name,
> # author name/email), and then create a
> # new directory containing the project
> # skeleton.
> example-project
> ├── CODE_OF_CONDUCT.md
> ├── LICENSE
> ├── mypackage
> │   ├── __init__.py
> │   └── my_module.py
> ├── pyproject.toml
> ├── README.md
> └── tests
>     ├── __init__.py
>     └── test_my_module.py
> ```
......
This diff is collapsed.
Example project
===============
# Example project
This is an example project, used to demonstrate the basics of how to structure
......
#!/usr/bin/env python

import sys


def myfunction(a, b):
    return a * b


def main():
    # Expect exactly two command-line arguments.
    if len(sys.argv) != 3:
        print('Usage: myscript a b')
        sys.exit(1)
    a = float(sys.argv[1])
    b = float(sys.argv[2])
    print(myfunction(a, b))
[project]
name = "example-project"
dynamic = ["version"]
description = "Example Python project for PyTreat"
readme = {file = "README.md", content-type="text/markdown"}
license = {text = "Apache License Version 2.0"}
requires-python = ">=3.8"
authors = [{name = "Paul McCarthy", email = "pauldmccarthy@gmail.com"}]
dependencies = [
"numpy",
"nibabel",
"scipy"
]
[project.urls]
Repository = "https://git.fmrib.ox.ac.uk/fsl/win-pytreat/"
# The "mypackage.mymodule.main" function is installed
# as a command-line program called "myscript".
[project.scripts]
myscript = "mypackage.mymodule:main"
# Use setuptools to build this package
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
# Specify which Python packages to include
[tool.setuptools.packages.find]
include = ["mypackage*"]
# The version number is inside mypackage/__init__.py
[tool.setuptools.dynamic]
version = {attr = "mypackage.__version__"}
\ No newline at end of file
numpy==1.*
#!/usr/bin/env python
from setuptools import setup
from setuptools import find_packages
# Import version number from
# the project package (see
# the section on versioning).
from mypackage import __version__
# Read in requirements from
# the requirements.txt file.
with open('requirements.txt', 'rt') as f:
    requirements = [l.strip() for l in f.readlines()]
# Generate a list of all of the
# packages that are in your project.
packages = find_packages()
setup(
name='Example project',
description='Example Python project for PyTreat',
url='https://git.fmrib.ox.ac.uk/fsl/pytreat-practicals-2020/',
author='Paul McCarthy',
author_email='pauldmccarthy@gmail.com',
license='Apache License Version 2.0',
packages=packages,
version=__version__,
install_requires=requirements,
classifiers=[
'Development Status :: 3 - Alpha',
'Intended Audience :: Developers',
'License :: OSI Approved :: Apache Software License',
'Programming Language :: Python :: 2.7',
'Programming Language :: Python :: 3.4',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
'Topic :: Software Development :: Libraries :: Python Modules'],
)