If you are writing code that you are sure will never be seen or used by anybody else, then you can structure your project however you want, and you can stop reading now.
However, if you are intending to make your code available for others to use, either as end users, or as a dependency of their own code, you will make their lives much easier if you spend a little time organising your project directory.
A Python project directory should, at the very least, have a structure that resembles the following:
> ```
> myproject/
>     mypackage/
>         __init__.py
>         mymodule.py
>     README
>     LICENSE
>     requirements.txt
>     pyproject.toml
> ```
> This example structure is in the `structuring_projects/example_project/` sub-directory - have a look through it if you like.
Another popular option is to store your source code within a `src/` directory, like so:
> ```
> myproject/
>     src/
>         mypackage/
>             __init__.py
>             mymodule.py
>     README
>     LICENSE
>     pyproject.toml
> ```
This structure can be useful if you would like to keep your source code separated from other files in your project (e.g. tests, utility scripts, third-party libraries, etc.).
> It is important to note that the `myproject/` and `src/` directories are **not** a part of your Python package namespace - see [below](#the-mypackage-directory) for more details.
<a class="anchor" id="the-mypackage-directory"></a>
### The `mypackage/` directory
The first thing you should do is make sure that all of your Python code is organised into a sensibly-named [*package*](https://docs.python.org/3/tutorial/modules.html#packages). This is important, because it greatly reduces the possibility of naming collisions when people install your library alongside other libraries. Hands up those of you who have ever written a file called `utils.[py|m|c|cpp]`!
When a Python package is installed into an end user's system, the package files are installed into a directory alongside all of the other packages that the user has installed. If two packages use the same package or file names, they cannot reliably be installed into the same environment, because the files from one can overwrite or shadow the files from the other. So it is important that you organise your Python files into a package with a unique name, which is unlikely to collide with any other Python packages.
Check out the `advanced_programming/modules_and_packages.ipynb` practical for more details on packages in Python.
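To make the namespace point concrete, here is a minimal sketch that builds a throwaway copy of the `myproject/mypackage/` layout in a temporary directory and imports from it. The `greet` function is hypothetical, and putting the project directory on `sys.path` is only a stand-in for what installing the package would normally do for you.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of the layout:
#   myproject/mypackage/__init__.py
#   myproject/mypackage/mymodule.py
project_dir = Path(tempfile.mkdtemp()) / "myproject"
package_dir = project_dir / "mypackage"
package_dir.mkdir(parents=True)
(package_dir / "__init__.py").write_text("__version__ = '0.1.0'\n")
# greet() is a hypothetical function, used only for this illustration
(package_dir / "mymodule.py").write_text("def greet():\n    return 'hello'\n")

# Imitate an installation by making the *project* directory importable.
sys.path.insert(0, str(project_dir))
importlib.invalidate_caches()

mymodule = importlib.import_module("mypackage.mymodule")

# The dotted module name starts at mypackage; myproject is not part of it.
print(mymodule.__name__)
print(mymodule.greet())
```

Note that the import is written as `mypackage.mymodule`, and never as `myproject.mypackage.mymodule`.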
<a class="anchor" id="readme"></a>
### `README`
Every project should have a `README` file. This is simply a plain text file which describes your project and how to use it. It is common and acceptable for a README file to be written in plain text, [reStructuredText](http://www.sphinx-doc.org/en/stable/rest.html) (`README.rst`), or [Markdown](https://guides.github.com/features/mastering-markdown/) (`README.md`).
<a class="anchor" id="license"></a>
### `LICENSE`
Having a `LICENSE` file makes it easy for people to understand the constraints under which your code can be used.
<a class="anchor" id="pyproject-toml"></a>
### `pyproject.toml`
This is the most important file (apart from your code, of course). The `pyproject.toml` file contains all of the information that is needed to turn your Python project into a package that can be published. This file is where you put information such as the package name, version number, and dependencies, along with information about how your package should be built.
> Python packages used to be built using a file called `setup.py`. `setup.py`-based packages are still very common, and it is sometimes necessary to include a `setup.py` file. But `pyproject.toml` should be sufficient for most projects.
`pyproject.toml` is a text file, written in a format called [TOML](https://toml.io/en/), which contains metadata describing your project. There are many ways of writing a `pyproject.toml` file - here we are just going to present one method (you can find the complete file in `structuring_projects/example_project/pyproject.toml`).
A `pyproject.toml` file contains a range of different sections. The `[project]` section contains basic information about your project. This section should be fairly self-explanatory, with the exception of `dynamic = ["version"]`, which is explained below.
> ```
> [project]
> name = "example-project"
> dynamic = ["version"]
> description = "Example Python project for PyTreat"
> ```
If your project provides scripts that you would like to have installed as command-line tools, you can list them inside `[project.scripts]`:
> ```
> [project.scripts]
> myscript = "mypackage.mymodule:main"
> ```
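The entry `myscript = "mypackage.mymodule:main"` means "when the user runs `myscript`, call the `main` function in `mypackage.mymodule`". As a rough sketch, `mypackage/mymodule.py` might look something like the following. The `--name` option and the greeting are made up for illustration, and are not taken from the example project.

```python
import argparse


def main(argv=None):
    """Entry point referred to by "mypackage.mymodule:main"."""
    parser = argparse.ArgumentParser(prog="myscript")
    # --name is a hypothetical option, included only to show argument handling
    parser.add_argument("--name", default="world")

    # When argv is None, argparse reads the real command-line arguments.
    args = parser.parse_args(argv)

    print(f"Hello, {args.name}!")

    # The return value is used as the exit status of the command.
    return 0
```

Accepting an optional `argv` argument is a design choice rather than a requirement: the installed command calls `main()` with no arguments, but the optional argument makes the function easy to call from tests, e.g. `main(["--name", "PyTreat"])`.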
Python allows you to use different "build systems" to build your package. For most packages, `setuptools` is the recommended approach, and you can specify this in the `[build-system]` section:
> ```
> [build-system]
> requires = ["setuptools"]
> build-backend = "setuptools.build_meta"
> ```
You need to tell `setuptools` which Python files to include in the package - this can usually be accomplished with just two lines:
> ```
> [tool.setuptools.packages.find]
> include = ["mypackage*"]
> ```
Finally, you can tell `setuptools` that your project version number is defined as an attribute called `__version__`, inside the `mypackage/__init__.py` file.
> ```
> [tool.setuptools.dynamic]
> version = {attr = "mypackage.__version__"}
> ```
You also have the option of defining your version number directly in the `pyproject.toml` file - this is as simple as adding `version = "<current-version>"` to the `[project]` section. See [below](#appendix-versioning) for more information on managing version numbers.
Refer to the [Python packaging guide](https://packaging.python.org/en/latest/guides/writing-pyproject-toml/) for more details on writing your `pyproject.toml` file.
<a class="anchor" id="appendix-tests"></a>
## Appendix: Tests
There are no strict rules for where to put your tests (you have tests, right?). There are two main conventions:
You can store your test files *inside* your package directory:
> ```
> myproject/
>     mypackage/
>         __init__.py
>         mymodule.py
>         tests/
>             __init__.py
>             test_mymodule.py
> ```
Or, you can store your test files *alongside* your package directory:
> ```
> myproject/
>     mypackage/
>         __init__.py
>         mymodule.py
>     tests/
>         test_mymodule.py
> ```
If you want your test code to be completely independent of your project's code, then go with the second option. However, if you would like your test code to be distributed as part of your project (e.g. so that end users can run them), then the first option is probably the best.
But in the end, the standard Python unit testing frameworks ([`pytest`](https://docs.pytest.org/en/latest/) and [`unittest`](https://docs.python.org/3/library/unittest.html)) are pretty good at finding your test functions no matter where you've hidden them, so the choice is really up to you.
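For reference, here is a minimal sketch of what a `test_mymodule.py` file could contain. The `add` function is hypothetical, and is defined inline so that the sketch runs by itself. In a real project it would live in `mypackage/mymodule.py` and be imported by the test file.

```python
# In a real project this line would be something like:
#     from mypackage.mymodule import add
# add() is a hypothetical function, defined here only for illustration.
def add(a, b):
    return a + b


# With its default settings, pytest collects functions whose names start
# with "test_", from files named test_*.py or *_test.py.
def test_add_integers():
    assert add(1, 2) == 3


def test_add_is_commutative():
    assert add(2, 5) == add(5, 2)
```

The test functions look the same whichever of the two layouts you choose. Only the location of the file changes.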
<a class="anchor" id="appendix-versioning"></a>
## Appendix: Versioning
If you are intending to make your project available for public use (e.g. on [PyPI](https://pypi.python.org/pypi) and/or [conda-forge](https://anaconda.org/conda-forge/)), it is **very important** to manage the version number of your project. If somebody decides to build their software on top of your project, they are not going to be very happy with you if you make substantial, API-breaking changes without changing your version number in an appropriate manner.
Python has [official standards](https://www.python.org/dev/peps/pep-0440/) on what constitutes a valid version number. These standards can be quite complicated but, in the vast majority of cases, a simple three-number versioning scheme comprising *major*, *minor*, and *patch* release numbers should suffice. Such a version number has the form:
> ```
> major.minor.patch
> ```
For example, a version number of `1.3.2` has a _major_ release of 1, _minor_ release of 3, and a _patch_ release of 2.
If you follow some simple and rational guidelines for versioning `your_project`, then people who use your project can, for instance, specify that they depend on `your_project==1.*`, and be sure that their code will work for *any* version of `your_project` with a major release of 1. Following these simple guidelines greatly improves software interoperability, and makes everybody (i.e. developers of other projects, and end users) much happier!
Many modern Python projects use some form of [*semantic versioning*](https://semver.org/). Semantic versioning is simply a set of guidelines on how to manage your version number:
- The *major* release number should be incremented whenever you introduce any backwards-incompatible changes. In other words, if you change your code such that some other code which uses your code would break, you should increment the major release number.
- The *minor* release number should be incremented whenever you add any new (backwards-compatible) features to your project.
- The *patch* release number should be incremented for backwards-compatible bug-fixes and other minor changes.
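The three rules above can be sketched as a small function. This is only an illustration of the arithmetic, assuming a plain `major.minor.patch` string; `bump_version` is a made-up name, and the function does not handle the other forms of version number that the official standards allow. Resetting the lower numbers to zero follows the semantic versioning guidelines.

```python
def bump_version(version, part):
    """Return a new "major.minor.patch" string with one part incremented.

    part must be "major", "minor" or "patch".
    """
    major, minor, patch = (int(n) for n in version.split("."))

    if part == "major":
        # Backwards-incompatible change: reset minor and patch
        return f"{major + 1}.0.0"
    if part == "minor":
        # New backwards-compatible feature: reset patch
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Backwards-compatible bug fix
        return f"{major}.{minor}.{patch + 1}"

    raise ValueError(f"unknown version part: {part!r}")


print(bump_version("1.3.2", "major"))  # 2.0.0
print(bump_version("1.3.2", "minor"))  # 1.4.0
print(bump_version("1.3.2", "patch"))  # 1.3.3
```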
If you like to automate things, you may want to check out [`bumpversion`](https://github.com/peritus/bumpversion) and [`versioneer`](https://github.com/python-versioneer/python-versioneer), both of which are simple tools that you can use to help manage your version number.
While the version of a library is ultimately defined in `pyproject.toml`, it is common for a Python library to contain a version string called `__version__` in the `__init__.py` file of the top-level package. For example, our `example_project/mypackage/__init__.py` file contains this line:
> ```
> __version__ = '0.1.0'
> ```
This makes a library's version number programmatically accessible and queryable.
<a class="anchor" id="deprecate-dont-remove"></a>
### Deprecate, don't remove!
If you really want to change your API, but can't bring yourself to increment your major release number, consider [*deprecating*](https://en.wikipedia.org/wiki/Deprecation#Software_deprecation) the old API, and postponing its removal until you are ready for a major release. This will allow you to change your API, but retain backwards-compatibility with the old API until it can safely be removed at the next major release.
You can use the built-in [`warnings`](https://docs.python.org/3.5/library/exceptions.html#DeprecationWarning) module to warn about uses of deprecated items. There are also some [third-party libraries](https://github.com/briancurtin/deprecation) which make it easy to mark a function, method or class as being deprecated.
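Here is a minimal sketch of the `warnings` approach, assuming a function has been renamed. The names `old_name` and `new_name` are placeholders. The old function is kept as a thin wrapper which emits a `DeprecationWarning` and then calls the new one.

```python
import warnings


def new_name(x):
    """The new API."""
    return x * 2


def old_name(x):
    """Deprecated alias for new_name(), kept until the next major release."""
    warnings.warn(
        "old_name() is deprecated, use new_name() instead",
        DeprecationWarning,
        stacklevel=2,  # report the warning against the caller's line
    )
    return new_name(x)


# Old code keeps working, but a warning is emitted. Here the warning is
# recorded rather than printed, so that we can inspect it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_name(21)

print(result)
print(caught[0].category.__name__, "-", caught[0].message)
```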
There are many tools available that can help you to manage your Python projects - this section mentions just a few that you might find useful. There is no single method of organising or managing your project - each of these tools does things in a slightly different way, so you may want to try each of them and choose the one that you like the best.
<a class="anchor" id="hatch"></a>
### Hatch
https://hatch.pypa.io/
Hatch is a "Python project manager", which you can use to manage local development of your Python projects, and also to manage Python installations, virtual environments and dependencies.
Hatch has a sub-command called `new`, which will generate a project template for you:
> ```
> hatch new "My project"
> ```
This command will create a project template with the following structure:
> ```
> my-project
> ├── src
> │   └── my_project
> │       ├── __about__.py
> │       └── __init__.py
> ├── tests
> │   └── __init__.py
> ├── LICENSE.txt
> ├── README.md
> └── pyproject.toml
> ```
<a class="anchor" id="poetry"></a>
### Poetry
https://python-poetry.org/
Poetry is an alternative to Hatch, which has commands for managing Python installations, virtual environments and dependencies. Poetry also has a `new` command which will create a project template for you, e.g.:
> ```
> poetry new my-project
> ```
will create the following template:
> ```
> my-project
> ├── pyproject.toml
> ├── README.md
> ├── my_project
> │   └── __init__.py
> └── tests
>     └── __init__.py
> ```
<a class="anchor" id="cookiecutter"></a>
### Cookiecutter
Both Hatch and Poetry are sophisticated tools which can do much more than creating project templates. They are both also quite opinionated in how they expect you to work, which may not suit your needs. Therefore, it is worth mentioning [Cookiecutter](https://github.com/cookiecutter/cookiecutter), which is a simple program that you can use to generate a skeleton file/directory structure for a new Python project.
You need to give it a template (there are many available templates, including for projects in languages other than Python) - a couple of useful templates are the [minimal Python package template](https://github.com/florian-huber/minimal-python-template), and the [full Python package template](https://github.com/audreyr/cookiecutter-pypackage) (although the latter is probably overkill for most).
Here is how to create a skeleton project directory based off the minimal Python package template: