Python project management with Poetry and Tox
Create modern Python projects with Poetry and Tox. File structure, testing, distribution, and dependency management.
Published: Updated:Python is a very popular and widely used programming language. It is quite versatile and easy to learn, which is why it is used in a very broad range of applications – from simple scripts and CLI tools to IoT, web development, AI and machine learning, and scientific computing. Python has been around since 1991, which feels like forever given how much the world of technology has changed. This has its pros and cons. One the one hand, Python has one of the largest ecosystems out there, with almost 500,000 packages on PyPi (as of time of this writing). On the other hand, some of the design decisions made at the beginning of its development show that it is quite an old language.
One of the largest (if not the largest) pain point of all Python developers is the approach to dependency
management and packaging. Anyone I talk to describes its current state as: it’s a mess
. The reason for
this is the myriad of tools that
exist to help solve some aspect of
the problem, with no established preferred solution, despite the existence of the so-called Python Packaging
Authority (PyPa).
Introduction
Here I describe one of the ways one can set up a Python project, using a combination of the tools I have personally used at work. The setup I describe here has worked quite well for us so far, modulo some minor gotchas to be aware of (see below).
As an example, I set up a command-line tool that accepts a textual input and a set of flags to display some statistics about the input, such as: number of characters, number of words, and number of lines. This is far from a complex business application, but should be enough for the demonstration purposes.
Preparation
Install Python
First things first. Since we are developing a Python application, you should have it installed on your system.
If you have a UNIX-based operating system you should already have it. If not I suggest to use your package
manager to install it (dnf
on Fedora, apt
on Debian-based distros, homebrew
on MacOS, chocolatey
or
Microsoft Store on Windows).
The official language page suggests to download and install a binary from the website, but I do not recommend it since you will have to manually update it. Using a package manager instead does it for you.
Install pipx
The next tool I recommend to have if you plan working with Python seriously is pipx. It may not be immediately obvious why this is needed, but let me explain. Some of the tools (often command-line) you will use to set up and manage a Python project are written in Python. Since every reasonably large Python project has a lot of dependencies (both direct and indirect), you do not want them to pollute your global Python environment. To isolate every such Python-based executable and only expose the binary to the user you need a tool like pipx. Every application installed with pipx lives in its own isolated environment. Having said that, it is alright to install pipx itself into the global Python environment.
python3 -m pip install --user pipx
python3 -m pipx ensurepath
Note that Python 2 has been deprecated since January 1, 2020 and the command to invoke Python interpreter may
be python
rather than python3
on your system. You can always check the version by executing python --version
. If you want to use a particular version explicitly, use python3.x
(where x
is an integer).
Install Poetry
Next, we can leverage pipx and install Poetry – a very versatile tool for dependency management and packaging.
pipx install poetry
Poetry is one of the most popular and actively maintained tools, but the biggest criticism is its deviation from PEP 621 and PEP 508 specifications, in particular specifying project dependencies. This does not impact user experience in any way most of the time, so you should not worry about it. There are other tools that solve the same problem and are compliant with the specifications, such as Hatch, but they come with their own issues and are less actively maintained.
Install Tox
Most of the Python projects I worked on required a task runner. It is a tool that provides a unified interface to run tests, make a project release, publish the release package, etc. As usual, there are multiple tools that exist in the ecosystem, but one of the most used is Tox. We can leverage pipx again to install it.
pipx install tox
Install Git
Finally, I highly recommend using a version control system, like Git, on all of your
software projects. It is technically not required to set up a Python project, but it is so ubiquitous that
explaining the benefits of using it seems
unnecessary. Use your package manager to install Git. For example, with dnf
you would run:
sudo dnf install git
Sample Project: Text Stats
Let us go step-by-step with setting up a Python project for the text stats command-line tool mentioned above. The final version of the source code is available on GitHub, if you are interested to have a look and experiment yourself.
Project Initialization
Poetry helps with bootstrapping a fresh Python project. All you have to do is run a command in your terminal.
poetry new --src text-stats
Here, the --src
flag is used to initialize a project structure where all source code resides in a src
folder (de-facto default in most projects). You should see the following project structure.
text-stats/
├── pyproject.toml
├── README.md
├── src
│ └── text_stats
│ └── __init__.py
└── tests
└── __init__.py
All the metadata describing the project, including dependencies, is specified inside the pyproject.toml
file. The tests
folder is the default location for source code tests.
Version Control
Change directory to the newly created project and initialize a Git repository.
cd text-stats
git init .
git add .
git commit -m "Initial commit"
Before we move on, I recommend to set up a .gitignore
file to avoid checking in temporary files into the
version control system. An easy way to generate the initial content of your .gitignore
file is by using a
web generator. Go to the page, type python
in the input
field and click on the “Create” button. Copy the content of the page and save it in the root of your project
under the name .gitignore
. Finally, commit the change.
git add .gitignore
git commit -m "Add .gitignore"
Setup Tox
Next, we will bootstrap a configuration file for the task runner – Tox. It should already be installed on your machine. So, run the following command in the root of your project:
tox quickstart .
This will generate a file called tox.ini
that looks similar to this:
[tox]
env_list =
py311
minversion = 4.11.3
[testenv]
description = run the tests with pytest
package = wheel
wheel_build_env = .pkg
deps =
pytest>=6
commands =
pytest {tty:--color=yes} {posargs}
Note that it already has a default task set up that runs unit tests with the tool called pytest. I show how to use it later in the post, so bear with me. Now, you may go on and commit the generated file.
git add tox.ini
git commit -m "Generate tox.ini file"
Setup Linter, Formatter, Type Checker, and LSP Configuration
Note that we have gone pretty far without writing a single line of Python code yet. We are getting there, but we should prepare our development environment to ensure consistent code style and default quality checks, regardless how many people are going to collaborate on the project.
It is important to set up and agree on the configuration of these tools as early as possible to remove the mental effort of thinking about how to format the code or how to avoid syntax errors. So, let’s get started. The tools we need are: linter, formatter, type checker, and (optionally, but very recommended) LSP server.
First, let us setup flake8 – the de-facto standard linter for
Python. There are other, less mature, projects out there, like Ruff, that I
may cover in another post. Unlike other tools, at the time of writing flake8 does not
support configuration via pyproject.toml
, but it can use
tox.ini
instead. Open your tox.ini
file and append the following section at the end:
[flake8]
max-line-length = 120
extend-ignore = E203, W503
exclude =
.tox,
.venv,
build,
dist,
.eggs
You can read about the meaning and many more options in the official flake8 documentation, but we essentially set it up to allow source code lines with less than 120 characters (a limit of 88 is often used, but I find it too restrictive).
Next, let us set up code formatters. There are two tools we will use for code formatting:
black and isort. Both can be
configured via pyproject.toml
. Open your pyproject.toml
file and append the following at the end:
[tool.black]
line-length = 120
[tool.isort]
profile = "black"
src_paths = ["src", "tests"]
The reason we need two tools is that isort is used for formatting the import statements, while black is used to format the rest of the code. It seems strange to make the split, but code formatting is only simple on the surface. If you are familiar with Python import statements, you may know that arbitrarily shuffling the import statements may change the behavior of your Python module. So, sorting imports may not be a mere cosmetic change. Again, the line-length limit is set to 120 characters, in agreement with the linter.
Next, let us configure the type checker. Python is a dynamically typed language, which is both its strength
and its weakness. To mitigate the risk of incorrectly using data types, Python ecosystem has a tool called
mypy. It is a static code analysis tool that leverages type annotations to reason
about the data types used in the code. Mypy is also configurable via pyproject.toml
, append the following at
the end:
[tool.mypy]
strict = true
There are many more options you can set up, which you may find in the documentation.
Finally, let us configure pyright, a tool that is developed by
Microsoft. It is called a type checker in the official repository, but it implements what is known as the
Language Server Protocol (LSP). The latter is a specification developed and maintained by Microsoft that
describes a generic way to develop language servers – software that powers common IDE features like auto
completion, go to definition, or documentation on hover. Pyright can also be configured in pyproject.toml
,
append the following at the end:
[tool.pyright]
typeCheckingMode = "basic"
Check the documentation to see the full list of options. Commit you changes and proceed.
git add -u
git commit -m "Configure flake8, black, isort, mypy, and pyright"
Note that I do not describe how to install and hook up the tools mentioned above to your preferred code editor or IDE. This process depends on the editor of choice. In case of Visual Studio Code, this is powered by plugins that are straightforward to install. I am personally a Neovim user and may describe this process in more detail in a separate post.
Add Text Stats Module
Now it is time to implement the business logic of the command-line tool we are building. You may use test-driven development (TDD) or write code first and tests later, it is up to you. For simplicity of the explanation flow, let’s assume that you start with the code. Activate the virtual environment for your project with Poetry and install all dependencies.
poetry shell
poetry install
The latter command will also install the project you are working on as editable allowing you to quickly iterate on the changes to the source code and tests. Now, create a new module with the following content:
# text-stats/src/text_stats/stats.py
import re
def count_characters(input: str) -> int:
"""Count charactes in the input string (including whitespace and punctuation)"""
return len(input)
def count_words(input: str) -> int:
"""Count words in the input string"""
words = list(filter(lambda w: len(w) > 0, re.split(r"\W+", input)))
return len(words)
def count_lines(input: str) -> int:
"""Count lines in the input string"""
return len(input.splitlines())
This module implements the logic for counting the number of characters, words, and lines in the input string.
Our goal is to expose this functionality via a command-line interface to a user. Before we do that it is, of
course, a good practice to make sure the code is working as expected with the help of unit tests. Create a new
module in the tests
folder.
# text-stats/tests/tests_stats.py
import pytest
from text_stats.stats import count_characters, count_lines, count_words
@pytest.fixture
def input_special_symbols():
return "$@&+-\n\n\t;:, "
@pytest.fixture
def input_one_word():
return "abcd_123"
@pytest.fixture
def input_multiline():
return "Hi, how are you?\nI am good, and you?"
# -----------------------------
# Test count_characters
# -----------------------------
def test_count_characters_empty_string():
assert count_characters("") == 0
def test_count_characters_one_word(input_one_word: str):
assert count_characters(input_one_word) == 8
def test_count_characters_special_symbols(input_special_symbols: str):
assert count_characters(input_special_symbols) == 12
def test_count_characters_multiline(input_multiline: str):
assert count_characters(input_multiline) == 36
# -----------------------------
# Test count_words
# -----------------------------
def test_count_words_empty_string():
assert count_words("") == 0
def test_count_words_one_word(input_one_word: str):
assert count_words(input_one_word) == 1
def test_count_words_special_symbols(input_special_symbols: str):
assert count_words(input_special_symbols) == 0
def test_count_words_multiline(input_multiline: str):
assert count_words(input_multiline) == 9
# -----------------------------
# Test count_lines
# -----------------------------
def test_count_lines_empty_string():
assert count_lines("") == 0
def test_count_lines_one_word(input_one_word: str):
assert count_lines(input_one_word) == 1
def test_count_lines_special_symbols(input_special_symbols: str):
assert count_lines(input_special_symbols) == 3
def test_count_lines_multiline(input_multiline: str):
assert count_lines(input_multiline) == 2
Note how we use pytest to set up fixtures. If you have the python language server set up, you may see an error in your IDE that pytest import could not be resolved. Since this package is only needed during development it should not be a regular dependency, but rather a development-time only. We can use Poetry again to fix it.
poetry add --group dev pytest
This will add pytest to a group of dependencies called dev
that does get installed with poetry install
, but is not part of the production release. Now, use Tox to make sure that all test pass.
tox
You should see the output, part of which looks similar to the following:
collected 12 items
tests/test_stats.py ............ [100%]
===================================== 12 passed in 0.01s =====================================
Commit the latest changes before proceeding.
git add .
git commit -m "Add stats module"
Create Command-Line Interface
The business logic module is finished and we have to add the command-line interface to our application. Create
a new file in the src
folder with the following content:
# text-stats/src/text_stats/cli.py
from argparse import ArgumentParser
from text_stats.stats import count_characters, count_lines, count_words
def run():
parser = ArgumentParser(description="Count character, words, lines in a text file")
parser.add_argument("input", help="Text input")
parser.add_argument("-c", "--characters", action="store_true", help="Count characters")
parser.add_argument("-w", "--words", action="store_true", help="Count words")
parser.add_argument("-l", "--lines", action="store_true", help="Count lines")
args = parser.parse_args()
if args.characters:
print(f"Number of characters: {count_characters(args.input)}")
if args.words:
print(f"Number of words: {count_words(args.input)}")
if args.lines:
print(f"Number of lines: {count_lines(args.input)}")
if __name__ == "__main__":
run()
This module use the built-in Python module argparse
to parse the input arguments and call the appropriate
business logic. You can immediately check that it works by running this module.
python3 src/text_stats/cli.py -c -w -l "Hello, World!"
This should produce an output like this:
Number of characters: 13
Number of words: 2
Number of lines: 1
Wouldn’t it be nice to be able to call this using a custom-named command? Poetry allows to do that with
scripts. All you have to do is add a section to your pyproject.toml
like this:
[tool.poetry.scripts]
text-stats = "text_stats.cli:run"
It specifies a path to a function to call inside the installed version of our project. You could say, “Wait a
minute. Isn’t our project called text-stats
, with a dash?“. Yes, but the name of the package installed is
text_stats
, see the following line in pyproject.toml
:
[tool.poetry]
...
packages = [{include = "text_stats", from = "src"}]
This also tells you that a Poetry project may contain multiple packages. This would be like a monorepo setup where you a common repository for a tightly related set of packages.
Anyway, we have defined a script called text-stats
that calls the run()
function for us. Poetry will make
sure this script is available when we install our package. Now, you can invoke our CLI as follows:
text-stats -c -w -l "Hello, World!"
Commit what we have done.
git add .
git commit -m "Add command-line interface"
Example Dependency Usage
Finally, I want to show how you can leverage external dependencies in your project. After all, I have told you that Poetry is good at that. Let us modify the command-line interface a little bit to display the statistics with colors, using an external package called print-color.
poetry add print-color
After you run this command, it will pull and install print-color to you virtual environment, add it to the
list of dependencies in pyproject.toml
, and update the poetry.lock
file. I have not mentioned the lock
file yet. It is a plain-text file in TOML format that contains a snapshot of all the project dependencies
(both direct and indirect). It is recommended to keep the lock file under version control if you are
developing an application (as opposed to a library) as it allows to get more reproducible builds (see
documentation
for more information).
With the dependency installed, we can modify the cli.py
file as follows:
# text-stats/src/text_stats/cli.py
from argparse import ArgumentParser
import print_color
from text_stats.stats import count_characters, count_lines, count_words
def run():
parser = ArgumentParser(description="Count character, words, lines in a text file")
parser.add_argument("input", help="Text input")
parser.add_argument("-c", "--characters", action="store_true", help="Count characters")
parser.add_argument("-w", "--words", action="store_true", help="Count words")
parser.add_argument("-l", "--lines", action="store_true", help="Count lines")
args = parser.parse_args()
if any([args.characters, args.words, args.lines]):
print_color.print("Input stats:", format="bold")
if args.characters:
print_color.print(count_characters(args.input), tag="chars", tag_color="green")
if args.words:
print_color.print(count_words(args.input), tag="words", tag_color="yellow")
if args.lines:
print_color.print(count_lines(args.input), tag="lines", tag_color="red")
if __name__ == "__main__":
run()
Let us test how the output changed after this update. Run the following in your terminal to see:
text-stats -c -w -l "Hello, World!"
As usual, commit the latest changes.
git add -u
git commit -m "Add example dependency for colored printing"
Release and Distribution
Before wrapping this (already long) post up, I want to briefly mention how you can build a release of your
Poetry project and give some advice on distributing it. This topic deserves a separate post, which I might
consider writing later. Python has two standard formats for distributing packages: wheel and sdist. The latter
is short for “source distribution”, an older format that, as its name implies, is just a snapshot of the
project’s source code as a .tar.gz
archive. The former is the newer format introduced by PEP
427 – ZIP-format archive with the .whl
extension. The previous
sentence should give you a hint that you can use unzip
command if you are curious about the content of the
wheel.
Building a release of a Poetry project is very straightforward.
poetry build
This will create a dist
folder in the root of your project.
dist/
├── text_stats-0.1.0-py3-none-any.whl
└── text_stats-0.1.0.tar.gz
1 directory, 2 files
As you can see, by default both formats are produced. You can use --format=[sdist|wheel]
to generate only
one, if you want. The 0.1.0
part of the name is the project version, as defined in the pyproject.toml
file.
Poetry allows you to publish the produced release artifacts to a PyPi registry – a repository of Python packages, the most known one being the official https://pypi.org/. After you configure your credentials for the repository, publishing is as simple as running a single command.
poetry publish
Continuous Integration and Deployment
Finally, I want to note that Poetry is a fantastic tool for local development, but may be quite heavy for the
CI/CD setups. The reason I say that is because it has quite a few dependencies and takes a while to install.
It is certainly possible to have a container image with Poetry pre-installed, but many teams use default Linux
images for their pipelines and install Poetry as a preparation step. For these situations I recommend using
lighter tools, such as build (to generate a release) and
twine (for publishing). It is quite straightforward to invoke these tools
as Tox tasks, creating a unified interface for the pipelines (you would use tox -e build
and tox -e publish
at the corresponding CI/CD stages).
Summary
In this blog post I have shown how you can start using Poetry for your Python projects. I went a bit beyond general introduction and discussed how you can configure important developer productivity tools, such as linters, formatters, type checkers, and a language server. Comprehensive coverage of all aspects of a Python project life cycle would require a much longer post, so I may cover these topics in separate articles.