Secure yourself against Python dependency confusion
Dependency confusion is a tricky business in Python land, especially if you are an
organisation that maintains a private Python package repository. There are not many
options. The obvious one first: if you specify --extra-index-url, pip will contact both the
extra index and the official index on pypi.org, compare the available versions and install
the »highest« one. I did not look up what happens when the same version is available on
both the private and the public index, but I guess it is not what you want or expect.
So what options are left?
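To make the »highest version wins« behaviour concrete, here is a minimal sketch. It is a simplified model and not pip's actual resolver: the package versions, the index labels and the helper names `parse_version` and `pick_candidate` are made up for illustration, and only plain dotted version numbers are handled.

```python
# Simplified model of merging candidates from several indexes.
# This is NOT pip's real resolver; it only illustrates "highest version wins".


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a plain dotted version like '1.4.2' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def pick_candidate(candidates: dict[str, list[str]]) -> tuple[str, str]:
    """Return (index, version) of the highest version across all indexes.

    Equal versions on two indexes are resolved arbitrarily in this sketch.
    """
    merged = [
        (parse_version(version), index, version)
        for index, versions in candidates.items()
        for version in versions
    ]
    _, index, version = max(merged)
    return index, version


# Hypothetical situation: the real package lives on the private index, and an
# attacker uploaded a higher version under the same name to the public one.
available = {
    "private": ["1.0.0", "1.1.0"],
    "public": ["99.0.0"],
}
print(pick_candidate(available))  # ('public', '99.0.0')
```

The private index is never even considered once somebody publishes a higher version number elsewhere, which is the core of the problem.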
Use your own repository/index.
Specify the index in your pip.conf
# pip.conf
[global]
index-url = https://my.index.org/...
or use the --index-url parameter (not to be confused with --extra-index-url).
This setup gives you the advantage that you are fully in control. However, you need to
set up and maintain everything. And the most important issue: a single forgotten index
parameter or missing pip configuration entry is all it takes, and pip will fall back to
its default index, pypi.org.
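Since a missing configuration entry is so easy to overlook, a small check in CI can help. The following is a rough sketch, assuming the configuration lives in a single pip.conf-style file; the function name `check_pip_config` and the constant `EXPECTED_INDEX` are hypothetical, and the URL is the placeholder from the pip.conf above. It does not cover environment variables or additional configuration files that pip may also read.

```python
import configparser

# Placeholder copied from the pip.conf example; replace with your real index.
EXPECTED_INDEX = "https://my.index.org/..."


def check_pip_config(config_text: str, expected_index: str) -> list[str]:
    """Return a list of problems found in a pip.conf-style configuration."""
    # Disable interpolation so that '%' characters in URLs are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)

    problems = []
    index_url = parser.get("global", "index-url", fallback=None)
    if index_url is None:
        problems.append("index-url is not set, pip would use pypi.org")
    elif index_url != expected_index:
        problems.append(f"index-url points to {index_url!r}")
    if parser.has_option("global", "extra-index-url"):
        problems.append("extra-index-url is set, a second index is consulted")
    return problems


config = "[global]\nindex-url = https://my.index.org/...\n"
print(check_pip_config(config, EXPECTED_INDEX))
```

An empty list means the file pins exactly the expected index. Anything else is worth failing the build for.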
Use hashes
Somewhat counter-intuitive, but unfortunately reality: it is not possible to specify
hashes in pyproject.toml. Only with the help of pip-tools can you create a hashed version
of your requirements:
pip-compile --generate-hashes --output-file=requirements.txt pyproject.toml
Add this file to your git repo and install from it using
pip install --require-hashes -r requirements.txt
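What a pinned hash buys you can be shown in a few lines. This is only a conceptual sketch of comparing a download against a pin of the form `sha256:<hexdigest>`, not pip's implementation; the helper name `matches_pinned_hash` and the byte strings are stand-ins.

```python
import hashlib


def matches_pinned_hash(artifact: bytes, pinned: str) -> bool:
    """Check downloaded bytes against a pin of the form 'sha256:<hexdigest>'."""
    algorithm, _, expected = pinned.partition(":")
    actual = hashlib.new(algorithm, artifact).hexdigest()
    return actual == expected


# Stand-in for the file you originally locked in requirements.txt.
original = b"contents of the wheel you reviewed"
pinned = "sha256:" + hashlib.sha256(original).hexdigest()

print(matches_pinned_hash(original, pinned))              # True
print(matches_pinned_hash(b"tampered contents", pinned))  # False
```

A package with the same name and version from a different index would produce a different digest and be rejected, no matter which index served it.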
With hashes you have a relatively secure approach, as long as you use requirements.txt
(and not pyproject.toml) to install your dependencies. Of course you need to make sure
that you pick the right indices to begin with. A big disadvantage is the initial setup
and maintenance effort: hashes require time and care, because the hashed requirements
have to be regenerated whenever a dependency changes.
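Part of that care can be automated. The sketch below scans the text of a requirements file and lists entries that carry no --hash option, for example after somebody added a line by hand. It assumes the layout pip-compile typically writes (backslash continuations, one --hash per line); the function name `find_unhashed` and the package names are hypothetical, and the hash value is a dummy.

```python
def find_unhashed(requirements_text: str) -> list[str]:
    """Return requirement entries that carry no --hash option."""
    # Join backslash continuations so each requirement is one logical line.
    joined = requirements_text.replace("\\\n", " ")
    unhashed = []
    for line in joined.splitlines():
        line = line.strip()
        # Skip blanks, comments and global options such as --index-url.
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        if "--hash=" not in line:
            unhashed.append(line.split()[0])
    return unhashed


DUMMY_HASH = "sha256:" + "0" * 64  # placeholder, not a real digest

sample = (
    "package-a==1.0.0 \\\n"
    f"    --hash={DUMMY_HASH}\n"
    "    # via my-project\n"
    "package-b==2.0.0\n"
)
print(find_unhashed(sample))  # ['package-b==2.0.0']
```

pip itself refuses to install such a file when --require-hashes is given, so this check mainly moves the error from install time to review time.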
pypi organisations
In April 2023, PyPI started to support organization accounts.
Especially for organisations or bigger companies this is a desperately needed feature:
namespaces can be reserved on the official PyPI index. With a reserved namespace,
you can be sure that you'll not install some malicious package by accident. The
maintenance effort is comparatively low, and you can safely add your private package
index via --extra-index-url. No extra changes are needed in projects that use your
private packages.
Still, you probably need to have a certain (organisational) size to request an »organisational account«.
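With or without an organisation account, it can be useful to know whether the names of your private packages are already taken on pypi.org. A minimal sketch, assuming PyPI's JSON endpoint at https://pypi.org/pypi/<name>/json, which answers with 404 for unknown projects: the function name `exists_on_pypi` and its `opener` parameter are my own choices, the latter only so that the function can be exercised without network access.

```python
import urllib.error
import urllib.request


def exists_on_pypi(name: str, opener=urllib.request.urlopen) -> bool:
    """Return True if pypi.org knows a project called `name`."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with opener(url) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        # 404 means the name is unknown on the public index.
        if error.code == 404:
            return False
        # Anything else (rate limiting, server errors) is not an answer.
        raise
```

Calling `exists_on_pypi("your-private-package")` for every internal package name tells you which names somebody else already owns on the public index and which ones are still unclaimed.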
Conclusion
These are the options, but in my opinion the whole concept of dependency confusion due to selecting the wrong index is a structural issue of pip (or the Python ecosystem). The first step would be to assess potential threats and their impact (also known as threat modeling) and then decide what to do. You might conclude that you don't need any additional steps, which is fine, as long as it is a conscious decision.