Dependency Management

Managing dependencies and keeping them up-to-date is part of the maintenance of our projects. Here are the most common operations.

What we need to accomplish here is mostly what the 12 factors teach us:

  • Dependencies must be explicitly declared and isolated (aka not “vendored” or copy-pasted into the source code)

  • It must be simple to install those dependencies during deployment

  • It must be simple to upgrade them in the future

  • And finally, the availability of those dependencies must be the same as that of the service we deploy

External dependencies

“External” dependencies are those found in public package repositories that anyone can access and install from. This is by far the majority of what we manage.

Python

In short:

  • PyPI — The default public repository

  • pip — The most popular package manager

  • pip-tools — Meta tools to make pip usable

Note

Pip will be replaced by Poetry when we replace Felix, but now is not the time.

Basic structure

Usually you would have two files:

  • requirements.in — Similar to NPM’s package.json; this is where we put the list of packages we want, with loose version constraints (or no constraints at all to begin with)

  • requirements.txt — Must be present at the root of the repo for deployments to work. It contains the exhaustive list of packages to be installed, pinned to their exact versions, along with the hashes of all packages. This file is auto-generated by pip-tools

Adding a requirement

Suppose that you have this in requirements.in:

django
djangorestframework

You can process it with pip-compile --generate-hashes requirements.in to generate requirements.txt:

#
# This file is autogenerated by pip-compile
# To update, run:
#
#    pip-compile --generate-hashes requirements.in
#
asgiref==3.4.1 \
    --hash=sha256:4ef1ab46b484e3c706329cedeff284a5d40824200638503f5768edb6de7d58e9 \
    --hash=sha256:ffc141aa908e6f175673e7b1b3b7af4fdb0ecb738fc5c8b88f69f055c2415214 \
    # via django
django==3.2.6 \
    --hash=sha256:7f92413529aa0e291f3be78ab19be31aefb1e1c9a52cd59e130f505f27a51f13 \
    --hash=sha256:f27f8544c9d4c383bbe007c57e3235918e258364577373d4920e9162837be022 \
    # via -r requirements.in, djangorestframework
djangorestframework==3.12.4 \
    --hash=sha256:6d1d59f623a5ad0509fe0d6bfe93cbdfe17b8116ebc8eda86d45f6e16e819aaf \
    --hash=sha256:f747949a8ddac876e879190df194b925c177cdeb725a099db1460872f7c0a7f2 \
    # via -r requirements.in
pytz==2021.1 \
    --hash=sha256:83a4a90894bf38e243cf052c8b58f381bfe9a7a483f6a9cab140bc7f702ac4da \
    --hash=sha256:eb10ce3e7736052ed3623d49975ce333bcd712c7bb19a58b9e2089d4057d0798 \
    # via django
sqlparse==0.4.1 \
    --hash=sha256:017cde379adbd6a1f15a61873f43e8274179378e95ef3fede90b5aa64d304ed0 \
    --hash=sha256:0f91fd2e829c44362cbcfab3e9ae12e22badaa8a29ad5ff599f9ec109f0454e8 \
    # via django

Note

While generating hashes isn’t strictly mandatory, it’s an important safeguard: it ensures that the packages you install on another computer haven’t been tampered with. That said, pip has a lot of bugs around those hashes, but it’s important to try to work around them nonetheless. If you still need to disable the hashes, please do it explicitly by disabling them in make venv directly.
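As a toy illustration of what those hashes buy you (the function below is our own sketch, not pip’s internals): in hash-checking mode, pip recomputes the sha256 of each downloaded archive and refuses to install it on a mismatch.

```python
import hashlib

# Toy sketch of the check pip performs in hash-checking mode:
# recompute the sha256 of the downloaded archive and compare it to the
# value pinned in requirements.txt. Function name is ours, not pip's.
def archive_matches(data: bytes, expected_sha256: str) -> bool:
    return hashlib.sha256(data).hexdigest() == expected_sha256

payload = b"pretend this is a wheel downloaded from PyPI"
pinned = hashlib.sha256(payload).hexdigest()

print(archive_matches(payload, pinned))         # True: untouched archive
print(archive_matches(payload + b"!", pinned))  # False: tampered archive
```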

Automating

This process can be automated through a Makefile with the following content:

pip-tools:
	$(PYTHON_BIN) -m pip install --upgrade pip pip-tools

venv: requirements.txt
	$(PYTHON_BIN) -m pip install -r requirements.txt

%.txt: %.in pip-tools
	$(PYTHON_BIN) -m piptools compile --generate-hashes $<

This way you can simply type make venv to:

  • Re-compile requirements.txt when requirements.in changed

  • And install the new packages into the virtual environment automatically

Updating versions

The goal of the maintenance is to keep up with the latest versions of packages. In the Python world, APIs stay fairly stable, so it’s reasonable to expect that there won’t be any major breakage if you update everything at once. The simplest approach is then:

  1. Delete requirements.txt

  2. Do a make venv

  3. Use Git to diff requirements.txt from its previous version. You will see the updated versions. If you see major updates, check the changelogs to know what you should be aware of.

  4. Fix/test/repeat until all is well
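To make step 3 concrete, here is a toy sketch (not part of our tooling) of what the Git diff surfaces: the pins whose versions changed between the old and new requirements.txt.

```python
# Minimal sketch: extract "name==version" pins from a requirements.txt
# and report which versions changed. This mimics what you read in the
# Git diff; it's an illustration, not part of our Makefile tooling.
def pins(text: str) -> dict:
    result = {}
    for line in text.splitlines():
        line = line.strip().rstrip("\\").strip()  # drop hash continuations
        if "==" in line and not line.startswith(("#", "--")):
            name, _, version = line.partition("==")
            result[name] = version
    return result

old = pins("django==3.2.6 \\\nsqlparse==0.4.1")
new = pins("django==3.2.7 \\\nsqlparse==0.4.1")

changed = {name: (old[name], new[name]) for name in old if old[name] != new[name]}
print(changed)  # {'django': ('3.2.6', '3.2.7')}
```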

Be particularly attentive to Django and DRF’s deprecation warnings. They usually warn a long time in advance and have easy, well-documented solutions. If you see a deprecation warning, even if the code still works, please fix it as fast as possible.

JS/HTML dependencies

It’s all in NPM!

  • NPM — The de-facto public repository

  • NPM — The default package manager (and not Yarn, which is mostly unnecessary at this point). Please use at least version 7 of NPM, because of the package-lock.json format.

Basic structure

Unlike the Python setup, it’s fairly standard:

  • package.json — To describe your requirements

  • package-lock.json — Auto-generated file with the exact pinned version of all packages

Adding a requirement

Simply:

npm add plyr

In terms of version specification, it is a good idea to use the semver selector ^ which will install any new non-breaking version. It’s also the default when doing npm add so all is fine.

Updating versions

To update the versions of your JS packages, the best is to have a look at

npm outdated

For example, if your requirements contain plyr at version ^1.0.0, you’ll get the following output:

Package  Current  Wanted  Latest  Location
plyr       1.8.1   1.9.0   3.6.8  my-demo-project

How to read this:

  • Current — The currently installed version

  • Wanted — The highest version you can go to while still respecting the requirements (aka the ^1.0.0)

  • Latest — The latest available version
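The “Wanted” column follows the caret selector’s rule: the highest available version that keeps the same major number as the constraint. A small sketch of that selection logic (our own illustration, not NPM’s code):

```python
# Sketch of how the caret (^) selector picks the "Wanted" version for
# constraints >= 1.0.0: among the available versions, take the highest
# one with the same major number. (For 0.x constraints the caret is
# stricter; that case is ignored here.) Illustration only, not NPM code.
def wanted_version(constraint_major: int, available: list) -> str:
    compatible = [v for v in available if int(v.split(".")[0]) == constraint_major]
    return max(compatible, key=lambda v: tuple(int(part) for part in v.split(".")))

# With the constraint ^1.0.0 and the registry offering these versions:
print(wanted_version(1, ["1.8.1", "1.9.0", "3.6.8"]))  # 1.9.0 (3.6.8 is "Latest")
```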

Of course our goal is to have all the packages in Latest. However we know that in the JS world things tend to break quickly and repeatedly. So let’s have a bit of caution.

For each package (yes, every single one), check the changelog:

  • If the update is easy-ish then update.

  • Otherwise, you need to talk to your project manager to plan an update of this specific dependency. In the meantime, update to the Wanted version nonetheless.

When you’ve decided which version you can update a package to (either the “Wanted” or the “Latest”), you can copy/paste this version number into your package.json.

In the current case of my Plyr example, I decide to update to the Wanted version. So here is my previous package.json:

{
    "dependencies": {
        "plyr": "^1.0.0"
    }
}

Here is the new content:

{
    "dependencies": {
        "plyr": "^1.9.0"
    }
}

Then I can finally do a npm install.

Don’t hesitate to do this iteratively, updating and testing each dependency one by one.

Internal dependencies

“Internal” dependencies are those which are private and internal to our projects. For example, a piece of code used by two components of the same project that don’t share the same repo.

For those cases, we have a Gemfury account that hosts our private packages. If you need an account, you can ask for one. It sounds like it’s Ruby-only (“Gem” in the name) but it actually handles other ecosystems, including Python and Node in our case.

Python

PyPI is only the default Python repository; you can actually use several repositories in a requirements file, including a private one like Gemfury.

Making a private package

The simplest way to make a Python package, private or public, is to use Poetry.

Note

You’d think that you would also need to manage the main project with Poetry, but that’s not the case. Once published, a Poetry package is completely compatible with pip.

You can have a look at how Typefit is packaged, for example, and specifically at the pyproject.toml file and the fact that the source code is all inside a src directory. There is a requirements.txt file but please ignore it; it’s only here for Read The Docs compatibility. You can also observe the published package on PyPI.

Building

In order to test if your package works, you can start by building it:

poetry build

Which will produce the package inside of the dist folder. When it’s built, you can install it in another virtual environment to see if it works and if you can import things properly:

pyenv shell some-other-venv
pip install dist/typefit-0.3.1.tar.gz
python -c 'import typefit'

Of course you’ll need to adapt this to your package.

Publishing

You need to be very careful when publishing, because you certainly don’t want to publish on PyPI (which is public) by mistake.

As a pre-requisite, first you need to edit the ~/.config/pypoetry/config.toml file with the following content:

[repositories]
[repositories.with]
url = "https://pypi.fury.io/with/"

Then you need to make sure that you have set the two following environment variables:

  • GEMFURY_TOKEN — A read token from Gemfury

  • GEMFURY_DEPLOY_TOKEN — A deploy token from Gemfury

Configure authentication for your With repo:

poetry config http-basic.with $GEMFURY_DEPLOY_TOKEN $GEMFURY_DEPLOY_TOKEN

Add this to your Makefile:

publish:
	poetry publish --build --repository with

And finally you can publish your package:

make publish

Depending on your package

In order to depend on this package, you need to change the requirements.in file a little bit. You need to add at the beginning of the file:

--index-url https://XXX:@pypi.fury.io/with/
--extra-index-url https://pypi.org/simple/

Note

You need to replace XXX with a read-only token from Gemfury. You can commit this token, as the only thing it protects is the source code. If the source code contains a key that grants access to the rest of the source code (which is stored at the same location anyway), then so be it.

Then you can depend on your package as you would on any other package. The only difference is that you’ll probably want to pin the version directly in the .in file. For example:

entity-extractor==0.1.24
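Put together, a hypothetical requirements.in mixing public and private packages could look like this (the XXX placeholder stays as-is; package names are examples):

```
--index-url https://XXX:@pypi.fury.io/with/
--extra-index-url https://pypi.org/simple/

django
djangorestframework
entity-extractor==0.1.24
```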

Numbering versions

When you release a package you need to decide which version number to give to it.

  • If your internal package is released synchronously with a project, you can use the same version numbers as the project. You can even make empty releases to make sure that there are matching version numbers between the project and the dependency (like in ActiveSeed).

  • Otherwise feel free to use SemVer accordingly

These are the version numbers for versions released using the standard Git Flow release process (see Version Control). When you’re working on a branch and you need to update this dependency without making a real release yet (for example if you want to deploy that branch), you can give it this kind of version number:

0.9.0a42+3

Let’s decompose this:

  • 0.9.0 is the version onto which this branch is based

  • a indicates it’s an alpha release

  • 42 is the ticket number of the branch

  • 3 is a sequential release number that you increment every time you publish this branch
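That scheme can be decomposed mechanically. Here is a small sketch (a helper written for this document, not an existing tool) that splits such a branch version into its parts:

```python
import re

# Split a branch version like "0.9.0a42+3" into the base version, the
# ticket number and the sequential release number. Helper written for
# this document; real PEP 440 parsing is more general than this.
def decompose(version: str) -> dict:
    match = re.fullmatch(r"(\d+\.\d+\.\d+)a(\d+)\+(\d+)", version)
    if match is None:
        raise ValueError(f"not a branch version: {version}")
    base, ticket, sequence = match.groups()
    return {"base": base, "ticket": int(ticket), "sequence": int(sequence)}

print(decompose("0.9.0a42+3"))
# {'base': '0.9.0', 'ticket': 42, 'sequence': 3}
```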

JS/CSS

We’ve never needed to make a private JS package so there is no procedure yet. Feel free to PR it!

Forbidden patterns

Because they want to be easy to adopt, a lot of libraries encourage bad patterns. However, remember that among the stated goals of this document, we explicitly want to be able to list and upgrade our dependencies and to guarantee their availability.

Vendoring

A common practice is to simply copy/paste a compiled file of a dependency into your code and call that. This isn’t good: it means that this specific dependency escapes the centralized dependency manifest (package.json, for example) and thus won’t be updated when we update dependencies.

The only reason to vendor a dependency is if it’s an unmaintained piece of code that still does a useful job. If that’s the case, we can decide to copy it into our project and maintain it privately (if the licence allows it, of course). If we do that, the code must be brought up to our coding standards (formatting, documentation, etc.).

Most likely though all you want to do is integrate this dependency through NPM and require it from your code.

Imagine that you want to use Plyr. You can add it to the project:

npm add plyr

And then, from your JS code:

import Plyr from "plyr";

const player = new Plyr("#player");

Note

Loading the CSS is omitted from this example because it depends on how your webpack is configured, but it’s basically the same thing: just require it from a JS file.

Using a CDN

Almost all JS libraries or web font libraries are available through a CDN. That is very convenient but has a major drawback: it’s most likely illegal, and moreover it doesn’t let us guarantee the availability or the performance of these components.

If the JS files are hosted on the same servers as the website we provide, then most likely if the website is up the files will be up as well. If the files are on a CDN and the CDN goes down, then the site is broken and we can do nothing about it. Once or twice a year, a major CDN goes down and half the web breaks. Let that not be our share of the web.

For most of your JS/CSS needs, you can refer to the section above, as the solution is the same.

For your web fonts needs, it’s a bit case-by-case:

  • Google Fonts — For Django/Wagtail websites, there is an easy solution embedded in django-wools. For others, you can download the fonts zip from the Google fonts website and write the @font-face CSS manually inside your code.

  • Adobe Fonts — You’re kind of screwed; this is a terrible service. If you can find the TTF files, use those.

  • Other — Same as above. Try to obtain the TTF files somehow and make your own @font-face.
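Writing the @font-face manually could look like this (font name and file paths below are hypothetical; adapt them to your static files setup):

```css
/* Hypothetical self-hosted font; adjust the family name and paths. */
@font-face {
    font-family: "My Font";
    src: url("/static/fonts/my-font.woff2") format("woff2"),
         url("/static/fonts/my-font.ttf") format("truetype");
    font-weight: 400;
    font-style: normal;
    font-display: swap;
}
```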

Note

Today it makes sense to offer Woff2 and TTF as font formats. You can forget about the other ones most of the time.

Note

Using Google Fonts will also completely screw your PageSpeed score. Yes both are made by Google. Go figure…

Depending on a Git package

It’s often tempting to depend on a Git package, but that often ends up causing issues during the deployment phase. Package managers are simply not made to install from Git.

If we’re working with a private package, please use the private package procedure as listed above.

If we’re working with a temporary patch in an external library, please:

  • Make sure to send the patch back upstream (and follow up on it)

  • Depend not on the Git URL but on the ZIP file at the specific commit that you want (which GitHub automatically provides). It’s not ideal, but at least it’s much more manageable than depending on Git directly.
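In a requirements.in, that looks like a direct URL requirement pointing at GitHub’s auto-generated archive for a commit (owner, repo and commit hash below are placeholders):

```
some-lib @ https://github.com/some-org/some-lib/archive/abc123def456.zip
```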