Semantic versioning

2012-10-30

Semantic versioning seems to be widespread and is regarded as a good way to describe the purpose of a given library or program version. Each release (version) receives a version number based on its effect on the public API.

A version consists of three parts X, Y and Z. The version number is formatted like X.Y.Z. Bug fixes (not affecting the public API) increase Z. Backwards compatible changes to the public API increase Y and backwards incompatible changes increase X.
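The rules above can be sketched in a few lines of Python. This is only an illustration: the names parse, bump and the change labels are made up for this post. Resetting the lower parts to zero on a bump is not mentioned above, but it is what the semantic versioning specification prescribes.

```python
from typing import Tuple

Version = Tuple[int, int, int]  # (X, Y, Z)


def parse(version: str) -> Version:
    """Turn a string like '1.4.2' into the tuple (1, 4, 2)."""
    x, y, z = (int(part) for part in version.split("."))
    return (x, y, z)


def bump(version: Version, change: str) -> Version:
    """Return the next version number for the given kind of change.

    The change labels are hypothetical names used only in this sketch.
    """
    x, y, z = version
    if change == "incompatible":   # backwards incompatible API change
        return (x + 1, 0, 0)
    if change == "compatible":     # backwards compatible API change
        return (x, y + 1, 0)
    if change == "bugfix":         # public API untouched
        return (x, y, z + 1)
    raise ValueError(f"unknown kind of change: {change!r}")


current = parse("1.4.2")
after_bugfix = bump(current, "bugfix")
after_feature = bump(current, "compatible")
after_break = bump(current, "incompatible")
```

Starting from 1.4.2, a bug fix gives 1.4.3, a compatible API change gives 1.5.0 and an incompatible change gives 2.0.0.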

This is a good thing of course. It is clear that whenever X increases, packages that depend on the public API could break. The main idea is that whenever the public API of a system changes, the change is communicated by an increment in the corresponding part of the version number.

Dependency hell / Diamond problem and friends

Semantic versioning tries to fix "dependency hell", an unpleasant situation for everyone who has to deal with and manage dependencies within a project or even an operating system.

For example, one might want to ship the latest and greatest versions of software A and B, but the two could have different requirements regarding the API of library C, which both use. When A requires a newer version of C than B does, and the two versions of C are incompatible, there is no easy and safe way to move on. There are plenty of other examples and problems one can encounter here.
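To make the A/B/C situation concrete, here is a rough sketch. It assumes that only one version of C can be installed at a time and that a requirement means "this version or any newer one with the same X". The function names and version numbers are invented for the example.

```python
from typing import Dict, List, Optional, Tuple

Version = Tuple[int, int, int]  # (X, Y, Z)


def satisfies(installed: Version, required: Version) -> bool:
    """An installed version is acceptable if it has the same major
    number X and is not older than the required version."""
    return installed[0] == required[0] and installed >= required


def pick_shared_version(
    available: List[Version], requirements: Dict[str, Version]
) -> Optional[Version]:
    """Return the newest version that every consumer accepts,
    or None if no single version works for all of them."""
    candidates = [
        version
        for version in available
        if all(satisfies(version, wanted) for wanted in requirements.values())
    ]
    return max(candidates) if candidates else None


# Hypothetical releases of library C.
available_c = [(1, 3, 0), (1, 4, 0), (2, 1, 0)]

# A needs the 2.x API of C, B still needs the 1.x API: no shared version.
conflict = pick_shared_version(available_c, {"A": (2, 1, 0), "B": (1, 3, 0)})

# Both are fine with the 1.x API: the newest matching release wins.
no_conflict = pick_shared_version(available_c, {"A": (1, 4, 0), "B": (1, 3, 0)})
```

In the first case the result is None, which is exactly the dead end described above. In the second case 1.4.0 is picked.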

The diamond problem, long dependency chains, circular dependencies, arbitrary dependencies ...

Another solution

While semantic versioning - as proposed - surely provides a way to clearly state that something dramatically changed within the API of some piece of software, it still does not allow for several major API versions to coexist out of the box.

One solution which actually does allow multiple incompatible versions of the "same" software to coexist would be to integrate the major version number into the name. In fact, this has been done for a long time already.

Think of:

kdelibs3 - kdelibs4
apache - apache2
beautifulsoup - bs4

and many, many more. This naming practice is typically used with libraries because other packages depend on them.

IMHO, if this practice were more widespread, dependency hell would be encountered less often.

Referring back to the above example, software A and B could safely depend on the incompatible versions of package C if both versions had different names and thus different top-level paths, namespaces and so on.
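As a sketch of why this helps: once the major version is part of the name, every name can be resolved on its own. The package names c1 and c2, the index and the resolve function below are all hypothetical, and the sketch assumes that all releases published under one name are compatible with each other.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, int, int]
Requirement = Tuple[str, Version]  # (package name, minimum version)


def resolve(
    index: Dict[str, List[Version]], requirements: Dict[str, Requirement]
) -> Dict[str, Version]:
    """Pick the newest acceptable release for every required package name."""
    # Collect the strictest minimum version per package name.
    needed: Dict[str, Version] = {}
    for name, minimum in requirements.values():
        if name not in needed or minimum > needed[name]:
            needed[name] = minimum

    # Each name is resolved independently of all the others.
    install_set: Dict[str, Version] = {}
    for name, minimum in needed.items():
        candidates = [v for v in index.get(name, []) if v >= minimum]
        if not candidates:
            raise LookupError(f"no release of {name} satisfies {minimum}")
        install_set[name] = max(candidates)
    return install_set


# The two incompatible generations of C live under different names.
index = {
    "c1": [(1, 3, 0), (1, 4, 0)],
    "c2": [(2, 1, 0)],
}
requirements = {
    "A": ("c2", (2, 1, 0)),
    "B": ("c1", (1, 3, 0)),
}
install_set = resolve(index, requirements)
```

Both c1 and c2 end up in the install set, side by side, so A and B each get the API they were written against.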

There are a few obvious drawbacks though:

more work has to be done for backwards incompatible changes; package name and all related information has to change
as a result the bar for backwards incompatible changes is raised
as a result innovation could probably be held back

On the other hand, it can be refreshing and motivating too, to start with a new package name and without the burden of thinking about incompatibilities (yet).

Furthermore, this could stimulate thinking about the API up front and could maybe lead to more overall stability/predictability of software (development).

Something along the lines of: incompatible changes change the package name; backwards compatible changes and bug fixes increase the version number. The version number could even be reduced to a single number/identifier, following the revision numbers in your VCS of choice.
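A minimal sketch of such a scheme could look like the following. The Release class and its fields are made up for illustration. It also assumes that the revision is a counter that only ever grows, as VCS revision numbers usually do, so it keeps counting even when the name changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    base: str        # e.g. "kdelibs"
    generation: int  # part of the package name, not of the version
    revision: int    # single version number, e.g. a VCS revision

    @property
    def name(self) -> str:
        """The package name carries the generation: kdelibs + 4 -> kdelibs4."""
        return f"{self.base}{self.generation}"


def next_release(current: Release, incompatible: bool) -> Release:
    """Incompatible changes change the name, everything else only
    increases the revision (assumption: the revision never resets)."""
    generation = current.generation + 1 if incompatible else current.generation
    return Release(current.base, generation, current.revision + 1)


current = Release("kdelibs", 3, 120)
compatible_update = next_release(current, incompatible=False)
new_generation = next_release(current, incompatible=True)
```

Here the compatible update is still called kdelibs3 and only moves to revision 121, while the incompatible one becomes kdelibs4.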

Maybe it is even more "semantically correct" to do it like this. Isn't a major version of a piece of software something really different? Why not name it differently?