Semver: A Primer

in How To on Sep 02 2014

Semantic Versioning, otherwise known as semver, has become a core part of Node.js software development. Thanks to npm, semver is embedded in the way we publish and link packages together to form simple libraries or complex applications. The relationship Node.js has with semver is evolving over time, just as the semver specification itself is evolving.

We'll be exploring semver in a series of articles starting with this primer. It's important that we, as the Node.js community, understand semver since it plays such a significant role in defining the way we build software.

What Is Semver?

Semver is a specification outlining a method of encoding the nature of change between releases of a "public interface", directly into the version string.

A public interface could be anything from an application programming interface (API), a command-line interface (CLI) or a graphical user interface (GUI). Anything that a third-party depends on having predictable interactions with should be versioned with semver. Semver could even be extended to physical interfaces, but we'll leave that as an exercise for your imagination.

Semver is a scheme for versioning interfaces for the benefit of their consumers. If a tool has multiple interfaces, e.g. an API and a CLI, these interfaces may be versioned independently. Although many applications do not consider their CLI to be part of their interface when versioning, a third party may depend on specific CLI behaviour in the same way they might depend on an API.

Semver Construction

A semver-compatible version is built from three numbers separated by periods (.). The three numbers are referred to as major, minor and patch, and are specified in that order. The combination of numbers represents an ordered version, where each of the three numbers is also ordered. A major version has a series of ordered minor versions, and a minor version has a series of ordered patch versions.

So:

Version 0.3.10 is ordered before 0.10.3
Version 0.1.1 is ordered before 1.0.0
Version 1.100.100 is ordered before 10.10.10
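The ordering above can be sketched in a few lines of JavaScript. This is only an illustration: the function names parseVersion and compareVersions are made up for this example, and the sketch handles plain MAJOR.MINOR.PATCH strings only, ignoring pre-release and metadata labels.

```javascript
// Parse "MAJOR.MINOR.PATCH" into an array of three integers.
function parseVersion(version) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) {
    throw new Error('Not a plain MAJOR.MINOR.PATCH version: ' + version);
  }
  return match.slice(1).map(Number);
}

// Return -1, 0 or 1, comparing major first, then minor, then patch.
function compareVersions(a, b) {
  const left = parseVersion(a);
  const right = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    if (left[i] !== right[i]) {
      return left[i] < right[i] ? -1 : 1;
    }
  }
  return 0;
}

console.log(compareVersions('0.3.10', '0.10.3'));      // -1
console.log(compareVersions('0.1.1', '1.0.0'));        // -1
console.log(compareVersions('1.100.100', '10.10.10')); // -1

// A plain string comparison gets the first pair wrong,
// because it compares character by character:
console.log('0.3.10' < '0.10.3'); // false
```

The numeric comparison is the important part: each position is an integer, not a string of digits.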

The semantic distinction between major, minor and patch is described succinctly at semver.org as:

Given a version number MAJOR.MINOR.PATCH, increment the:

MAJOR version when you make incompatible API changes,

MINOR version when you add functionality in a backwards-compatible manner, and

PATCH version when you make backwards-compatible bug fixes.
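As a rough sketch of these rules, the hypothetical bump helper below increments one position and resets the positions to its right to zero, which is what the specification requires when the minor or major number changes. The function name and the release labels are assumptions made for this example.

```javascript
// Return the next version string for a given kind of release.
// Assumes a plain "MAJOR.MINOR.PATCH" input with no pre-release label.
function bump(version, release) {
  const [major, minor, patch] = version.split('.').map(Number);
  switch (release) {
    case 'major': // incompatible API change
      return `${major + 1}.0.0`;
    case 'minor': // backwards-compatible new functionality
      return `${major}.${minor + 1}.0`;
    case 'patch': // backwards-compatible bug fix
      return `${major}.${minor}.${patch + 1}`;
    default:
      throw new Error('Unknown release type: ' + release);
  }
}

console.log(bump('1.4.2', 'patch')); // 1.4.3
console.log(bump('1.4.2', 'minor')); // 1.5.0
console.log(bump('1.4.2', 'major')); // 2.0.0
```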

Semver is important in Node.js because it's built into the way that npm manages package dependencies. What's more, semver ranges are almost universally used by package authors to define what dependency versions they want their packages to be bundled with when installed.

All packages published to npm are assumed to follow semver semantics.

Perhaps most ironically, npm itself is a poor example of semver adherence, but npm's complications with semver are historical, as are those of Node.js. However, the situation has been improving since the release of npm 2.0.0.

Semver Ranges

The concept of semver ranges as used by npm was inspired by Bundler, the npm of the Ruby ecosystem. For a Ruby application, semver ranges have a greater impact than they do in Node.js.

In Ruby, as in many other software platforms, only a single, global version of a gem (package) can be loaded throughout an entire application. Semver enables Bundler to perform the crucial step of negotiating a single agreeable version that satisfies all dependants simultaneously. If Bundler cannot find a single version of a dependency that simultaneously satisfies all dependants, the dependency simply cannot be installed without force.

Nowhere in the semver specification is there any explicit indication of how to consume semantically versioned packages. Installation strategies and range shorthands such as `*`, `~` and `^` are constructs introduced by semver implementations and package managers.

Node.js is a "no batteries included" platform: in order to use Node.js effectively you must opt in to using third-party packages. It is not unusual to use tens, if not hundreds, of dependencies within a single project. Semver ranges are, arguably, essential for enabling pragmatic dependency management.

When specifying a dependency, you can choose to use a fixed version number or a semver range. When using a fixed version, only that version will be installed, though note that this does not fix the ranges defined in the dependencies of your dependencies. Fixed versions should be avoided for reasons explained later in this article.

Semver ranges exist to permit newer versions of a package to be installed automatically. This is particularly useful when you're dealing with deeply nested dependencies. Important bug fixes can be distributed to dependants, and dependants of dependants, simply by signalling via the semver range. More about this later.

The simplest semver range is the "*" range, which accepts any version available, defaulting to the "latest". "*" should be avoided as it will happily install packages across major versions, i.e. with breaking changes.

The next form of a semver range specifies a single major version, or a major and minor version. "2" covers all minor and patch versions less than 3 and "2.4" covers all patch versions less than 2.5. These ranges can also be achieved more explicitly with an x or an * in variable patch and minor positions. For example: "2.x.x" or "2.4.*".
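To make the equivalence concrete, here is a minimal sketch that expands a partial or wildcard version into explicit bounds. The function name expandXRange is invented for this example, and the sketch ignores pre-release and metadata labels, so it should be read as an approximation of the behaviour described above rather than npm's actual implementation.

```javascript
// Expand "2", "2.4", "2.x.x" or "2.4.*" into explicit lower/upper bounds.
function expandXRange(range) {
  const isWild = (part) =>
    part === undefined || part === 'x' || part === 'X' || part === '*';
  const [major, minor, patch] = range.split('.');

  // "*" alone accepts anything.
  if (isWild(major)) return '>=0.0.0';

  const M = Number(major);
  // Only the major number is fixed: any minor and patch below the next major.
  if (isWild(minor)) return `>=${M}.0.0 <${M + 1}.0.0`;

  const m = Number(minor);
  // Major and minor are fixed: any patch below the next minor.
  if (isWild(patch)) return `>=${M}.${m}.0 <${M}.${m + 1}.0`;

  // All three positions given: a fixed version.
  return `${M}.${m}.${Number(patch)}`;
}

console.log(expandXRange('2'));     // >=2.0.0 <3.0.0
console.log(expandXRange('2.x.x')); // >=2.0.0 <3.0.0
console.log(expandXRange('2.4'));   // >=2.4.0 <2.5.0
console.log(expandXRange('2.4.*')); // >=2.4.0 <2.5.0
```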

Additionally, ranges can be specified explicitly with -, <, <=, > and >=. For example:

"1.2.3 - 2.3.4" is the same as ">=1.2.3 <=2.3.4" which specifies that the range can include all versions from, and including 1.2.3 all the way up to, and including 2.3.4.
">=1.2.0 <1.3.0" is similar to "1.2.x" (but not exactly the same, thanks to pre-release and metadata labels which are beyond the scope of this article).
"<1.0.0" only accepts versions in the "0.x.x" range.

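The explicit ranges above can be checked with a small matcher. The sketch below is an assumption-laden illustration, not npm's implementation: the satisfies and compare names are made up here, only plain MAJOR.MINOR.PATCH versions are supported, and a space between comparators is treated as "and".

```javascript
// Compare two plain versions numerically; returns -1, 0 or 1.
function compare(a, b) {
  const left = a.split('.').map(Number);
  const right = b.split('.').map(Number);
  for (let i = 0; i < 3; i++) {
    if (left[i] !== right[i]) return left[i] < right[i] ? -1 : 1;
  }
  return 0;
}

// Test a version against a range such as ">=1.2.0 <1.3.0" or "1.2.3 - 2.3.4".
function satisfies(version, range) {
  // Rewrite the hyphen form "A - B" as ">=A <=B".
  const hyphen = /^\s*(\S+)\s+-\s+(\S+)\s*$/.exec(range);
  const normalized = hyphen ? `>=${hyphen[1]} <=${hyphen[2]}` : range;

  // Every comparator in the set must hold.
  return normalized.trim().split(/\s+/).every((comparator) => {
    const match = /^(>=|<=|>|<|=)?(\d+\.\d+\.\d+)$/.exec(comparator);
    if (!match) throw new Error('Unsupported comparator: ' + comparator);
    const order = compare(version, match[2]);
    switch (match[1]) {
      case '>=': return order >= 0;
      case '<=': return order <= 0;
      case '>': return order > 0;
      case '<': return order < 0;
      default: return order === 0; // "=" or no operator: exact match
    }
  });
}

console.log(satisfies('2.3.4', '1.2.3 - 2.3.4'));  // true
console.log(satisfies('2.3.5', '1.2.3 - 2.3.4'));  // false
console.log(satisfies('1.2.9', '>=1.2.0 <1.3.0')); // true
console.log(satisfies('1.0.0', '<1.0.0'));         // false
```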
Tilde & Caret Shorthand

Node.js' implementation of semver also introduces shorthand ranges: ~ (tilde) and ^ (caret). The general explanation for how these work is:

Prefixing a single semver version string with the ~ character defines a range of acceptable versions that include all patch versions from the one specified up to, but not including, the next minor version. "~1.2.3" can be approximately expanded as ">=1.2.3 <1.3.0".
Prefixing a single semver version string with the ^ character defines a range of acceptable versions that include all patch and minor versions from the ones specified up to, but not including, the next major version. So "^1.2.3" can be approximately expanded as ">=1.2.3 <2.0.0".
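The two approximate expansions can be written down directly. In this sketch the function name expandShorthand is hypothetical, pre-release labels are ignored, and the caret is deliberately only handled for major versions of 1 and above, since 0.x.x versions are special-cased.

```javascript
// Expand "~X.Y.Z" or "^X.Y.Z" into an approximate ">=lower <upper" range.
function expandShorthand(range) {
  const match = /^([~^])(\d+)\.(\d+)\.(\d+)$/.exec(range);
  if (!match) {
    throw new Error('Expected ~X.Y.Z or ^X.Y.Z, got: ' + range);
  }
  const operator = match[1];
  const [major, minor, patch] = match.slice(2).map(Number);
  const lower = `>=${major}.${minor}.${patch}`;

  if (operator === '~') {
    // Tilde: patch may float, up to (not including) the next minor.
    return `${lower} <${major}.${minor + 1}.0`;
  }
  if (major === 0) {
    // Caret on 0.x.x follows different rules; out of scope for this sketch.
    throw new Error('^0.x.x is special-cased and not handled here');
  }
  // Caret: minor and patch may float, up to (not including) the next major.
  return `${lower} <${major + 1}.0.0`;
}

console.log(expandShorthand('~1.2.3')); // >=1.2.3 <1.3.0
console.log(expandShorthand('^1.2.3')); // >=1.2.3 <2.0.0
```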

0.x.x Versions

Complications arise with the use of 0.x.x versions, where the rules get messy due to the nature of the special 0 major version number in the semver specification. The major version 0 is supposed to be reserved for "initial development", where "anything may change at any time", so the "patch" and "minor, non-breaking changes" essentially have no meaning.

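Unlike ~, the ^ operator is special-cased for a major version of 0. It was originally essentially a no-op, translating to exactly that version rather than a full range, so "^0.2.3" was equal to just "0.2.3" and no more. Later releases of npm's semver implementation relaxed this so that the caret permits changes that do not modify the left-most non-zero number: "^0.2.3" expands to ">=0.2.3 <0.3.0", while "^0.0.3" expands to ">=0.0.3 <0.0.4".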

There was some discussion of changing the semantics of the ~ operator for 0.x.x versions but it's too late to make that change now.

The Case For Semver Ranges in Node.js

Initially, it might be difficult to see why ranges need to be a thing at all. But consider a scenario where a dependency three levels deep in your application is updated to include a critical bug-fix:

fruitshop-app
  └─┬fruit@1.0.0
    └─┬apple@1.0.0
      └──seed@1.0.0 < needs critical bug-fix

A bug-fix release should occur as a patch bump, so seed@1.0.0 would be replaced with seed@1.0.1 when the fix is published.

Why you never use fixed semver in libraries

If only fixed versions were used in the package.json file of each package, then for fruitshop-app to receive the seed@1.0.1 bug-fix, the following sequence must be executed in series:

seed fixes the bug and publishes seed@1.0.1
apple updates to seed@1.0.1 and publishes apple@1.0.1
fruit updates to apple@1.0.1, publishes fruit@1.0.1
fruitshop-app updates to fruit@1.0.1
fruitshop-app finally receives seed@1.0.1 through fruit@1.0.1 and apple@1.0.1 on next clean npm install.

There is no way to shortcut this without hacks. It's not hard to imagine how poorly the pattern scales as the number of packages increases: whenever any dependency in the hierarchy is updated, every parent in the chain using fixed versions must release a new version.

Updates can, and do, take weeks or months to bubble up, particularly in an ecosystem as diverse and distributed as Node.js. The process may involve multiple authors of varying levels of responsiveness and willingness.

Fixed versioning slows progress to a crawl and requires increased micro-management of dependency versioning. Thankfully fixed versioning is not widespread.

Now consider what happens if apple instead uses a flexible patch range via the ~ operator:

{
  "name": "apple",
  "version": "1.0.0",
  "dependencies": {
    "seed": "~1.0.0"
  }
}

Compare the workflow required for fruitshop-app to receive the seed@1.0.1 bug-fix:

seed adds bug-fix and publishes seed@1.0.1
fruitshop-app gets seed@1.0.1 on next clean npm install because apple accepts all patch versions within 1.0.x

That's it. None of the intermediate packages need be involved.
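The difference between the two workflows can be simulated with a toy resolver. Everything here is an assumption made for illustration: the resolve function, the arrays standing in for the registry's list of published seed versions, and the restriction to fixed versions and ~ ranges. It is not how npm is implemented, only a sketch of the outcome.

```javascript
// Split "1.0.1" into [1, 0, 1].
function parse(version) {
  return version.split('.').map(Number);
}

// Pick the version that would be installed for a dependency spec,
// given the list of versions published so far. Supports a fixed
// version ("1.0.0") or a tilde range ("~1.0.0") only.
function resolve(published, spec) {
  if (spec[0] !== '~') {
    // Fixed version: only an exact match will do.
    return published.includes(spec) ? spec : null;
  }
  const [major, minor, patch] = parse(spec.slice(1));
  const matches = published
    .map(parse)
    .filter((v) => v[0] === major && v[1] === minor && v[2] >= patch)
    .sort((a, b) => a[2] - b[2]);
  // Highest matching patch version wins.
  return matches.length ? matches[matches.length - 1].join('.') : null;
}

const beforeFix = ['1.0.0'];
const afterFix = ['1.0.0', '1.0.1'];

console.log(resolve(beforeFix, '~1.0.0')); // 1.0.0
console.log(resolve(afterFix, '~1.0.0'));  // 1.0.1 (bug-fix picked up)
console.log(resolve(afterFix, '1.0.0'));   // 1.0.0 (fixed version stays put)
```

With the tilde range, publishing seed@1.0.1 changes the result of the next clean install without apple changing at all; with the fixed version, nothing moves until apple publishes again.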

A responsible Open Source community member might follow up with pull requests to the intermediate packages to update their minimum versions, but this can be done in parallel and does not prevent our application from consuming the updated package.

Temporary Fixes and Forks

Of course, in the above scenario one can step around semver entirely, hack together a temporary fix to a package, and then distribute the "fixed" version of the dependency using one of several approaches:

Using package.json's "bundledDependencies", as npm itself does (npm has good reason to do this: you shouldn't need a package manager to install the package manager!). This only works for packages passed through npm publish.
Remote package URLs instead of versions, such as a recent update to the level-sublevel package which required an updated, but not-yet-released version of the levelup package.
Publishing your own "fixed" version to npm; the registry is littered with duplicate packages where small changes are required because of uncooperative maintainers or disagreements. They are usually indicated by a "-username" in the name where the re-publisher tries to make it clear it's a simple fix-fork. A cleaner and more modern approach is to use scoped packages.

In all cases you also need to remember to swap things back eventually if and when the update has propagated.

Also consider that, as a package author, you are unlikely to even know that a critical bug was fixed in a dependency of a dependency of a dependency. Keeping abreast of such changes across all of your dependencies would require constant attention and much better communication between package authors. This does not scale!

Keeping Downstream Users Informed

Ideally, only bug-free versions of packages would be used as dependencies. Until recently, npm permitted publishing new code over the same version using npm publish --force. This was commonly used to publish over a broken version of a package, but this effectively defeats the entire purpose of software versioning:

"assigning … unique version names … to unique states of computer software" (source)

With this in mind, npm publish --force no longer permits publishing different code with the same version. The registry guarantees it will deliver the same artifact for the same version string, unless it is unpublished, in which case you won't get anything.

If you identify a bug, just bump the patch version and publish again. This is no big deal for dependants using flexible semver ranges. When doing this, also consider whether it makes sense for dependants to ever use the previous, buggy version again. If the bug is serious enough, then after publishing the bug-fix, npm deprecate the buggy version(s), ideally with a message explaining why they were deprecated:

$ npm deprecate my-thing@"< 0.2.3" \
  "critical bug fixed in v0.2.3, see http://link.to/more-info"

Deprecation should be preferred over unpublishing, since deprecation only produces a warning on installation rather than preventing installation entirely. Unpublishing should be reserved for catastrophic emergencies where a version simply must not be installed, such as one containing an accidental rm -rf /.

Semver Caveats

There's some dichotomy between the machine-enforced, rigid consumption of semver by npm, and the entirely unpoliced act of adhering to semver when publishing. Semver will always be potentially error prone while humans are responsible for adhering to the specification.

Semver is an idealist that simply ignores the fallibility of humans: consumers are entirely at the mercy of whether package authors follow semver properly. On the other hand, human fallibility is one of the very problems that semver attempts to smooth over, by allowing bug-fixes to be installed transparently.

"What if semver allows a regression or a critical bug, such as a security vulnerability, to be installed?" (Semver skeptic)

While this is a valid concern, the responsibility for managing what code is deployed into production is in the hands of the developers, not npm. Semver is a tool for development only. In other words, if you are worried about semver introducing bugs in production, you're using semver wrong!

There are multiple ways to deal with versioning for deployment:

Bundling dependencies using package.json's "bundledDependencies"
Using npm shrinkwrap to create a fixed-in-time snapshot of the dependency hierarchy
Checking dependencies into version control along with the application

Discussion of these options, and more, will have to be left to future articles.

In the next article on semver, we'll take a closer look at the ^ operator for specifying semver ranges in package.json. This is the new default for saving version ranges but is currently not well understood.
