Faster builds with highly parallel GitHub Actions

It’s a near-universal truth of continuous integration builds: no matter how fast they are, they’re just never quite fast enough.

On the Testcontainers project, we feel this pain particularly acutely. Testcontainers is an integration testing tool, so to test it accurately we have to have a wide set of integration tests. Each integration test can involve pulling Docker images, starting up servers and waiting for them to be ready. To make things even more difficult, we support a fairly eye-watering range of integrations. For example, just for our database support we have to test against 14 different databases, ranging from lean to … pretty heavyweight.

This all adds up to a very long build!

As a humble open source project without commercial backing, we’re also keen to keep costs down. This means we rely on the generosity of cloud CI providers’ free plans, which are often resource constrained.

Build optimisations in the past

In total, we currently have 51 Gradle subprojects within our build. A couple of core components are depended upon by a range of modules, each of which is a Testcontainers implementation for a particular product.

Long ago, we managed to eliminate some of the build time using the myniva/gradle-s3-build-cache remote Gradle cache plugin. Every build on our master branch populates a globally-readable Gradle cache, so that CI jobs and developers working locally do not need to rebuild, or retest, unchanged modules. This improved our best-case build performance: if most of our 5 parallel CI jobs (described next) did not involve changes, they would be fairly short. But a change to the core module (that all others depend upon) would cause all 5 jobs to run a full rebuild of their modules.

At another point in the past we had manually defined those 5 separate jobs on our primary CircleCI build: core, examples, the selenium module, the jdbc modules, and a no-jdbc-no-selenium job that covered basically everything else.

A fairly typical build on CircleCI. It’s not as slow as it would be without parallelization, but there’s clear room for improvement.

These jobs might have been reasonably balanced back then, but over time, as we’ve added user-contributed modules, some build jobs have grown. For example, the no-jdbc-no-selenium job took a whopping 27 minutes if a full build was required.

We obviously needed to re-balance our parallel build jobs, that is, do a better job of bin-packing our Gradle subprojects into a set of parallel jobs. We could do that manually, right? Well, it might work at first, but it seems inevitable that we’d end up in a similar situation one day as we add new modules or test durations evolve. We quickly realised that we could do better.

Variation leads to waste: even running all five jobs in parallel, the results from the fastest build jobs are essentially meaningless before the longest job completes. While waiting for the longest job to complete we’re essentially wasting time.

Leave the bin-packing to a queue

What if we essentially adopted queue-based load leveling: split our build into far more than five jobs, and let the CI executors compete to pick up new jobs as soon as there’s capacity? Our bin-packing problem would be solved automatically, without up-front design.

If we were to create a distinct build job for each Gradle subproject, we could achieve the tightest bin-packing:

Fine-grained jobs bin-pack far better, even if execution times vary significantly
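
To make the effect concrete, here is a minimal simulation sketch. The durations and the five-way grouping are made-up numbers chosen for illustration, not measurements from our build, and `static_makespan` and `queue_makespan` are hypothetical helper names. It compares a hand-maintained split against executors that simply pull the next job from a shared queue.

```python
import heapq

def static_makespan(groups):
    """Wall-clock time when each hand-picked group runs on its own executor."""
    return max(sum(group) for group in groups)

def queue_makespan(durations, executors):
    """Wall-clock time when idle executors pull the next job from a shared queue."""
    # Each heap entry is the time at which one executor becomes free.
    free_at = [0] * executors
    heapq.heapify(free_at)
    for duration in durations:
        start = heapq.heappop(free_at)  # the earliest available executor takes the job
        heapq.heappush(free_at, start + duration)
    return max(free_at)

# Hypothetical per-subproject durations in minutes (not measured data).
durations = [12, 3, 9, 2, 4, 8, 1, 5, 7, 3, 2, 4]
# A hand-maintained split into five jobs that has drifted out of balance.
groups = [[12], [3, 9], [2, 4], [8, 1, 5, 7, 3], [2, 4]]

print(static_makespan(groups))       # 24
print(queue_makespan(durations, 5))  # 14
```

With the same five executors and the same 60 minutes of total work, the queue finishes in 14 minutes rather than 24, close to the 12-minute floor set by the single longest job. Nobody had to decide up front which subproject belongs in which job.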

Iterative implementation

We chose to try this pattern as part of our work to migrate our main CI jobs to GitHub Actions.

An initial (hacky!) bash script proved the concept. The script generated a massive workflow YAML file with one job per Gradle subproject, each running that subproject’s check task. In theory, this bash script could be run periodically, or whenever a new subproject is added to our build. But we could do better than that!

We quickly iterated on this, exploiting an extremely powerful feature of GitHub Actions workflows: dynamic build matrices. Simply put, this allows one job in a workflow to programmatically generate a matrix of parameters to be run in a subsequent job. The GitHub Actions documentation gives an example.

Our implementation looks a little like this (summarized):


name: CI

on:
  pull_request: {}
  push: { branches: [ master ] }

jobs:
  find_gradle_jobs:
    runs-on: ubuntu-18.04
    outputs:
      # Declare our output variable
      matrix: ${{ steps.set-matrix.outputs.matrix }}
    steps:
      - uses: actions/checkout@v2
      - id: set-matrix
        # The below outputs a JSON array of check tasks for each subproject
        # and uses GitHub Actions magic (::set-output) to set an output
        # variable (newer runners write to $GITHUB_OUTPUT instead)
        run: |
          TASKS=$(./gradlew --no-daemon --parallel -q testMatrix)
          echo $TASKS
          echo "::set-output name=matrix::{\"gradle_args\":$TASKS}"

  check:
    # We need the other job's output
    needs: find_gradle_jobs
    strategy:
      fail-fast: false
      # Read the variable, parsing as JSON, so that `matrix` becomes a
      # list of check tasks
      matrix: ${{ fromJson(needs.find_gradle_jobs.outputs.matrix) }}
    runs-on: ubuntu-18.04
    steps:
      - uses: actions/checkout@v2
      - name: Build and test with Gradle (${{matrix.gradle_args}})
        # Matrix execution will cause the below to be run many times,
        # one for each check task to be run
        run: |
          ./gradlew --no-daemon --continue \
            --scan --info ${{matrix.gradle_args}}

The testMatrix task is a custom task which emits the list of subprojects’ check tasks in JSON format, and looks like:

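gradle/ci-support.gradle (a subset)
import groovy.json.JsonOutput

task testMatrix {
    project.afterEvaluate {
        // Assumed body: look up each subproject's check task (null if it has none)
        def checkTasks = subprojects.collect {
            it.tasks.findByName("check")
        }.findAll { it != null }

        doLast {
            // Assumed output step: print the task paths as a JSON array
            println(JsonOutput.toJson(checkTasks.collect { it.path }))
        }
    }
}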
Part of the list of build jobs generated on-the-fly

As a result of this small amount of script work, we have an automatically generated list of jobs for GitHub Actions to execute. This list will never go out of date, because it is based on Gradle’s own view of the subprojects.
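
The hand-off between the two workflow jobs is just a small JSON object. As a rough sketch of that data flow (this is neither our Gradle task nor GitHub's implementation, and `matrix_payload` and `expand_matrix` are hypothetical names), the following wraps a list of task paths under the `gradle_args` key and then expands it the way a single-key matrix does, one job per value:

```python
import json

def matrix_payload(task_paths):
    """Wrap Gradle task paths in the object that the workflow parses with fromJson."""
    return json.dumps({"gradle_args": task_paths}, separators=(",", ":"))

def expand_matrix(payload):
    """Mimic expansion of a single-key matrix: one job per listed value."""
    matrix = json.loads(payload)
    return [{"gradle_args": value} for value in matrix["gradle_args"]]

tasks = [":testcontainers:check", ":mysql:check", ":selenium:check"]
payload = matrix_payload(tasks)
print(payload)

for job in expand_matrix(payload):
    # Matches the job naming seen in the Actions UI, e.g. "check (:mysql:check)"
    print(f"check ({job['gradle_args']})")
```

Each entry in the array becomes its own `check` job, so adding a subproject to the Gradle build adds a CI job without anyone touching the workflow file.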

It gets better

Testcontainers co-maintainer Sergei Egorov is a Gradle magician, and delivered the icing on the cake…

With our new dynamic matrix in place, we’d have a job for every Gradle subproject. Many of these might execute quickly if they found that a cached result already existed. It’s perhaps a little wasteful to have CI jobs that do nothing, though.

Sergei quickly realised that Gradle’s build cache mechanism already has the ability to detect which subprojects have been modified or need to be tested. This could be used by our testMatrix task, to avoid generating a CI job altogether for unchanged modules. After some amendments, the final testMatrix task works extremely well in preventing unnecessary CI jobs. For example, builds for changes to documentation or ‘leaf-node’ modules can complete in a far shorter time.
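
The idea can be illustrated with a toy model. This is not how Gradle computes its cache keys, and it ignores dependencies between modules. It only shows the principle: derive a key from a module's inputs, and emit a check task only when that key has no cached result. All names and file contents below are hypothetical.

```python
import hashlib

def cache_key(inputs):
    """Hash a module's inputs (file name -> content); identical inputs give identical keys."""
    digest = hashlib.sha256()
    for name in sorted(inputs):
        digest.update(name.encode())
        digest.update(b"\0")
        digest.update(inputs[name].encode())
        digest.update(b"\0")
    return digest.hexdigest()

def tasks_needing_work(modules, cached_keys):
    """Return check tasks only for modules whose current key is not already cached."""
    return [
        f":{name}:check"
        for name, inputs in modules.items()
        if cache_key(inputs) not in cached_keys
    ]

# Hypothetical module inputs as of the last build on the main branch.
previous = {
    "mysql": {"MySQLContainer.java": "v1"},
    "localstack": {"LocalStackContainer.java": "v1"},
}
cached_keys = {cache_key(inputs) for inputs in previous.values()}

# A PR that touches only the localstack module.
current = {
    "mysql": {"MySQLContainer.java": "v1"},
    "localstack": {"LocalStackContainer.java": "v2"},
}
print(tasks_needing_work(current, cached_keys))  # [':localstack:check']
```

In the real build the cache key also has to reflect upstream modules, which is why a change to the core module still triggers a check job for every module that depends on it.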

Summary: What have we done?

The results so far

In short, we’re seeing massive improvements in build times for PRs 🎉

Here’s one example, a dependency bump in the Localstack module. Like many PRs, this affects a single module:

Localstack module CI timings

We can see:

As-is, this PR had complete feedback in 5 minutes, a drastic reduction from the build times we were seeing previously!

We perceived that build times had improved, but is this truly the case? Let’s analyse some recent builds.

Quantitative analysis

Is there an improvement upon our original build jobs?

The plot below compares build durations between our previous and new CI jobs. As hoped for, and matching our subjective experiences, there is a dramatic improvement, with the median build duration falling from 824 to 500 seconds:

Distribution of build durations, in seconds

| | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|
| CircleCI | 35 | 615 | 824 | 1831 | 75814 |
| GitHub Actions | 79 | 365 | 500 | 1539 | 2711 |
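
To put the table above in relative terms, this short sketch computes the percentage reduction at each quantile. The figures are copied from the table; `reduction_percent` is just an illustrative helper.

```python
# Build durations in seconds, copied from the table above.
circleci = {"min": 35, "25%": 615, "50%": 824, "75%": 1831, "max": 75814}
github_actions = {"min": 79, "25%": 365, "50%": 500, "75%": 1539, "max": 2711}

def reduction_percent(before, after):
    """Percentage by which `after` is shorter than `before` (negative means slower)."""
    return round(100 * (before - after) / before, 1)

for quantile in circleci:
    change = reduction_percent(circleci[quantile], github_actions[quantile])
    print(f"{quantile:>4}: {change:+.1f}%")
```

The median and lower quartile are each roughly 40% shorter and the worst case shrinks by about 96%, while the upper quartile improves by only about 16%. The fastest build is actually slower on GitHub Actions (79 seconds against 35).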

Is there a difference in run duration between successful and failed builds?

CI builds have two roles to play: providing assurance that a PR/commit is reliable, and providing a signal when it is not. Clearly we’d like both of these scenarios to be quick.

Intuitively, PRs without many changes will tend to run the fewest tests and succeed more often. This appears to be the case in our data. Builds that go on to fail tend to take longer to do so.

We’re happy that many successful builds complete quickly, but less happy that there’s slower feedback for build failures.

Distribution of build durations, in seconds

What are the slowest modules to build?

Recall that our no-jdbc-no-selenium ‘miscellaneous modules’ build used to be the longest running, at up to 27 minutes. Having split into more parallel jobs, we’ve removed this bottleneck on our build performance, but has the bottleneck shifted elsewhere?

Analysing our new build durations on a per-module basis we can see that it has indeed:

Build duration of slowest ten modules, in seconds.

| jobname | median duration | count |
|---|---|---|
| check (:testcontainers:check) | 1123.0 | 62 |
| check (:mysql:check) | 513.0 | 54 |
| check (:selenium:check) | 484.0 | 53 |
| check (:db2:check) | 436.5 | 54 |
| check (:ongdb:check) | 404.0 | 1 |
| check (:mariadb:check) | 343.5 | 54 |
| check (:presto:check) | 331.0 | 55 |
| check (:cassandra:check) | 322.0 | 53 |
| additional_checks | 311.0 | 1 |
| check (:docs:examples:junit4:generic:check) | 288.0 | 50 |

We can see that the :testcontainers:check build job (which is our core module) now takes the longest to build, with a median duration of around 19 minutes. This means that builds for PRs that modify the core module are still going to take at least this long, even though the other modules will quickly run in parallel.

Not every PR touches core, but when one does, it’s likely to take some time.

We believe this accounts for the ‘bulge’ of build durations seen in our distribution plots above between 1000 and 1500 seconds, which we’d like to try and remove.

So, with this in mind, our next steps will focus on the core module’s test performance: making our tests more efficient, or splitting the module’s tests in a way that helps us run them in parallel.
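
As a back-of-the-envelope sketch of what splitting could buy, the following takes the four slowest medians from the table above and asks which job bounds the build. It assumes, optimistically, that the core module's tests could be divided into equal shards with no per-shard overhead; `slowest` and `split_evenly` are hypothetical helpers.

```python
# Median durations in seconds for the four slowest check jobs, from the table above.
medians = {
    ":testcontainers:check": 1123.0,
    ":mysql:check": 513.0,
    ":selenium:check": 484.0,
    ":db2:check": 436.5,
}

def slowest(durations):
    """Return (job, seconds) for the job that bounds the parallel build time."""
    job = max(durations, key=durations.get)
    return job, durations[job]

def split_evenly(durations, job, shards):
    """Replace one job with `shards` equal parts (assumes a perfect split, no overhead)."""
    result = {name: secs for name, secs in durations.items() if name != job}
    for i in range(1, shards + 1):
        result[f"{job} [{i}/{shards}]"] = durations[job] / shards
    return result

job, seconds = slowest(medians)
print(job, round(seconds / 60, 1))  # :testcontainers:check 18.7

for shards in (2, 3):
    job, seconds = slowest(split_evenly(medians, ":testcontainers:check", shards))
    print(shards, job, round(seconds, 1))
```

Under these assumptions, two shards would still leave core as the bottleneck at about 560 seconds. With three shards the bottleneck would move to :mysql:check at 513 seconds, so splitting further would bring no benefit unless that job got faster too.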

Conclusion and next steps

Go forth and parallelise!

