[zeromq-dev] C4 and CI

John Morris john at zultron.com
Mon Nov 16 07:35:33 CET 2015


Hello list,

We've been using parts of C4 to manage the Machinekit project, 
open-source machine control software.  I love the idea of C4, and really 
buy into its basic aim of reducing friction in the development process 
in order to scale up the developer community and, following that, the 
user community, whose size is a primary indicator of project success.

However, it turns out C4 is a challenging idea that most people, myself 
included, seem to have trouble swallowing in its entirety.  Accordingly, 
we really only follow parts of C4, which is a disappointment to me.

Part of this is my own fault.  I run a Buildbot for the project that 
builds many different configurations, i.e. combinations of OS, hardware 
architecture and real-time thread environment.  It's especially because 
of the latter that using a CI system on public infrastructure, like 
Travis CI, isn't possible:  running regression tests under the several 
RT thread systems (RT_PREEMPT, Xenomai, RTAI) requires special kernel 
support unavailable in those environments, so it's necessary to run the 
CI system on bare metal or a VM with custom kernels.  In order to 
support many OS, arch and RT environments, my Buildbot is extremely 
complex and essentially unreproducible.

As a result, despite my clear communication that it's a system 
contributed by a third party (my company), the community tends to see it 
as the project's "official" CI system, dictating "officially supported" 
configurations and providing the "official" package stream.  I shouldn't 
have to tell this audience how this undermines C4; besides, this CI 
system is a single point of failure for the project and takes too much 
of my own energy to maintain.

So, drawing on the last two years' lessons about what (IMO) a C4 
community needs in CI infrastructure (especially when public CI services 
don't make sense), I have a plan for a CI system that seems a better fit 
with the spirit of C4 and solves other practical issues at the same time.

Key to the idea is scalability by distributing the burden across many 
members of the community.  A trivially reproducible CI system, such as a 
Buildbot instance in a Docker container (on private hardware or in the 
cloud), may be set up by any community member to build, test and package 
a single favorite configuration, for example Debian Jessie on an RPi2 
with an RT_PREEMPT kernel, pointing at the official git repo's master 
branch.  Setup instructions should be as short as "install Docker; clone 
this git repo; edit the configuration; run the Docker container".
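
To make that concrete, here is a minimal sketch of what such a 
per-configuration Buildbot master.cfg might look like.  It uses the 
buildbot.plugins API; the repo URL, worker name, and build commands are 
illustrative placeholders, not Machinekit's actual build recipe.

# master.cfg -- minimal sketch of a single-configuration Buildbot master
from buildbot.plugins import changes, schedulers, steps, util, worker

c = BuildmasterConfig = {}

# One worker, e.g. a Docker container on a community member's hardware
c['workers'] = [worker.Worker("jessie-rpi2-rt_preempt", "password")]
c['protocols'] = {'pb': {'port': 9989}}

# Watch the official repo's master branch (URL is an assumption)
REPO = 'https://github.com/machinekit/machinekit.git'
c['change_source'] = [changes.GitPoller(repourl=REPO, branches=['master'],
                                        pollInterval=300)]

c['schedulers'] = [schedulers.SingleBranchScheduler(
    name="master-ci",
    change_filter=util.ChangeFilter(branch='master'),
    treeStableTimer=60,
    builderNames=["jessie-rpi2-rt_preempt"])]

# Build and test exactly one favorite configuration
factory = util.BuildFactory()
factory.addStep(steps.Git(repourl=REPO, mode='incremental'))
factory.addStep(steps.ShellCommand(command=["./configure"]))  # placeholder flags
factory.addStep(steps.ShellCommand(command=["make", "-j2"]))
factory.addStep(steps.ShellCommand(command=["make", "test"]))

c['builders'] = [util.BuilderConfig(
    name="jessie-rpi2-rt_preempt",
    workernames=["jessie-rpi2-rt_preempt"],
    factory=factory)]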

For each PR, the system builds the code for, and tests it in, the 
configured environment (these duties could be separated).  The 
build/test results are then aggregated (exactly how is TBD) with those 
of other contributed CI systems running different configurations, and 
the PR is updated with a pass/fail status (or a list of statuses), which 
both Contributors and Maintainers may use to check for and diagnose 
problems.
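
One possible shape for that reporting step, assuming results go back to 
GitHub as commit statuses with one "context" per configuration (the repo 
name, token handling and context string below are all assumptions on my 
part):

# Sketch: report one configuration's pass/fail result as a GitHub commit
# status on the PR's head SHA.  Repo, token and context are illustrative.
import os
import requests

def report_status(sha, passed, log_url):
    url = "https://api.github.com/repos/machinekit/machinekit/statuses/" + sha
    payload = {
        "state": "success" if passed else "failure",
        "target_url": log_url,                   # link back to this builder's logs
        "description": "build/test for this configuration",
        "context": "ci/jessie-rpi2-rt_preempt",  # one context per configuration
    }
    resp = requests.post(
        url, json=payload,
        headers={"Authorization": "token " + os.environ["GITHUB_TOKEN"]})
    resp.raise_for_status()

Maintainers would then see one status line per contributed configuration 
on the PR, rather than a single monolithic pass/fail.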

For each merge, the system builds binary packages and updates a package 
repository.  This repo may be published and advertised to other 
community members interested in the same configuration.
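
For the packaging step, one possible sketch (assuming Debian packaging 
in the source tree and a reprepro-managed apt repository; paths and 
distribution names are placeholders) would be to append steps like these 
to the merge builder's factory:

# Extra steps a merge-triggered builder might run to build .debs and
# publish them to a reprepro-managed apt repo (paths are assumptions).
from buildbot.plugins import steps

packaging_steps = [
    # Build unsigned binary packages; dpkg-buildpackage drops them in ..
    steps.ShellCommand(command=["dpkg-buildpackage", "-b", "-uc", "-us"]),
    # Add the resulting .debs to the apt repository for this configuration
    steps.ShellCommand(
        command="reprepro -b /srv/apt/jessie-rpi2 includedeb jessie ../*.deb"),
]
# for step in packaging_steps: factory.addStep(step)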

In this way, anyone is able to set up a CI system for a particular 
configuration, and if many people do so, the burden of covering many 
target configurations is spread across the community.

I'm so enamored with this idea partly because of how it fits with C4:

- Distributing the CI system across the community scales up the number 
of configurations built and tested without overburdening any one 
community member.
- No one person dictates what configurations are officially supported by 
the project:  anyone can contribute build results for any configuration, 
and while Contributors and Maintainers will see when a PR breaks that 
configuration, ultimately it's up to the configuration's champion to 
work with the community to ensure it continues to build.
- Third-party stabilization forks become trivial to set up and publish 
packages for; simply set up a new CI system and point it at the fork's 
repo on GitHub.  This especially suits vendors wanting to ship 
Machinekit on their machine controller hardware.

As of now, I've implemented many parts that would go into this system, 
but many other parts are missing, and it's not a top priority for me. 
That puts this idea in the "gedankenexperiment" category, but I'm still 
curious how ZeroMQ community members will react, assuming you've made it 
this far into the mail!

An update since I started writing this:  my Buildbot's ARM builder 
broke, and I've decided not to resuscitate it.  Some others in the 
project have decided to set up a new CI system based around openSUSE's 
public Open Build Service (OBS) instance.  My earlier experiments with 
building Debian packages on OBS were ugly, but it turns out it can be 
made to work.  Building on OBS satisfies some of my proposed 
requirements for a C4 project CI system, especially that it is trivially 
reproducible.  However, in practice it will take a lot of work to nail 
down the entire build flow, and the question of running unit tests in 
various environments is unresolved.

	John


