[zeromq-dev] C4 and CI
John Morris
john at zultron.com
Mon Nov 16 07:35:33 CET 2015
Hello list,
We've been using parts of C4 to manage the Machinekit project,
open-source machine control software. I love the idea of C4, and really
buy into its basic ideas of reducing friction of the development process
in order to scale the developer and, following that, the user community,
the size of which is a primary indicator of project success.
However, it turns out C4 is a very challenging idea that most people
seem to have trouble swallowing in its entirety, as I have.
Accordingly, we really only follow parts of C4, which is a
disappointment to me.
Part of this is my own fault. I run a Buildbot for the project that
builds many different configurations, i.e. combinations of OS, hardware
architecture and real-time thread environment. It's especially because
of the latter that using a CI system on public infrastructure, like
Travis CI, isn't possible: running regression tests under the several
RT thread systems (RT_PREEMPT, Xenomai, RTAI) requires special kernel
support unavailable in those environments, so it's necessary to run the
CI system on bare metal or a VM with custom kernels. In order to
support many OS, arch and RT environments, my Buildbot is extremely
complex and essentially unreproducible. As a result, despite my clear
communication that it's a system contributed by a third party (my
company), the community tends to see it as the project's "official" CI
system, dictating "officially-supported" configurations and providing
the "official" package stream. I shouldn't have to tell this audience
how this undermines C4. Besides, this CI system is a SPOF for the
project and is taking too much of my own energy to maintain.
So, taking the last two years' lessons learned about what (IMO) a C4
community needs in a CI infrastructure (esp. when public CI services
don't make sense), I have a plan for a CI system that seems a better fit
with the spirit of C4, and solves other practical issues at the same time.
Key to the idea is scalability by distributing the burden across many
members of the community. A trivially-reproducible CI system, such as a
Buildbot instance in a Docker container (either on private hardware or
the cloud), may be set up by any community member to build/test/package
one single particular favorite configuration, for example Debian Jessie
on RPi2 with an RT_PREEMPT kernel, pointed at the official git repo's master
branch. Setup instructions should be as short as "install Docker;
clone this git repo; edit the configuration; run the Docker container".
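
To make the "edit the configuration" step concrete, here is a minimal Python sketch of what one contributed system's configuration might hold. Every name in it (BuilderConfig, its fields, the label format, the placeholder repo URL) is an assumption made for illustration; nothing like it exists in Machinekit or Buildbot as written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuilderConfig:
    """The single target configuration one contributed CI system covers.

    Hypothetical structure: the field names are illustrative only.
    """
    distro: str            # e.g. "debian-jessie"
    hardware: str          # e.g. "rpi2"
    rt_flavor: str         # e.g. "rt-preempt", "xenomai" or "rtai"
    repo_url: str          # git repo this system watches
    branch: str = "master"

    def label(self) -> str:
        # Short identifier used when reporting results for this configuration.
        return f"{self.distro}/{self.hardware}/{self.rt_flavor}"


# The example configuration from the text; the URL is a placeholder.
example = BuilderConfig(
    distro="debian-jessie",
    hardware="rpi2",
    rt_flavor="rt-preempt",
    repo_url="https://example.invalid/project.git",
)
print(example.label())
```

Under this sketch, a different configuration or a different repo is a matter of changing field values, not of changing the system itself.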
For each PR, the system builds the code for the configured environment
and tests it there (these duties could be separated). The
build/test results are then aggregated (exactly how is TBD) with those
of other contributed CI systems (with different configurations), and the
PR is updated with one or more pass/fail statuses, which both
Contributors and Maintainers may use to check for and diagnose problems.
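
Since the aggregation step is still TBD, the following is only one possible shape for it, sketched in Python. It assumes each contributed system reports a configuration label and a pass/fail flag for a PR, and that a later report for the same configuration replaces an earlier one. BuildResult, aggregate and the summary keys are hypothetical names, not part of any existing tool.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BuildResult:
    """One report from one contributed CI system (hypothetical format)."""
    config: str        # configuration label, e.g. "debian-jessie/rpi2/rt-preempt"
    passed: bool       # True if build and tests succeeded
    log_url: str = ""  # optional link to the full log


def aggregate(results: List[BuildResult]) -> Dict[str, object]:
    """Collapse per-configuration reports into one summary for a PR."""
    latest: Dict[str, BuildResult] = {}
    for result in results:
        # Assumption: reports arrive in order, so the last one per config wins.
        latest[result.config] = result

    statuses = {
        config: "pass" if result.passed else "fail"
        for config, result in sorted(latest.items())
    }
    failing = [config for config, status in statuses.items() if status == "fail"]
    return {
        "statuses": statuses,   # per-configuration list shown on the PR
        "failing": failing,     # configurations needing their champion's attention
        "all_passed": bool(statuses) and not failing,
    }


reports = [
    BuildResult("debian-jessie/rpi2/rt-preempt", True),
    BuildResult("debian-jessie/amd64/xenomai", False),
    BuildResult("debian-wheezy/amd64/rtai", True),
]
summary = aggregate(reports)
for config, status in summary["statuses"].items():
    print(f"{config}: {status}")
```

Keeping the per-configuration list alongside the overall flag means a single failing configuration is visible without anyone having to treat it as blocking.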
For each merge, the system builds binary packages and updates a package
repository. This repo may be published and advertised to other
community members interested in the same configuration.
In this way, anyone is able to set up a CI system for a particular
configuration. If many people do so, the burden of running a CI system
for many target configurations will be distributed across many community
members.
I'm so enamored with this idea partly because of how it fits with C4:
- Distributing the CI system across the community scales up the number
of configurations built and tested without overburdening any one
community member.
- No one person dictates what configurations are officially supported by
the project: anyone can contribute build results for any configuration,
and while Contributors and Maintainers will see when a PR breaks the
configuration, ultimately it's up to that configuration's champion to
work with the community to ensure it continues to build.
- Third-party stabilization forks become trivial to set up and publish
packages for; simply set up a new CI system and point it at the fork's
repo on GitHub. This especially suits vendors wanting to ship
Machinekit on their machine controller hardware.
As of now, I've implemented many parts that would go into this system,
but many other parts are missing, and it's not a top priority for me.
That puts this idea in the "gedankenexperiment" category, but I'm still
curious how ZeroMQ community members will react, assuming you've made it
this far into the mail!
An update since I started writing this: My Buildbot's ARM builder
broke, and I've decided not to resuscitate it. Some others in the project
have decided to set up a new CI system based around openSUSE's public
OBS instance. My earlier experiments with building Debian packages on
OBS were ugly, but it turns out it can be made to work. Building on OBS
satisfies some of my proposed requirements for a C4 project CI system,
especially that it is trivially reproducible. However, in practice, it
will take a lot of work to nail down the entire build flow, and the
question of running unit tests in various environments is unresolved.
John