Hi Blobb,
I'd like to probe your opinion on a discussion we had today about our jenkins. So far our setup has been manual, and we would like to (somewhat) automate the process of providing build dependencies on the slaves.
One solution that was discussed longer than others would be to use docker. Each of our repositories that needs a build would have its own Dockerfile, containing the complete setup of dependencies. The idea is that anyone can easily set up an identical build on any new jenkins build slave or even at home; no complex config of the jenkins build slave is needed.
The point being: if we adopt docker in such a way, it would be logical to make use of the docker cache to save unnecessary rebuilds. It is a generic solution instead of our artifact store.
I feel a bit bad for accepting your contributions, doing review and keeping you busy, just to then talk about docker to solve the problem instead; I appreciate your presence and would like to keep you involved.
Interestingly enough, we are experimenting with the artifact store on that one build job that has already been using docker for quite some time... (It was for the separate network space, not really for artifacts.)
In any case, I would like to include you in the discussion, and maybe you would also like to be involved in maturing the idea? Until now it is still wild and no-one has taken actual steps.
An example to follow would be laforge's recently added https://git.osmocom.org/docker-playground/tree/
One interesting bit is that it has a method to check whether a given git branch has changed, and rebuilds the docker image only if it has: https://git.osmocom.org/docker-playground/tree/osmo-ggsn-master/Dockerfile#n...
ADD http://git.osmocom.org/openggsn/patch/?h=laforge/osmo-ggsn /tmp/commit
This line fetches the given URL (in this case the latest patch on that branch) and considers the docker image as unchanged if that URL shows the same as last time. As soon as a new patch shows, things are rebuilt.
In this sense we could have docker images cascading on top of each other, adding individual dependencies and reusing identical states auto-detected by docker. All build steps would be in the Dockerfile.
For builds that aren't used by other builds (like the "final" programs, osmo-msc, osmo-sgsn, osmo-bsc, ...) we don't need to store the result, so we don't need to include the program's build in the Dockerfile: on a docker image with all dependencies, we run the final build step via 'docker run', like we currently do for the OpenBSC-gerrit job, and then just discard the changes.
Remotely related: we have the osmo-gsm-tester, which runs binaries produced by jenkins to do automated tests on real GSM hardware. Currently we compile and tar the binaries, copy them over, extract, set LD_LIBRARY_PATH and run: a bit tedious and problematic, e.g. for mismatching debian versions. Docker could simplify this by guaranteeing a fixed operating system around the binary, actually using hub.docker.com (or maybe one day a private docker hub) instead of copying over binary tars manually, sharing across any number of build slaves, and with the added bonus of having the resulting binaries run in a separate network space.
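Just to illustrate the idea, launching one binary on the tester could then be as simple as the following (image name, network and paths are hypothetical; the image would already contain the installed binaries):

  # fetch the image that jenkins pushed, instead of copying binary tars around
  docker pull osmocom/osmo-bsc:latest
  # run the binary with the tester's config file, in its own network namespace
  # and on a fixed OS, independent of the build slave's debian version
  docker run --rm --network osmo-testnet \
      -v /var/tester/osmo-bsc.cfg:/data/osmo-bsc.cfg \
      osmocom/osmo-bsc:latest osmo-bsc -c /data/osmo-bsc.cfg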
As I said, on the one hand I appreciate our work on the artifact store, on the other hand the docker way undeniably makes for a good overall solution to simplify things in general, with artifact re-use coming "for free"...
One advantage of the artifact store though is that the artifacts we manage are not entire debian installations but just a few libs and executables in a tiny tar.
What is your opinion?
~N
Hi Neels,
In any case, I would like to include you in the discussion, and maybe you would also like to be involved in maturing the idea?
Thanks, and sure! I have already read the mails on similar topics like lxc and Jenkins YAML jobs with excitement. I will comment on the latter soon.
This line fetches the given URL (in this case the latest patch on that branch) and considers the docker image as unchanged if that URL shows the same as last time. As soon as a new patch shows, things are rebuilt.
Great idea! So, the hourly/nightly jobs would "docker build..." instead of "docker run..."?
Will there be one Dockerfile per branch, or is it planned to use docker's "ARG" and "--build-arg" to pass the branch while building?
Furthermore, the nightly package of libosmocore-dev confuses me, especially when thinking about gerrit jobs. How often are these packages updated?
In this sense we could have docker images cascading on top of each other, adding individual dependencies and reusing identical states auto-detected by docker. All build steps would be in the Dockerfile.
Afaiu images will be rebuilt if a new patch is introduced. But who invokes the rebuild when the parent or, in the example, libosmocore-dev has changed?
Sharing the same layer for the "RUN apt-get install ..." command, as shown in the osmo-nitb-master and osmo-bts-master Dockerfiles, could be promising. But only if the above-mentioned rebuild mechanism is smart enough to build only one image first, so that the following ones reuse its layer.
In general I like the "move" towards docker compared to lxc, which does not provide something similar to a Dockerfile.
On the one hand the described (free) benefits sound really promising. On the other hand I am skeptical about the whole life cycle, which imo needs some external management, as described, to keep everything up to date. Additionally, every "docker run ..." command would need a "docker pull ..." beforehand to ensure the latest image from the repository is used.
I will definitely set up some build jobs on my Jenkins with those Docker images to get a better understanding.
P.S.: >> I feel a bit bad for accepting your contributions, doing review and keeping you busy
No worries at all! :)
Hi Andre,
On Wed, Sep 06, 2017 at 02:05:16PM +0200, André Boddenberg wrote:
This line fetches the given URL (in this case the latest patch on that branch) and considers the docker image as unchanged if that URL shows the same as last time. As soon as a new patch shows, things are rebuilt.
Great idea! So, the hourly/nightly jobs would "docker build..." instead of "docker run..."?
no, every 'docker run' job would be preceded by a 'docker build', which would either rebuild the image if needed (based on the "ADD" patch URL) or simply use the most recent image from the cache.
There would be one image/Dockerfile that builds libosmo* in it at 'docker build' time, and which then is used with 'docker run' to build the specific application for which you're doing build testing, e.g. osmo-bsc or osmo-hlr.
The build test at 'docker run' time would then happen completely inside the docker tmpfs overlay and is expected to run super quick, particularly given that build-2 has 64GB of RAM. If we want to keep some artefacts, then those would have to be copied (e.g. during "make install" to a bind-mount/volume).
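For concreteness, a rough sketch of those two steps as they could appear in a gerrit job (image name, project and paths are made up; this is not the final integration):

  # refresh the dependency image; a no-op cache hit unless one of the
  # dependencies gained a new commit (detected via the "ADD <patch URL>" trick)
  docker build -t osmo-gerrit-libosmo .

  # build entirely inside the tmpfs overlay; anything worth keeping must be
  # copied to the bind-mounted host directory before the container exits
  docker run --rm --tmpfs /tmpfs:exec \
      -v "$PWD/artifacts:/artifacts" \
      osmo-gerrit-libosmo \
      sh -ex -c 'cd /tmpfs &&
          git clone git://git.osmocom.org/osmo-bsc && cd osmo-bsc &&
          autoreconf -fi && ./configure && make -j8 && make check &&
          make install DESTDIR=/artifacts'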
Will there be one Dockerfile per branch, or is it planned to use docker's "ARG" and "--build-arg" to pass the branch while building?
In the basic form I would see only one dockerfile+image for libosmo* in master branch that can be used for all osmo-* application build testing.
The given project that you want to build-test would either a) not be part of that dockerfile/image but simply cloned at 'docker run' time. Disadvantage: Must be cloned from scratch.
b) a 'git clone' base version of the (or all?) application-under-test could be part of the Dockerfile/image, so that at 'docker run' time we simply do a git pull + checkout -f -B of the specific branch we want to build. Advantage: no need to clone from scratch at every build, only the delta between 'when image was built last' and the branch/patch-under-test needs to be fetched from the git server
for 'b', all git repos could be cloned into the base Dockerfile/image, or we could have one program-specific Dockerfile/image each. In the interest of simplicity, I would try to reduce the number of different Dockerfiles/images and simply have "all-in-one". The age of those "program under test" clones doesn't matter (so no "ADD http://cgit..."), as opposed to the freshness of the build dependencies, which must be ensured.
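For illustration, variant b) boils down to something like this inside the container at 'docker run' time (branch name and path are only placeholders):

  # the clone already exists in the image, so only the delta is fetched
  cd /osmo-bsc
  git fetch origin
  git checkout -f -B build origin/master      # or the gerrit ref under test
  git rev-parse HEAD                          # log what is actually being built
  autoreconf -fi && ./configure && make -j8 && make check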
Furthermore, the nightly package of libosmocore-dev confuses me, especially when thinking about gerrit jobs. How often are these packages updated?
The current Dockerfiles in docker-playground.git are built for executing test software. They're not meant for build testing, please don't confuse those two.
Afaiu images will be rebuilt if a new patch is introduced. But who invokes the rebuild when the parent or, in the example, libosmocore-dev has changed?
image[s] will only need to be rebuilt when a patch to the build dependencies is introduced to the master of such a dependency. They will not be updated by a patch to the project/repo/app-under-test, as that one is not part of the image.
The rebuild of the 'libosmo*' image is triggered by 'docker build' at the beginning of e.g. osmo-bsc-gerrit job.
Sharing the same layer for the "RUN apt-get install ..." command, as shown in the osmo-nitb-master and osmo-bts-master Dockerfiles, could be promising.
I think that's not really relevant to build-testing.
In general I like the "move" towards docker compared to lxc, which does not provide something similar to a Dockerfile.
Well, it provides templates. Similar, but of course different (and no layer cache, ...)
On the other hand I am skeptical about the whole life cycle, which imo needs some external management, as described, to keep everything up to date.
no, fully automatic.
Additionally, every "docker run ..." command would need a "docker pull ..." beforehand to ensure the latest image from the repository is used.
If you do a 'docker build' ahead of every 'docker run', I don't see that need.
I will definitely set up some build jobs on my Jenkins with those Docker images to get a better understanding.
Please don't use the Dockerfiles in docker-playground.git. As indicated, they are built for a completely different purpose and hence work differently.
Regards, Harald
Hi Andre,
On Wed, Sep 06, 2017 at 03:05:27PM +0200, Harald Welte wrote:
In the basic form I would see only one dockerfile+image for libosmo* in master branch that can be used for all osmo-* application build testing.
I've prototyped this at http://git.osmocom.org/docker-playground/tree/osmo-gerrit-libosmo/Dockerfile
If you're trying this, make sure you're running it using 'make run' or manually including the "--tmpfs /tmpfs:exec" arguments.
The VTY tests of course take ages, but the actual build of openbsc.git is pretty fast and executed on /tmpfs.
if the jenkins job for openbsc-gerrit would then do
* docker build ...
* docker run ...
Then the latest openbsc.git would be built, and all upstream dependencies would automatically be updated from latest master - but only if the latest master of those dependencies has changed (unlike the current osmo-deps.sh).
This is of course all just a prototype, and proper scripts/integration is needed.
Hi Harald,
thanks for the clarification! As said, I was wondering about the Dockerfiles, especially about the EXPOSE and CMD commands.
The osmo-gerrit-libosmo Dockerfile is great. A first hands-on showed that an openbsc verification build finishes in 5+ minutes and an osmo-msc build in a breathtaking 18s!!!
All my concerns about the full automation of using latest dependencies AND latest base image are gone.
This is of course all just a prototype, and proper scripts/integration is needed.
Would it be helpful to set up gerrit verification jobs for the "applications" mentioned in the Dockerfile? I am currently trying to set up a gerrit verification job with the YAML DSL (Jenkins Job Builder) and could combine both spikes.
Regards, André
Hi Andre,
On Wed, Sep 06, 2017 at 08:45:02PM +0200, André Boddenberg wrote:
The osmo-gerrit-libosmo Dockerfile is great. A first hands-on showed that an openbsc verification build finishes in 5+ minutes and an osmo-msc build in a breathtaking 18s!!!
This is good news. Happy you like it.
All my concerns about the full automation of using latest dependencies AND latest base image are gone.
Great.
After that much praise, there's also a downside:
* we currently install libosmo* to /usr/local using 'sudo'. There's actually no real reason for this; one could install into some other user-writable PREFIX and use that at compile ('docker run') time. Would be great to see some patches cleaning this up.
* we always have all dependencies installed, i.e. we're no longer trying to do builds with certain libraries not present. Let's say we want to do an osmo-msc --disable-smpp build: libsmpp34 will still be present, and we *could* introduce unnoticed bugs that would only show up once libsmpp34 is actually not present. One could either not worry too much about it, or one could do some more PKG_CONFIG_PATH hackery with a different PREFIX per library, so that each library is in a different PREFIX and a library is not found unless we explicitly pass the related paths to ./configure (see the sketch after this list).
I'm not sure if it's worth investing too much time into this, given that lots of conditionals go away in the split-nitb scenario: osmo-bsc always implicitly requires "--enable-osmo-bsc" and osmo-sgsn always implicitly requires libgtp. However, there's still the SMPP example.
What do others say? Is this important to test? If so, do we have volunteers to look into writing scripts for this?
* In terms of artefacts, we should figure out which ones we want to keep. For sure any kind of log files like config.log should be copied from the tmpfs to the workspace before we kill the container. They might contain useful information. One *might* also want to do a "make install" to the workspace? So to me config.log is a must, everything else is "optional, later, if somebody needs it". But then, Pau had some other opinion, AFAIR.
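To illustrate the per-library PREFIX idea from the second point above, a build against only a subset of the dependencies could look roughly like this (the prefix layout is purely hypothetical):

  # each library was installed into its own prefix at 'docker build' time,
  # e.g. /usr/local/osmo/libosmocore, /usr/local/osmo/libsmpp34, ...
  DEPS="libosmocore libosmo-abis libosmo-netif"     # note: no libsmpp34
  for d in $DEPS; do
      PKG_CONFIG_PATH="/usr/local/osmo/$d/lib/pkgconfig:$PKG_CONFIG_PATH"
      LD_LIBRARY_PATH="/usr/local/osmo/$d/lib:$LD_LIBRARY_PATH"
  done
  export PKG_CONFIG_PATH LD_LIBRARY_PATH
  # libsmpp34 is now genuinely not found, so --disable-smpp is actually exercised
  ./configure --disable-smpp && make -j8 && make check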
This is of course all just a prototype, and proper scripts/integration is needed.
Would it be helpful to set up gerrit verification jobs of mentioned "applications" in the Dockerfile.? I am currently trying to set up a gerrit verification job with YAML DSL (Jenkins Job Builder) and could combine both spikes.
That would be much appreciated. I think the biggest missing part is figuring out some helper scripts to easily 'run' that Docker container with the related arguments. I guess a given jenkins job then should only call that helper script and pass the "configure" arguments and some environment variables like our PARALLEL_MAKE? I wouldn't want to clutter the jenkins job definitions with repetitive long hand-crafted 'docker run' commands.
The next question is then where to store all of this. Given that this helper script as well as the Dockerfile is quite generic, it should probably go into osmo-ci. But then, the Dockerfile depends on stuff from docker-playground. We could merge the two or keep the Dockerfile in docker-playground and just put the helper script in osmo-ci? Actually, the part of the script that's running inside the container could be included in the image at 'docker build' time. Only the 'docker run' wrapper that's used to start a container is external.
In any case, from my point of view a given jenkins gerrit job should do:
* git fetch && git checkout -f -B master origin/master on osmo-ci/docker-playground, to make sure we catch any updates to those
* 'docker build' of the respective docker image
* 'docker run' by means of some helper script, using the respective arguments / build matrix options as required by the given job/project, as well as the exact git commit we want to test-build (instead of master)
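Expressed as a shell sketch (the helper script name, its calling convention and the env var handling are pure guesses at this point):

  # keep the tooling itself up to date
  for repo in osmo-ci docker-playground; do
      git -C "$repo" fetch origin
      git -C "$repo" checkout -f -B master origin/master
  done

  # refresh the dependency image (a cache hit if nothing changed upstream)
  docker build -t osmo-gerrit-libosmo docker-playground/osmo-gerrit-libosmo/

  # hand the project, the exact commit under test and the ./configure flags
  # to a (yet to be written) wrapper around 'docker run'
  PARALLEL_MAKE="-j 8" osmo-ci/scripts/docker-build-test.sh \
      --project osmo-bsc --commit "$GERRIT_PATCHSET_REVISION" \
      -- --enable-werror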
The current image should work for {openbsc,osmo-{bsc,bts,pcu,mgw,sgsn,ggsn,trx,hlr},openggsn}-gerrit jobs
For the library projects {libasn1c,libsmpp34,libosmo{core,-abis,-netif,-sccp}} and others which have some downstream build dependencies, we could also use the same docker base image. However, some additional concerns:
* when building e.g. libosmocore on a container that already has a system-wide installation of libosmocore, we could accidentally use include files from the system (/usr/local/include), rather than those of the current branch/commit that we're trying to build. One more reason not to install into /usr/local but into specific prefixes (see above) and then tell each given build which of the prefixes to use or not, depending on its build dependencies
* we might want to test-build (some of?) the downstream dependencies, as e.g. a commit in libosmocore might break osmo-bts.
Any help / work in the above areas (and anything I might have missed) is much appreciated. I won't have any more time to work on this, too many other topics going on :/
Regards, Harald
The one thing I would have done differently:
Why not have a series of Dockerfiles for each git, building onto the Dockerfile of its "next" dependency?
By having all gits in one joint image we "smudge" the build environment, or need to worry about cleaning things up first.
You said you would prefer having not so many images around, but what is the reason? I thought the images were incremental, and if one references the previous, there is only a small difference taking up space for it?
I'm thinking of a cascade like this:
debian
 \
 libosmocore
 |
 libosmo-abis
 |  +---------------- osmo-hlr
 |
 libosmo-netif
 |
 libosmo-sccp
 |
 osmo-iuh
 |  +---------------- osmo-ggsn
 |  |                   |
 |  |                 osmo-sgsn
 |
 osmo-mgw
 |
 libsmpp34
 |
 +---------------- osmo-bsc
 +---------------- osmo-msc
It's a compromise of least dependencies and avoiding dupes, resulting in 13 different stages.
The libosmocore job would update the libosmocore image, the libosmo-abis job builds on it to produce the libosmo-abis image, and so forth. The jenkins jobs would naturally re-use the other jobs' results and be clean every time.
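To sketch what I mean (image names invented, reusing the same cgit "ADD" trick as above for cache invalidation), each stage's Dockerfile would be tiny and do little more than:

  FROM osmocom/libosmocore
  # invalidates the cache exactly when a new commit appears on master
  ADD http://git.osmocom.org/libosmo-abis/patch/?h=master /tmp/commit
  RUN git clone git://git.osmocom.org/libosmo-abis && \
      cd libosmo-abis && autoreconf -fi && ./configure && \
      make -j8 && make check && make install && ldconfig

The libosmo-abis jenkins job would then just run 'docker build -t osmocom/libosmo-abis .' in that directory after every merge.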
The downside is to have to run ~8 different images for a core network (e.g. on the osmo-gsm-tester). But is docker managing it such that it doesn't take up 8 x the space and RAM, just one debian + 8 little increments?
On the osmo-gsm-tester we can then freely combine various versions of the different core network components by simply running a different image per binary. Would be great to not build separate tester images anymore.
Anyway, that was my first intuition. Maybe one joint image is more practical after all, but harder for me to imagine ATM.
~N
Hi Neels,
The libosmocore job would update the libosmocore image, the libosmo-abis job builds on it to produce the libosmo-abis image, and so forth. The jenkins jobs would naturally re-use the other jobs' results and be clean every time.
Of course this will work, but afaics it won't suit gerrit verifications, because docker always takes the latest available base image. So if the libosmocore image is currently being rebuilt and hasn't finished yet, a libosmo-abis docker build, which uses libosmocore as base image, wouldn't wait until the "new" libosmocore image is built; it would simply use the "old" one. That's why I really like the "one container" solution for gerrit verifications, which checks whether all dependencies are up to date by a "docker build" invocation.
Would be great to not build separate tester images anymore.
My knowledge about the gsm-tester is quite limited to its manual and I agree with the general idea of reusing things, but it might be better to have Docker images for precise purposes?
My spike with Harald's gerrit-verification image [1] and jenkins-job-builder [2] will probably finish next week (mo/tu).
BR, André
[1] https://git.osmocom.org/docker-playground/tree/osmo-gerrit-libosmo/Dockerfil...
[2] https://pypi.python.org/pypi/jenkins-job-builder/
Hi Neels,
On Thu, Sep 07, 2017 at 06:53:28AM +0200, Neels Hofmeyr wrote:
Why not have a series of Dockerfiles for each git, building onto the Dockerfile of its "next" dependency?
To avoid complexity and having to maintain too many Dockerfiles, related images, etc.
I think one of the beauties of the proposal is that we reduce the amount of things that need explicit maintenance.
You said you would prefer having not so many images around, but what is the reason? I thought the images were incremental, and if one references the previous, there is only a small difference taking up space for it?
Sure, space-wise it doesn't matter. I'm more thinking of having to maintain the Dockerfiles
I'm thinking of a cascade like this:
debian
 \
 libosmocore
 |
 libosmo-abis
 |  +---------------- osmo-hlr
this would mean you would have to
* docker build the libosmocore image to check/update to current master
* docker build the libosmo-abis image
* docker run the build for osmo-hlr
If this splits up to even more images, you will end up having something like 8 'docker build' followed by one 'docker run' in each gerrit job. I'm not sure how much of the performance gain we will lose that way.
It's a compromise of least dependencies and avoiding dupes, resulting in 13 different stages.
complexity, and manually having to re-trigger builds in the right inverse dependency order in every jenkins job.
The libosmocore job would update the libosmocore image, the libosmo-abis job builds on it to produce the libosmo-abis image, and so forth. The jenkins jobs would naturally re-use the other jobs' results and be clean every time.
The downside is to have to run ~8 different images for a core network (e.g. on the osmo-gsm-tester). But is docker managing it such that it doesn't take up 8 x the space and RAM, just one debian + 8 little increments?
I cannot comment on memory usage of running lots of images in parallel. As indicated, my proposal was related to gerrit builds, not related to images that can be used to execute the individual osmo-* components.
On the osmo-gsm-tester we can then freely combine various versions of the different core network components by simply running a different image per binary. Would be great to not build separate tester images anymore.
The same applies for the TTCN3 tests, for which I build using the current docker-playground. There, each element has a separate image.
Sure, we should aim for overlap where possible and have common ancestors between the Dockerfiles used for gerrit build testing and those for actual per-network-element-containers.
But in the end, I think those are fundamentally different. In terms of how often you build them, and in terms of whether you actually want to keep everything you built vs. building in tmpfs and discarding everything
On Thu, Sep 07, 2017 at 03:34:48PM +0200, André Boddenberg wrote:
My knowledge about the gsm-tester is quite limited to its manual and I
On the tester, all that we want to do is build (usually current master) and keep the binaries in the image, so that we can launch them with specific config files on another computer. That's all we need to know in this context.
Of course this will work, but afaics it won't suit gerrit verifications, because docker always takes the latest available base image. So if the libosmocore image is currently being rebuilt and hasn't finished yet, a libosmo-abis docker build, which uses libosmocore as base image, wouldn't wait until the "new" libosmocore image is built; it would
If a libosmo-abis patch starts building just before the latest merge to libosmo-core master has finished docker building, it doesn't matter much. The libosmo-abis patch usually does not depend on libosmocore work that is just being merged. If it does, the libosmo-abis patch submitter will have to wait for libosmocore to complete. This is the same way our current gerrit patch submission works, and "a law of nature". It's expected and not harmful.
On Thu, Sep 07, 2017 at 04:09:26PM +0200, Harald Welte wrote:
this would mean you would have to
- docker build the libosmocore image to check/update to current master
- docker build the libosmo-abis image
- docker run the build for osmo-hlr
* I expect the libosmocore-master jenkins job to docker build the libosmocore image whenever a patch was merged.
* The libosmo-abis image simply builds on the last stable libosmocore docker image it finds in the hub (what was the generic name for the hub again?).
* In turn osmo-hlr takes the last stable libosmo-abis image and just adds building osmo-hlr on top.
Each jenkins job takes exactly one 'FROM' image, builds exactly one git tree, stores exactly one state in the docker cache.
To be precise, the 'master' build jobs would store the built images in the docker hub thing, the gerrit build jobs just take the build rc and discard the image changes (could keep the result in the cache for a short time).
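Roughly like this (names invented, and assuming the Dockerfile declares an ARG for the ref under test):

  # master job: rebuild the image and publish it for the downstream jobs
  docker build -t osmocom/libosmocore libosmocore/
  docker push osmocom/libosmocore

  # gerrit job: same build against the patch under test, but only the exit
  # code is kept; the resulting layers merely stay in the local cache
  docker build --build-arg OSMO_REF="$GERRIT_REFSPEC" libosmocore/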
If this splits up to even more images, you will end up having something like 8 'docker build' followed by one 'docker run' in each gerrit job. I'm not sure how much of the performance gain we will lose that way.
IIUC, we win tremendously by only 'docker build'ing when something is merged to master. One goal for the osmo-gsm-tester is to anyway have docker images ready for each project's current master.
manually having to re-trigger builds in the right inverse dependency order in every jenkins job.
I don't see why that is required?
To avoid complexity and having to maintain too many Dockerfiles, related images, etc.
I accept that. But I don't have a clear picture in my mind of how it would look in practice with a joint Dockerfile:
So we have one Dockerfile like https://git.osmocom.org/docker-playground/tree/osmo-gerrit-libosmo/Dockerfil... which contains all osmo gits.
This file also actually updates from git and builds *all* the various libs when we update the image.
How does this translate to us wanting to e.g. have one jenkins job verifying libosmocore, one for libosmo-abis, [...], one for osmo-msc?
Each starts out with an image where the very project we want to check is already built and installed, so we need to actively remove installed files from /usr/local/{include,lib,bin} first, for only that project under scrutiny. We can't really rely on 'make uninstall' being correct all the time. How about using 'stow', which showed up recently? It allows wiping installs separately, right?
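If stow works the way I think it does, the per-project cleanup would be as simple as this (prefix layout just a guess):

  # at image build time: install each project into its own stow directory
  ./configure --prefix=/usr/local/stow/libosmocore
  make -j8 && make install
  cd /usr/local/stow && stow libosmocore     # symlink it into /usr/local

  # in the jenkins job: cleanly drop only the project under scrutiny
  cd /usr/local/stow && stow -D libosmocore  # remove its symlinks again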
I still see chicken-egg problems: when I run a libosmocore jenkins job, I want to update the image first.
*) That inherently already may build libosmocore. For a gerrit patch, it possibly builds libosmocore master, and I can later introduce the patch. If I want to test master though, updating the image already *will* build master, which I actually wanted to test; at what stage then do I detect a failure? If a 'docker build' failure already counts as failure, then:
*) What if e.g. libosmo-abis fails during image update: does it cause a failure to be counted for the libosmocore build job instead? I.e. does an unrelated broken libosmo-abis master cross-fire onto non-depending builds? How do we solve this?
And I see a possibility: say for every libosmocore patch we actually also build the whole chain and verify that this new libosmocore works with all depending projects. That way we would detect whether libosmocore breaks builds down the line in other projects, which we don't do on gerrit level yet.
OTOH then we can't easily introduce a change that needs patches in more than one repo; so far we e.g. first change libosmo-sccp (hence temporarily break the build on osmo-msc), then follow right up with a patch on osmo-msc. When we always verify all depending projects as well, we'll never get the first libosmo-sccp patch past the osmo-msc check. To solve this, we'd need to get both patches in at the same time.
We could parse the commit message 'Depends:' marker, which would actually be interesting to explore: We could have only one single process that is identical for *all* gerrit +V builds across all projects, and wherever a patch is submitted, it always rebuilds and verifies the whole ecosystem from that project on downwards to the projects that depend on it. By docker build arguments we can build with specific git hashes, e.g. one with a new patch, the others with master. A "Depends:" commit msg marker could take in N such git hashes at the same time. Non-trivial to implement, takes more build time (longest for libosmocore, shorter the farther we go towards leaf projects), but concept-wise quite appealing.
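A sketch of how such a marker could be picked up (the 'project=revision' syntax, the ARG naming and everything else here is of course just an assumption):

  # e.g. a commit message containing a line like:
  #   Depends: libosmo-sccp=3f9e8a1
  # everything not listed there defaults to master
  args=""
  for dep in $(git log -1 --format=%B | sed -n 's/^Depends: *//p'); do
      proj="$(printf '%s' "${dep%%=*}" | tr -- '-' '_')"   # libosmo-sccp -> libosmo_sccp
      args="$args --build-arg ${proj}_rev=${dep#*=}"
  done
  # the Dockerfile would declare one "ARG <proj>_rev=master" per dependency
  docker build $args .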
For the osmo-gsm-tester, it's actually ok to have only one image with all binaries ready in it (and launch it N times if we have to separate networks). Having separate images per each program is helpful to be able to quickly rebuild only one binary to a different git hash for debugging, but there are easy ways to do so also with one joint image.
So... my dream setup is one joint image and one build job for all projects, rebuilding all dependent projects always, using stow to safely clean up for re-building, and automatically building 'Depends:' marked patches together.
But it seems to me to be less trouble to manage N separate Dockerfiles that reference each other and are inherently clean every time. Is there something I'm missing?
~N