On 13. Sep 2017, at 07:41, Neels Hofmeyr nhofmeyr@sysmocom.de wrote:
Hi!
On Mon, Sep 11, 2017 at 09:09:06PM +0200, Harald Welte wrote:
In https://gerrit.osmocom.org/3899 which has failed in https://jenkins.osmocom.org/jenkins/job/OpenBSC-gerrit/2451/ and https://jenkins.osmocom.org/jenkins/job/OpenBSC-gerrit/2454/
This particular failure is due to a VTY change in libosmocore. I have fixed it in osmo-bsc.git, and this needs to be applied to openbsc.git as well. Change-Id: I77931d6a09c42c443c6936000592f22a7fd06cab
Great. So the VTY tests found a behavior change and did their job. I think disabling tests is a slippery slope. Let's assume we would run them daily and send emails. How likely is it that n failures in a row would trigger a request to disable the mail notifications?
We do run into some form of resource limitation and mitigated it by reducing the number of executors (but that number is up again). In the past the VTY test runner forgot to close sockets, but we were still running into something.
So is it either a form of kernel limit (and I couldn't find a MIB counting it) or something caused by a "slow" disk (as recently pointed out), leading to a slow start of the software under test?
I know Neels and others have already spent significant time in the past trying to resolve this - unsuccessfully.
That was the testBSCreload running into "Broken Pipe" errors.
If you see more of those, we may want to disable the testBSCreload:
Broken pipe still sounds like either we kill the TCP connection before we want to, or the remote process terminated. Could we dump core and check whether core files exist at the end of the test run (and check that dumping core in a container works)?
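A minimal sketch of what such a check could look like in a Python test runner. All helper names here are hypothetical and not part of the existing VTY tests. It also assumes that core files land in a known directory with a name starting with "core"; on Linux the actual name and location depend on kernel.core_pattern, so this would need adjusting for the build slaves.

```python
import glob
import os
import resource


def allow_core_dumps():
    """Raise the soft core-size limit to the hard limit.

    Child processes started afterwards inherit the limit.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


def find_core_files(directory):
    """Return files in `directory` whose name starts with 'core'.

    The 'core*' pattern is an assumption; check kernel.core_pattern.
    """
    return sorted(glob.glob(os.path.join(directory, "core*")))


def crashed_by_signal(returncode):
    """subprocess reports a child killed by signal N as returncode -N."""
    return returncode is not None and returncode < 0
```

The idea would be to call allow_core_dumps() before starting the software under test, and at the end of the run to fail loudly if find_core_files() returns anything or if the process exit status says it was killed by a signal. That would at least tell us whether "Broken Pipe" means the remote process died.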
holger