osmo-pcu tbf test asserts, all pcu dev blocked -- was: tbf test segfault

Neels Hofmeyr nhofmeyr at sysmocom.de
Thu Jan 12 12:23:46 UTC 2017

On Thu, Jan 12, 2017 at 11:56:32AM +0100, Max wrote:
> Subject: Re: tbf test segfault

It's a SIGABRT, not a segfault. SIGABRT is the intended result of a failed
assertion.

> Anyone else seeing this?

Yes, I can reproduce the same.

> That's odd cause it should have been caught by jenkins way before.

It is possible that two patches on gerrit pass on their own, but when merged
the combination of them causes a fault. That is due to the
"rebase-if-necessary" policy we're using on gerrit, not enforcing another build
verification when the master has moved on. In that case we can look at the
master branch verification build on jenkins.osmocom.org (the one without
"gerrit" in its name).

The first failure in our master verification job suggests that the
offending commit was

	commit b3df58660f6e965799b18b5b87892a3272c4ccf1
	Author:     Max <msuraev at sysmocom.de>
	    Log socket path on connection

which doesn't make sense to me, because that is a log message tweak.
Could the local.sun_path somehow cause stack corruption? The change is:

	--- a/src/osmobts_sock.cpp
	+++ b/src/osmobts_sock.cpp
	@@ -282,7 +282,8 @@ int pcu_l1if_open(void)
			return rc;

	-       LOGP(DL1IF, LOGL_NOTICE, "osmo-bts PCU socket has been connected\n");
	+       LOGP(DL1IF, LOGL_NOTICE, "osmo-bts PCU socket %s has been connected\n",
	+            local.sun_path);

		pcu_sock_state = state;

I suspect some other hidden issue that coincidentally shows its effect only
after this commit. Someone (TM) should fire asan and valgrind on it.

It appears osmo-pcu devel is now blocked until this issue is fixed.


