Ever since I upgraded to Debian 9, osmo-hnbgw has failed to link in my build:
/usr/bin/ld: hnbgw.o: undefined reference to symbol 'sctp_send@@VERS_1'
I noticed that adding -lsctp to the build fixed that, yet neither the build slaves nor anyone else seemed to need it.
Some events coincided with my re-install of Debian 9:
- There has always been an -lsctp in the Makefile.am for osmo-hnbgw, but it was recently dropped in http://git.osmocom.org/osmo-iuh/commit/?id=7235ea02d78299471d58f4202d8c8d685...
- I adopted the jenkins.sh -Werror flags in some builds to get fatal warnings, passed via './configure CFLAGS=-Werror'. Someone else added this in jenkins.sh and I assumed that was a good way to do things.
- I noticed that I couldn't see debug symbols in gdb, so I added 'CFLAGS=-g' in my builds. Read on...
Comparing with a docker build kindly provided by laforge, I noticed that a plain 'clone; ./configure; make' does actually succeed on my system! It is a tangle of CFLAGS that causes the error.
By passing a 'CFLAGS=foo' to the osmo-iuh configure script, not only do I add 'foo', but I override CFLAGS defaults, and actually *remove* other CFLAGS.
Turns out, whether I pass CFLAGS=-g or not, -g is always part of the build command lines: '-g -O2' is the default. Funnily enough, with CFLAGS=-g passed, I end up removing the -O2 option.
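To illustrate the override, here is a minimal shell sketch of the CFLAGS selection, assuming the simplified rule described above: a user-supplied CFLAGS is used verbatim, otherwise the default '-g -O2' applies. The function name effective_cflags is made up for illustration; the real logic lives in the generated configure script and has more branches.

```shell
#!/bin/sh
# Simplified model of how a configure script picks CFLAGS when the compiler
# is GCC. effective_cflags is a hypothetical helper, not part of autoconf.
effective_cflags() {
	if [ "$#" -gt 0 ]; then
		# The user passed CFLAGS=...: use it verbatim, no defaults merged in.
		printf '%s\n' "$1"
	else
		# Nothing passed: fall back to the default.
		printf '%s\n' "-g -O2"
	fi
}

echo "no CFLAGS passed: $(effective_cflags)"
echo "CFLAGS=-Werror:   $(effective_cflags -Werror)"
echo "CFLAGS=-g:        $(effective_cflags -g)"
```

In this model, passing -Werror loses both -g and -O2, and passing -g loses -O2, which matches what I saw in the build command lines.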
A conclusion here is that in jenkins.sh, we should *not* pass CFLAGS to configure, so that we don't override any default CFLAGS. A configure option like --with-strict-warnings can add to CFLAGS in configure.ac instead.
So now I guess because I added CFLAGS=-Werror before, I ended up removing -g. Hence I saw a need to add -g again, goof. Did so in all builds while I was at it, hence -g ended up in osmo-iuh as well.
The -g stuck around, and building osmo-iuh with CFLAGS=-g does, as I already said, effectively remove the -O2 option for all things built in osmo-iuh.
Now I'm confused. Why does a lack of -O2 break linkage?
It's not the osmo-hnbgw linkage per se that seems to be the cause. I can link that last step with or without -O2 successfully. It must be a combination of building a depending library without -O2 and then linking that to osmo-iuh.
I diffed the entire build log between a succeeding and a failing build of osmo-iuh, and really the only difference everywhere is that the succeeding build uses -O2 everywhere.
The only differences in config.log are the same: most conftest programs have -O2 removed, few have a -g added. The conftest conclusions are all identical.
My big question now is: sctp_send() is actually called from hnbgw.c directly. So IIUC, we should in fact be required to pass -lsctp to the linker for osmo-hnbgw. If that is correct, how am I allowed to skip that by passing -O2?
That's all I figured out so far. I still would like to find out why this can happen, any ideas would be appreciated.
Anyone should be able to reproduce this by building osmo-iuh with ./configure CFLAGS=-g
~N
On Tue, Nov 21, 2017 at 03:06:24AM +0100, Neels Hofmeyr wrote:
> Now I'm confused. Why does a lack of -O2 break linkage?
Because -O2 removes dead code; without it, the dead code stays in the object file.
> It's not the osmo-hnbgw linkage per se that seems to be the cause. I can link that last step with or without -O2 successfully. It must be a combination of building a depending library without -O2 and then linking that to osmo-iuh.
> I diffed the entire build log between a succeeding and a failing build of osmo-iuh, and really the only difference everywhere is that the succeeding build uses -O2 everywhere.
> The only differences in config.log are the same: most conftest programs have -O2 removed, few have a -g added. The conftest conclusions are all identical.
> My big question now is: sctp_send() is actually called from hnbgw.c directly.
This is the big surprise here, as this shouldn't have been the case ever since we introduced libosmo-sigtran.
> So IIUC, we should in fact be required to pass -lsctp to the linker for osmo-hnbgw. If that is correct, how am I allowed to skip that by passing -O2?
Because the call to sctp_send() is in dead code, specifically in a static function that the compiler will drop during optimization. Hence the offending symbol reference is gone at the time of linking if any -O is used.
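The effect can be reproduced outside osmo-iuh with a small sketch. It assumes a C compiler reachable as cc (or via $CC) plus nm; missing_send() and unused_tx() are hypothetical stand-ins for sctp_send() and the dead static function, not the actual hnbgw.c code.

```shell
#!/bin/sh
# Sketch: an unused static function can keep an undefined symbol reference
# alive in the object file unless the optimizer drops the function.
set -e
CC=${CC:-cc}
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT

cat > "$dir/demo.c" <<'EOF'
int missing_send(int fd);       /* declared here, defined nowhere */

static int unused_tx(int fd)    /* never called: dead code */
{
	return missing_send(fd);
}

int main(void)
{
	return 0;
}
EOF

# Print how many undefined references to missing_send the object file
# carries when compiled with the given optimization flag.
count_undefined() {
	"$CC" "$1" -c "$dir/demo.c" -o "$dir/demo.o"
	nm -u "$dir/demo.o" | grep -c missing_send || true
}

echo "-O0: $(count_undefined -O0) undefined reference(s)"
echo "-O2: $(count_undefined -O2) undefined reference(s)"
```

An object that still lists missing_send as undefined needs a library providing it at link time; one that does not will link without it. Going by this thread, GCC keeps the reference without optimization and drops it with -O2. Other compilers may discard the unused function at every level.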
In I8d52b11e3f476ffd77f3ab185b679817cd3b2163 I introduce -Wall and then in the subsequent Ifbcb21d43e17bf512bc7b219e590410e06c434ca I remove the dead function.
There's also an interesting catch in I9dbad21e75a55ad91b12d3d3ee8bd6dfb5326c3e and plenty of warning fixes in c7a158211fa59ac27c70faa0813a27e2f3a7a569.
On Tue, Nov 21, 2017 at 08:19:06AM +0100, Harald Welte wrote:
> In I8d52b11e3f476ffd77f3ab185b679817cd3b2163 I introduce the -Wall and then in subsequent Ifbcb21d43e17bf512bc7b219e590410e06c434ca I remove [the dead function].
Yes, now things are starting to make sense. Thanks for this catch!
> There's also an interesting catch in I9dbad21e75a55ad91b12d3d3ee8bd6dfb5326c3e and plenty of warnings fixes in c7a158211fa59ac27c70faa0813a27e2f3a7a569
Can't find c7a158211fa59ac27c70faa0813a27e2f3a7a569, is it I516700eab2aa7c3412dd62775c4960aed9d4b682?
~N