Hi,
Most of the openbsc packages are now up-to-date in Debian!
https://qa.debian.org/developer.php?email=Debian-mobcom-maintainers%40lists....
However, there are a few failing builds on some officially supported Debian architectures (mostly failing test suites):
osmo-bsc:
- mips (https://buildd.debian.org/status/fetch.php?pkg=osmo-bsc&arch=mips&ve...) ("gsm0408" and many "handover" tests failing)
- s390x (https://buildd.debian.org/status/fetch.php?pkg=osmo-bsc&arch=s390x&v...) ("gsm0408" and many "handover" tests failing)
osmo-iuh:
- mips (https://buildd.debian.org/status/fetch.php?pkg=osmo-iuh&arch=mips&ve...) ("helpers" and "ranap" tests failing)
- s390x (https://buildd.debian.org/status/fetch.php?pkg=osmo-iuh&arch=s390x&v...) ("helpers" and "ranap" tests failing)
osmo-msc:
- mips64el (https://buildd.debian.org/status/fetch.php?pkg=osmo-msc&arch=mips64el&am...) ("msc_vlr_test_gsm_ciph" test failing)
- mipsel (https://buildd.debian.org/status/fetch.php?pkg=osmo-msc&arch=mipsel&...) ("msc_vlr_test_gsm_ciph" test failing)
osmo-pcu:
- mips (https://buildd.debian.org/status/fetch.php?pkg=osmo-pcu&arch=mips&ve...) (error: #error "Only little endian headers are supported yet. TODO: add missing structs")
- s390x (https://buildd.debian.org/status/fetch.php?pkg=osmo-pcu&arch=s390x&v...) (error: #error "Only little endian headers are supported yet. TODO: add missing structs")
Maybe some of you can immediately see what is wrong, and what needs to be fixed?
Best regards Ruben
Hi Ruben,
> Most of the openbsc packages are now up-to-date in Debian! : https://qa.debian.org/developer.php?email=Debian-mobcom-maintainers%40lists....
Nice!
I'd like to mention a detail, recently I tried to build osmo-iuh on a freshly installed Debian 9 machine, and it gets a segfault in gcc. Do you hit this when building packages?
Thanks!
~N
On Mon, Nov 12, 2018 at 10:35:28PM +0100, Ruben Undheim wrote:
> Hi,
> , but there are a few failing builds on some officially supported Debian architectures (mostly failing test suites):
> osmo-bsc:
> - mips (https://buildd.debian.org/status/fetch.php?pkg=osmo-bsc&arch=mips&ve...) ("gsm0408" and many "handover" tests failing)
I rebuilt the repositories matching osmo-bsc version 1.2.1 as indicated here; it was a bit hard until I disabled Werror as well as the address sanitizer.
Comparing the output...
Quite early up in the test, a measurement report is simulated, and on mips, it is interpreted as two neighbors, while there should be only one neighbor.
Should be:
- Sending measurement report from mobile #0 (rxlev=30, rxqual=6)
 * Neighbor cell #0, actual BTS 1 (rxlev=40)
DRSL abis_rsl.c:1538 (bts=0,trx=0,ts=1,ss=0): meas_rep_count++=1 meas_rep_last_seen_nr=0
DMEAS abis_rsl.c:1409 [(bts=0,trx=0,ts=1,ss=0)] MEASUREMENT RESULT NR=0 RXL-FULL-ul=-110dBm RXL-SUB-ul=-110dBm RXQ-FULL-ul=0 RXQ-SUB-ul=0 BS_POWER=0 L1_MS_PWR=-22dBm L1_FPC=0 L1_TA=0 RXL-FULL-dl=-80dBm RXL-SUB-dl=-80dBm RXQ-FULL-dl=6 RXQ-SUB-dl=6 NUM_NEIGH=1
DMEAS abis_rsl.c:1442 IDX=0 ARFCN=871 BSIC=63 => -70 dBm
DHODEC handover_decision_2.c:1131 (lchan 0.010 TCH/F) (subscr unknown) MEASUREMENT REPORT (1 neighbors)
DHODEC handover_decision_2.c:1136 (lchan 0.010 TCH/F) (subscr unknown) 0: arfcn=871 bsic=63 neigh_idx=0 rxlev=40 flags=0
DHODEC handover_decision_2.c:291 (lchan 0.010 TCH/F) (subscr unknown) neigh 871 new in report rxlev=40 last_seen_nr=0
mips gets:
- Sending measurement report from mobile #0 (rxlev=30, rxqual=6)
 * Neighbor cell #0, actual BTS 1 (rxlev=40)
DRSL abis_rsl.c:1538 (bts=0,trx=0,ts=1,ss=0): meas_rep_count++=1 meas_rep_last_seen_nr=0
DMEAS abis_rsl.c:1409 [(bts=0,trx=0,ts=1,ss=0)] MEASUREMENT RESULT NR=0 RXL-FULL-ul=-110dBm RXL-SUB-ul=-110dBm RXQ-FULL-ul=0 RXQ-SUB-ul=0 BS_POWER=0 L1_MS_PWR=-22dBm L1_FPC=0 L1_TA=0 DTXu NOT VALID NUM_NEIGH=2
DMEAS abis_rsl.c:1442 IDX=28 ARFCN=0 BSIC=0 => -77 dBm
DMEAS abis_rsl.c:1442 IDX=0 ARFCN=871 BSIC=0 => -96 dBm
DHODEC handover_decision_2.c:1131 (lchan 0.010 TCH/F) (subscr unknown) MEASUREMENT REPORT (2 neighbors)
DHODEC handover_decision_2.c:1136 (lchan 0.010 TCH/F) (subscr unknown) 0: arfcn=0 bsic=0 neigh_idx=28 rxlev=33 flags=0
DHODEC handover_decision_2.c:1136 (lchan 0.010 TCH/F) (subscr unknown) 1: arfcn=871 bsic=0 neigh_idx=0 rxlev=14 flags=0
DHODEC handover_decision_2.c:291 (lchan 0.010 TCH/F) (subscr unknown) neigh 0 new in report rxlev=33 last_seen_nr=0
DHODEC handover_decision_2.c:291 (lchan 0.010 TCH/F) (subscr unknown) neigh 871 new in report rxlev=14 last_seen_nr=0
On mips the report is misinterpreted with wildly wrong values: it starts with "DTXu NOT VALID" and continues with wrong bsic and rxlev values... I wonder why that happens.
Either handover_test.c encodes it non-portably, or the decoding has a problem.
handover_test.c encodes a measurement report using struct gsm48_meas_res, which has sub-byte ints. It is known that these need #ifdef shims to swap the positions for big-endian vs. little-endian, which this struct does *not* have.
This mips is big endian, right? Then that would be the issue.
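For illustration, here is a minimal sketch of why sub-byte fields are endian-sensitive. The struct demo_octet and all three functions are made up for this example and are not libosmocore code; the bit-field ordering mentioned in the comments is what GCC usually does, not something the C standard guarantees.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical one-octet struct, not taken from libosmocore.
 * Intended wire format: 'hi' in bits 7..4, 'lo' in bits 3..0. */
struct demo_octet {
	uint8_t lo:4,	/* declared first: GCC usually puts it in the low bits on little endian */
		hi:4;
} __attribute__((packed));

/* Non-portable: copies the compiler's bit-field layout to the wire as-is. */
static uint8_t encode_bitfield(uint8_t hi, uint8_t lo)
{
	struct demo_octet o = { .lo = lo, .hi = hi };
	uint8_t wire;

	assert(hi <= 0x0f && lo <= 0x0f);
	memcpy(&wire, &o, sizeof(wire));
	return wire;
}

/* Portable: builds the octet with shifts, independent of host endianness. */
static uint8_t encode_shift(uint8_t hi, uint8_t lo)
{
	assert(hi <= 0x0f && lo <= 0x0f);
	return (uint8_t)((hi << 4) | lo);
}

/* Returns 1 if the bit-field layout happens to match the wire format on this host. */
static int bitfield_matches_wire(void)
{
	return encode_bitfield(0xA, 0x5) == encode_shift(0xA, 0x5);
}
```

On a little-endian GCC target both encoders would be expected to agree; on a big-endian target encode_bitfield() would likely place the nibbles the other way round unless the struct carries a second, reversed field list, which is the kind of mix-up the mips log above suggests.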
Unfortunately I can't test on mips, is there any way that we can easily test a patch on such an arch?
Also, if we fix this on current master, then that doesn't help osmo-bsc v 1.2.1 using libosmocore 0.11.0. We don't have a backporting process at Osmocom. Yet. Maybe it will just be broken until we package more recent versions?
There are numerous other packed structs in libosmocore/include/osmocom/gsm/protocol/gsm_04_08.h with sub-byte ints that lack the BIG_ENDIAN shims.
https://osmocom.org/issues/3693
I don't have a fix yet, let's discuss how a fix would help debian builds as asked above...
And whether we really need to support big endian...?
~N
Hi.
I don't think we have tested this on a BE architecture yet. Could you create tickets via https://osmocom.org/projects/osmobsc/issues/new, and likewise for the corresponding projects? At least some of those don't look like they could be fixed quickly.
Out of curiosity - do you use real hw to run those builds or it's some sort of vm/emulator?
Hi,
The best solution would be to build for MIPS or any other big-endian architecture on our OBS server, so we can catch this kind of issue during nightly builds.
Unfortunately, as far as I can see, there's no MIPS architecture support in https://build.opensuse.org.
Hi.
On a related note: I see several patches from Debian devs - for example https://sources.debian.org/patches/libosmo-sccp/0.10.0-2/
Are there any plans to get those upstreamed?
Hi,
As a side note, we are planning to make a new release of most components soon, since there have been a lot of fixes and new features since the last release. Just letting you know.
Regards,
Hi,
> I'd like to mention a detail, recently I tried to build osmo-iuh on a freshly installed Debian 9 machine, and it gets a segfault in gcc. Do you hit this when building packages?
segfault in gcc really!? I have not seen it, but I have only built it in Debian 10.
> This mips is big endian, right? Then that would be the issue.
Yes, it is probably the main issue. Assuming big-endianness is the problem for all failures on mips and s390x, we are left with the failures for osmo-msc on mips64el and mipsel (the little-endian variants of mips). So the test "msc_vlr_test_gsm_ciph" is failing. Does this tell you anything?
> Also, if we fix this on current master, then that doesn't help osmo-bsc v 1.2.1 using libosmocore 0.11.0. We don't have a backporting process at Osmocom. Yet. Maybe it will just be broken until we package more recent versions?
This is unproblematic in the Debian context. As long as we know what the fix is, we can backport the fix wherever we want.
Maybe the trick is to just start going over all structs in libosmocore and upwards, and see if the problem disappears on big-endian architectures. I can start on it, when I find some time for it.
> And whether we really need to support big endian...?
Well, nobody has to support big-endian. But building on other archs is a nice test for robustness of the code and the test suite. :)
> Unfortunately I can't test on mips, is there any way that we can easily test a patch on such an arch?
If you do not have real hardware or any machines to login to, I think your best option is to try qemu. It should have pretty good support for these things, although it is not as fast as real hardware.
> Out of curiosity - do you use real hw to run those builds or it's some sort of vm/emulator?
These are the official Debian build machines, and I am pretty sure they are real hw. There are also Debian porter boxes I can SSH into to test on real hardware.
> On a related note: I see several patches from Debian devs - for example https://sources.debian.org/patches/libosmo-sccp/0.10.0-2/
Interesting. I did not know about this way of viewing the patches, although I am behind most of the patches. Please just pull back whatever you find useful.
Best regards Ruben
> Maybe the trick is to just start going over all structs in libosmocore and upwards, and see if the problem disappears on big-endian architectures. I can start on it, when I find some time for it.
I've started here: https://salsa.debian.org/debian-mobcom-team/libosmocore/blob/master/debian/p...
(to prevent duplicated work)
Ruben
On Wed, Nov 14, 2018 at 07:06:22PM +0100, Ruben Undheim wrote:
> segfault in gcc really!? I have not seen it, but I have only built it in Debian 10.
yes, in osmo-iuh.
> we are left with the failures for osmo-msc on mips64el and mipsel (the little-endian variants of mips). So the test "msc_vlr_test_gsm_ciph" is failing. Does this tell you anything?
Interesting, the only difference is:
-- ERROR sending ciphering mode command: rc=-95
+- ERROR sending ciphering mode command: rc=-122
i.e. a mismatching rc gets returned. It's not an error with any practical effect.
Aha, could it simply be that the errno are defined differently on this platform? It should be -95 == -ENOTSUP, while we get -122.
Weirdly enough, I don't see this line printed at all in my current msc_vlr_test_gsm_ciph output. Ah, I know: A5/3 was fixed in 3117b701c8d4645215896c459d6c608358a0a51b, hence we see no error anymore.
There has been no release after that yet.
If you want to stay with that old revision, try branch neels/mipsel of osmo-msc, which I've just pushed, with this patch: http://git.osmocom.org/osmo-msc/commit/?h=neels/mipsel&id=d655d10e98c390...
Or otherwise wait for Pau's release and rather use those latest revisions.
> As long as we know what the fix is, we can backport the fix wherever we want.
sure, I forgot the deb packaged patches.
> Maybe the trick is to just start going over all structs in libosmocore
I thought about scripting it, because editing manually takes forever and is error prone.
> Interesting. I did not know about this way of viewing the patches, although I am behind most of the patches. Please just pull back whatever you find useful.
Ideally, the patch author goes on to submit it on gerrit.osmocom.org :)
~N
Hi,
> -- ERROR sending ciphering mode command: rc=-95
> +- ERROR sending ciphering mode command: rc=-122
> i.e. a mismatching rc gets returned. It's not an error with any practical effect.
> Aha, could it simply be that the errno are defined differently on this platform? It should be -95 == -ENOTSUP, while we get -122.
I will have a look. Thanks for the hint. :D
> > Interesting. I did not know about this way of viewing the patches, although I am behind most of the patches. Please just pull back whatever you find useful.
> Ideally, the patch author goes on to submit it on gerrit.osmocom.org :)
Yes, ideally! I have actually submitted a patch to you guys on gerrit once before [1]. However, I am interacting with so many different upstreams, and everyone has their own way of doing things. Therefore I prefer to just write an email or file an issue, and you can pick whatever you want. Since you deal with gerrit regularly, that will be much more efficient. (I only have github and salsa in my fingers..) Many patches are not relevant for upstream either. Only when making real bug-fixes (like now for big-endian archs) does it make sense for me to spend time getting them into gerrit myself (and waste time learning how to do it again :D ).
Best regards Ruben
[1] http://lists.osmocom.org/pipermail/openbsc/2016-May/thread.html#9132
> -- ERROR sending ciphering mode command: rc=-95
> +- ERROR sending ciphering mode command: rc=-122
> i.e. a mismatching rc gets returned. It's not an error with any practical effect.
> Aha, could it simply be that the errno are defined differently on this platform? It should be -95 == -ENOTSUP, while we get -122.
Correct guess!
In /usr/include/mipsel-linux-gnu/asm/errno.h, EOPNOTSUPP is set to 122, while in /usr/include/asm-generic/errno.h (used on amd64) it is set to 95.
I uploaded a fix to Debian: https://salsa.debian.org/debian-mobcom-team/osmo-msc/raw/master/debian/patch...
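A small sketch of the pitfall, assuming a Linux libc. send_ciph_mode_cmd() and rc_name() are hypothetical helpers invented for this example, not osmo-msc functions, and this is not necessarily how the Debian patch addresses it. The point is only that the symbolic constant is stable across architectures while its numeric value is not.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the function under test, not osmo-msc code. */
static int send_ciph_mode_cmd(int cipher_supported)
{
	return cipher_supported ? 0 : -EOPNOTSUPP;
}

/* Map a return code to a symbolic name, so that logged text does not
 * depend on the numeric errno value (95 in asm-generic, 122 on mipsel). */
static const char *rc_name(int rc)
{
	assert(rc <= 0);
	if (rc == 0)
		return "OK";
	if (rc == -EOPNOTSUPP)
		return "-EOPNOTSUPP";
	if (rc == -EINVAL)
		return "-EINVAL";
	return "-E?";
}
```

Comparing against -EOPNOTSUPP instead of a literal -95, and logging rc_name(rc) instead of the raw number, would be one way to keep expected test output identical on every architecture.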
Ruben
On Thu, Nov 15, 2018 at 04:23:55PM +0100, Neels Hofmeyr wrote:
> > Maybe the trick is to just start going over all structs in libosmocore
> I thought about scripting it, because editing manually takes forever and is error prone.
I actually hacked up one of those borderline-insane py scripts to mangle our C code into big-endian-reversed structs.
See https://gerrit.osmocom.org/#/c/libosmocore/+/11786 for the script and https://gerrit.osmocom.org/#/c/libosmocore/+/11787 for what it did to libosmocore
The script implementation itself is mad and convoluted, of course. How could it be different when handling and mangling C code.
It works by looking at all packed structs that have sub-byte ints, and composes a big-endian counterpart. If there already is a big-endian ifdef, it strips it away and takes the little-endian part as authoritative.
Non-packed structs are ignored, because we sometimes have weird boolean fields in the form of 'int flag:1;' which don't add up to full bytes.
So, we can have auto-generated big-endian structs, and we can also verify in gerrit that they are correct (by running the script over the files and erroring if there are any local modifications; not implemented yet).
But first we get one-off cosmetic changes, to reach a state where running the struct_endianess.py script over the code results in zero changes.
A simple example of what the script does is
diff --git a/include/osmocom/gprs/protocol/gsm_04_60.h b/include/osmocom/gprs/protocol/gsm_04_60.h
index 96e9ab78..5d5fca9a 100644
--- a/include/osmocom/gprs/protocol/gsm_04_60.h
+++ b/include/osmocom/gprs/protocol/gsm_04_60.h
@@ -7,10 +7,12 @@
 #pragma once

 #include <stdint.h>
+#include <osmocom/core/endian.h>

 #if OSMO_IS_LITTLE_ENDIAN == 1
 /* TS 04.60 10.3a.4.1.1 */
 struct gprs_rlc_ul_header_egprs_1 {
+#if OSMO_IS_LITTLE_ENDIAN
 	uint8_t r:1,
 	 si:1,
 	 cv:4,
@@ -26,10 +28,20 @@ struct gprs_rlc_ul_header_egprs_1 {
 	 spare_hi:1;
 	uint8_t spare_lo:6,
 	 dummy:2;
+#elif OSMO_IS_BIG_ENDIAN
+/* auto-generated from the little endian part above (libosmocore/contrib/struct_endianess.py) */
+	uint8_t tfi_hi:2, cv:4, si:1, r:1;
+	uint8_t bsn1_hi:5, tfi_lo:3;
+	uint8_t bsn2_hi:2, bsn1_lo:6;
+	uint8_t bsn2_lo:8;
+	uint8_t spare_hi:1, pi:1, rsb:1, cps:5;
+	uint8_t dummy:2, spare_lo:6;
+#endif
 } __attribute__ ((packed));
A more complex example is:
Currently struct amr_hdr looks like this, manual big-endian stuff:
struct amr_hdr {
#if OSMO_IS_BIG_ENDIAN
	/* Payload Header */
	uint8_t cmr:4,	/* Codec Mode Request */
		pad1:4;
	/* Table of Contents */
	uint8_t f:1,	/* followed by another speech frame? */
		ft:4,	/* coding mode */
		q:1,	/* OK (not damaged) at origin? */
		pad2:2;
#elif OSMO_IS_LITTLE_ENDIAN
	/* Payload Header */
	uint8_t pad1:4,
		cmr:4;
	/* Table of Contents */
	uint8_t pad2:2,
		q:1,	/* OK (not damaged) at origin? */
		ft:4,	/* coding mode */
		f:1;	/* followed by another speech frame? */
#endif
} __attribute__((packed));
After the script has done its mangling, it will look like this -- the little endian part is unchanged, and the big endian stuff is completely autogenerated:
struct amr_hdr {
#if OSMO_IS_LITTLE_ENDIAN
	/* Payload Header */
	uint8_t pad1:4,
		cmr:4;
	/* Table of Contents */
	uint8_t pad2:2,
		q:1,	/* OK (not damaged) at origin? */
		ft:4,	/* coding mode */
		f:1;	/* followed by another speech frame? */
#elif OSMO_IS_BIG_ENDIAN
/* auto-generated from the little endian part above (libosmocore/contrib/struct_endianness.py) */
	uint8_t cmr:4, pad1:4;
	uint8_t f:1, ft:4, q:1, pad2:2;
#endif
} __attribute__((packed));
Notice that little endian is on top now, the big-endian part features a comment mentioning the script, and has different formatting.
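The per-octet reversal visible in both examples can be sketched in a few lines of C. This is only an illustration of the idea, not the algorithm of the Python script (which has to parse C source); struct bitfield and reverse_fields_per_octet() are names made up here.

```c
#include <assert.h>
#include <stddef.h>

/* One bit-field declaration: name and width in bits (hypothetical helper type). */
struct bitfield {
	const char *name;
	unsigned bits;
};

/* Reverse the declaration order of the fields inside each octet, in place.
 * Returns 0 on success, or -1 if the widths do not add up to whole octets
 * (in that case the array may already be partially modified). */
static int reverse_fields_per_octet(struct bitfield *f, size_t n)
{
	size_t start = 0;
	unsigned acc = 0;

	assert(f != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		acc += f[i].bits;
		if (acc > 8)
			return -1;	/* field straddles an octet boundary */
		if (acc == 8) {
			/* swap f[start..i] end for end */
			for (size_t a = start, b = i; a < b; a++, b--) {
				struct bitfield tmp = f[a];
				f[a] = f[b];
				f[b] = tmp;
			}
			start = i + 1;
			acc = 0;
		}
	}
	return acc == 0 ? 0 : -1;
}
```

Fed with the little-endian field list of struct amr_hdr (pad1:4, cmr:4, pad2:2, q:1, ft:4, f:1), this yields cmr, pad1, f, ft, q, pad2, which matches the big-endian part shown above.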
I intuitively added the script in libosmocore/contrib/ and not in osmo-ci, ymmv. I still think libosmocore-specific struct mangling belongs in libosmocore, and not in osmo-ci. (The value string verification script is in osmo-ci, but a programmer wanting to auto-generate a big-endian struct before submitting a patch would prefer to have the script there already, without having to download osmo-ci first).
I have pushed a lot of neels/big_endian branches, the libosmocore one is on gerrit already, let's see what the review gets me before submitting the others as well:
http://git.osmocom.org/libosmocore/commit/?h=neels/big_endian&id=8670bd0...
http://git.osmocom.org/libosmo-netif/commit/?h=neels/big_endian&id=3c210...
http://git.osmocom.org/libosmo-sccp/commit/?h=neels/big_endian&id=6470e9...
http://git.osmocom.org/libosmo-abis/commit/?h=neels/big_endian&id=ca3e6b...
http://git.osmocom.org/osmo-pcu/commit/?h=neels/big_endian&id=7a99dcaeff...
http://git.osmocom.org/osmo-sgsn/commit/?h=neels/big_endian&id=f16202863...
~N
> I actually hacked up one of those borderline-insane py scripts to mangle our C code into big-endian-reversed structs.
> See https://gerrit.osmocom.org/#/c/libosmocore/+/11786 for the script and https://gerrit.osmocom.org/#/c/libosmocore/+/11787 for what it did to libosmocore
> The script implementation itself is mad and convoluted, of course. How could it be different when handling and mangling C code.
Awesome, Neels! This can become a very useful tool.
After all structs are fixed, we also have a few places with "ntohs" and its friends to fix.
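As a generic illustration of the byte-order conversions meant here (not code from any Osmocom repository; both function names are made up), a 16-bit network-order value can be read from a buffer in two host-independent ways:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a big-endian (network order) 16-bit value using ntohs(). */
static uint16_t read_be16_ntohs(const uint8_t *buf)
{
	uint16_t v;

	assert(buf != NULL);
	memcpy(&v, buf, sizeof(v));	/* avoids unaligned access */
	return ntohs(v);		/* a no-op on big-endian hosts */
}

/* Same result using shifts only, without any host-order intermediate. */
static uint16_t read_be16_shift(const uint8_t *buf)
{
	assert(buf != NULL);
	return (uint16_t)((buf[0] << 8) | buf[1]);
}
```

Typical problems are a missing conversion, or a conversion applied twice; either one goes unnoticed on a big-endian host where ntohs() does nothing, or shows up only there, depending on which side the test vectors were written for.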
I've gotten quite a lot of test suites to pass now for several of the packages. See here: https://buildd.debian.org/status/package.php?p=Debian-mobcom-maintainers%40l...
However, in some cases I am unsure if I have fixed the problem or if I have just hidden the problem...
For instance, this one:
- https://sources.debian.org/patches/libosmocore/0.12.1-2/0006-Fix-some-byte-o...
It makes the test pass.
Even more hacky, and probably wrong, is this one:
- https://browse.dgit.debian.org/osmo-bsc.git/commit/?id=656937f42ab08ee206ddf...
(it also makes the test pass)
Could you please have a look at these two patches and see if they just hide the problem, and if they do, what the correct fix is?
Best regards Ruben
Cool, this kind of script will indeed be useful, especially when used in jenkins.sh.
It would be nice to do the following check too:
* If a struct has BIG_ENDIAN/LITTLE_ENDIAN ifdefs related to bit-fields, verify that it has the "packed" attribute. Why would you care about bit-fields otherwise, if you are not planning to send them over the network? And if you do send them over the network, you definitely don't want padding.
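To show what such a check would guard against, here is a minimal sketch with two made-up header structs (not from any Osmocom project). It assumes GCC or clang, since __attribute__((packed)) is a compiler extension, and the unpacked size depends on the ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical header without the attribute: the compiler may pad before 'len'. */
struct hdr_unpacked {
	uint8_t type:4,
		flags:4;
	uint16_t len;
};

/* Same fields, packed: exactly 1 + 2 octets, as they would go over the wire. */
struct hdr_packed {
	uint8_t type:4,
		flags:4;
	uint16_t len;
} __attribute__((packed));

/* Compile-time guard: fails the build if padding sneaks into the wire struct. */
_Static_assert(sizeof(struct hdr_packed) == 3, "wire struct must not contain padding");

/* Runtime variant of the same check, e.g. for use in a unit test. */
static void check_wire_sizes(void)
{
	assert(sizeof(struct hdr_packed) == 3);
}
```

On common ABIs the unpacked variant is 4 octets because 'len' gets aligned to 2, so copying it to the wire would send one octet of padding.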
Hi.
12.11.18 22:35, Ruben Undheim wrote:
> Hi,
> Most of the openbsc packages are now up-to-date in Debian! : https://qa.debian.org/developer.php?email=Debian-mobcom-maintainers%40lists....
On a somewhat related note, here's how the .deb version compares with other distributions:
https://repology.org/metapackage/libosmocore/versions
Looks pretty good, although there's still a long road ahead to Osmocom world domination over the majority of distributions :)
Hi Ruben,
thanks for pointing this out.
It has been known to us for quite some time that the Osmocom GSM/3G code doesn't work on big endian systems. In the 10 year history of the project, I'm not aware of any user of our software on such systems, or of anyone ever having requested support and/or contributed related patches.
As our [paid] development resources are still very limited compared to the vast scope of the 3GPP specs, protocols, network elements, etc. it has never been a priority, including now.
In the Osmocom project we are most happy to review and merge any related fixes! But in terms of sysmocom dedicating paid engineering time to fixing any of those: it's unlikely to happen, sorry.
I still agree with other responders on this thread that it's useful to document known bugs in our bug tracker, though.
Regards, Harald