I have a reproducible segfault in the SGSN, which bisects to the commit below.
(To bisect, I rearranged the commits; see branches neels/sndcp_bisect and
neels/sndcp_bisect_bad in openbsc.)
▶ git bisect bad
97991d56800fdc913e6fdf95cac68d598f66b498 is the first bad commit
commit 97991d56800fdc913e6fdf95cac68d598f66b498
Author: Philipp <pmaier(a)sysmocom.de>
Date: Fri Aug 26 17:00:21 2016 +0200
SNDCP: add RFC1144 header compression functionality
- Add module to handle compression entities
- Add module to control header compression
- Introduce VTY commands for heade compression configuration
- Add changes in sndcp and llc to integrate header compression
Change-Id: Ia00260dc09978844c2865957b4d43000b78b5e43
:040000 040000 de76aa0ab5dc11ad81666f9f4c933544eedcd4f1
c199c47cd19b5a1a334020a598dba3e0be922fe5 M openbsc
To reproduce: have a 3G UE try to establish a PDP context (basically just let it
subscribe to the network with mobile data enabled). Yes, this is on the
sysmocom/iu branch.
Note: the N-DATA.ind log line shows the correct hexdump and data length, but the
backtrace shows that, two lines further down, a function is passed the values
(ctx=0x6cf9d0, data=0x0, len=140737316187968) at ranap_common_cn.c:310
So it looks like stack corruption caused somewhere completely different. It's
very reproducible though: even after rearranging things with other library
states, I always get the segfault in exactly the same place:
20160927195152039 <001a> iu.c:755 N-DATA.ind(0, 60 00 00 1d 00 00 01 00 34 40 16 00
00 01 00 33 40 0f 60 28 dc 35 00 01 0a 09 01 0b 00 00 00 00 01 , len 33)
20160927195152039 <001a> iu.c:757 1msg 0x6d07a8 len 33
20160927195152039 <001a> iu.c:767 2msg 0x6d07a8 len 33
<RANAP_RAB-SetupOrModifiedList>
<raB-SetupOrModifiedList-ies>
<RANAP_IE>
<id>51</id>
<criticality><ignore/></criticality>
<value>60 28 DC 35 00 01 0A 09 01 0B 00 00 00 00 01</value>
</RANAP_IE>
</raB-SetupOrModifiedList-ies>
</RANAP_RAB-SetupOrModifiedList>
20160927195152039 <001a> iu.c:460 handle_co(dir=4, proc=0)
20160927195152039 <001a> iu.c:433 RAB Asignment
Response:<RANAP_RAB-SetupOrModifiedItem>
<rAB-ID>
00000101
</rAB-ID>
<transportLayerAddress>
00110101000000000000000100001010000010010000000100001011
</transportLayerAddress>
<iuTransportAssociation>
<gTP-TEI>00 00 00 01</gTP-TEI>
</iuTransportAssociation>
</RANAP_RAB-SetupOrModifiedItem>
Setup: (5/35 00 01 0a 09 01 0b )20160927195152039 <001a> sgsn_libgtp.c:483 Updating
TEID on RNC side from 0x00000001 to 0x00000001
20160927195152039 <000f> gprs_gmm.c:2031 PDP(901990000000038/0) <- ACTIVATE PDP
CONTEXT ACK
20160927195152039 <001a> iu.c:398 Transmitting L3 Message as RANAP DT (SUA link
0x6ce280 conn_id 0)
<RANAP_IE>
<id>16</id>
<criticality><ignore/></criticality>
<value>
31 8A 42 03 0E 23 62 1F 72 99 3F 3F 11 43 FF FF
00 00 00 00 2B 06 01 21 0A 17 2A 02 27 14 80 80
21 10 02 00 00 10 81 06 08 08 08 08 83 06 00 00
00 00
</value>
</RANAP_IE>
<RANAP_IE>
<id>59</id>
<criticality><ignore/></criticality>
<value>00</value>
</RANAP_IE>
20160927195152039 <001b> sua.c:591 Received SCCP User Primitive (N-DATArequest)
20160927195152039 <001b> sua.c:245 sua_link_send(01 00 08 08 00 00 00 60 00 06 00 08
00 00 00 00 01 05 00 08 00 00 03 e8 01 0b 00 46 00 14 00 3e 00 00 02 00 10 40 32 31 8a 42
03 0e 23 62 1f 72 99 3f 3f 11 43 ff ff 00 00 00 00 2b 06 01 21 0a 17 2a 02 27 14 80 80 21
10 02 00 00 10 81 06 08 08 08 08 83 06 00 00 00 00 00 3b 40 01 00 00 00 )
Program received signal SIGSEGV, Segmentation fault.
sndcp_sn_xid_req (lle=0x340, nsapi=5 '\005') at gprs_sndcp.c:926
926 gprs_sndcp_comp_free(lle->llme->comp.proto);
(gdb) bt
#0 sndcp_sn_xid_req (lle=0x340, nsapi=5 '\005') at gprs_sndcp.c:926
#1 0x0000000000415681 in send_act_pdp_cont_acc (pctx=0x6d0290)
at sgsn_libgtp.c:346
#2 0x0000000000416ba6 in sgsn_ranap_rab_ass_resp (ctx=0x340, setup_ies=0x0)
at sgsn_libgtp.c:492
#3 0x0000000000422e28 in ranap_handle_co_rab_ass_resp (ies=<optimized out>,
ies=<optimized out>, ctx=<optimized out>) at iu.c:445
#4 cn_ranap_handle_co (ctx=0x6cf9d0, message=0x0) at iu.c:510
#5 0x00007ffff5909f60 in ranap_cn_rx_co (cb=0x422890 <cn_ranap_handle_co>,
ctx=0x6cf9d0, data=0x0, len=140737316187968) at ranap_common_cn.c:310
#6 0x0000000000423d1c in sccp_sap_up (oph=0x6d06a8, link=0x6ce280) at iu.c:768
#7 0x00007ffff5be6c7f in sua_rx_codt (xua=<optimized out>,
link=<optimized out>) at sua.c:1164
#8 sua_rx_co (msg=<optimized out>, xua=<optimized out>, link=<optimized out>)
at sua.c:1196
#9 sua_rx_msg (link=0x0, msg=0x5) at sua.c:1232
#10 0x00007ffff5be7042 in sua_srv_conn_cb (conn=0x6cf300) at sua.c:1360
#11 0x00007ffff4c4c881 in osmo_stream_srv_read (conn=0x6ce1b0) at stream.c:512
#12 osmo_stream_srv_cb (ofd=<optimized out>, what=1) at stream.c:563
#13 0x00007ffff79bab82 in osmo_fd_disp_fds (_eset=0x7fffffffe360,
_wset=0x7fffffffe2e0, _rset=0x7fffffffe260) at select.c:149
#14 osmo_select_main (polling=0) at select.c:191
#15 0x0000000000404ee2 in main (argc=3, argv=0x0) at sgsn_main.c:448
(gdb)
How do we handle this? Revert Philipp's commits for now?
I need to go now, but next I'll take a quick look at whether I can reproduce
this with other hardware or spot a bug anywhere. No idea yet whether it only
happens for 3G.
~Neels
--
- Neels Hofmeyr <nhofmeyr(a)sysmocom.de>