Hi all,
I have a Osmocom setup running with a Ettus B200
I noticed that since several weeks that osmo-pcu did not work as good as previously.
The latest "good" revision I am aware of is commit d87e1d6ab747423d3668c74d16201a5d967accf0 (2015/12/14)
I tried again today with commit 2fcfc29020c81891d7888ddc7ddbcd866bcd406d (2016/05/24) and not a single handset could establish data communication
Reading osmo-pcu logs show that T6169 timeouts during TBF assignment :
Fri May 27 15:51:38 2016 DRLCMAC <0002> bts.cpp:479 MS requests UL TBF on RACH, so we provide one: Fri May 27 15:51:38 2016 DRLCMAC <0002> tbf.cpp:672 ********** TBF starts here ********** Fri May 27 15:51:38 2016 DRLCMAC <0002> tbf.cpp:674 Allocating UL TBF: MS_CLASS=0/0 Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_ms.cpp:114 Creating MS object, TLLI = 0x00000000 Fri May 27 15:51:38 2016 DRLCMAC <0002> bts.cpp:407 Searching for first unallocated TFI: TRX=0 Fri May 27 15:51:38 2016 DRLCMAC <0002> bts.cpp:417 Found TFI=0. Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_rlcmac_ts_alloc.cpp:525 Slot Allocation (Algorithm B) for unknown class (assuming 12) Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_rlcmac_ts_alloc.cpp:560 - Rx=4 Tx=4 Sum Rx+Tx=5 Tta=2 Ttb=1 Tra=2 Trb=1 Type=1 Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_rlcmac_ts_alloc.cpp:157 - Skipping TS 0, because not enabled Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_rlcmac_ts_alloc.cpp:579 - Possible DL/UL slots: (TS=0)".CCCCCCC"(TS=7) Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_rlcmac_ts_alloc.cpp:940 - Selected UL slots: (TS=0)"...U...."(TS=7), single Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_rlcmac_ts_alloc.cpp:966 Using single slot at TS 3 for UL Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_rlcmac_ts_alloc.cpp:990 - Reserved DL/UL slots: (TS=0)"...C...."(TS=7) Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_rlcmac_ts_alloc.cpp:1017 - Assigning UL TS 3 Fri May 27 15:51:38 2016 DRLCMAC <0002> bts.cpp:1481 PDCH(TS 3, TRX 0): Attaching TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=NULL), 1 TBFs, USFs = 01, TFIs = 00000001. Fri May 27 15:51:38 2016 DRLCMAC <0002> tbf.cpp:385 - Setting Control TS 3 Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_ms.cpp:267 Attaching TBF to MS object, TLLI = 0x00000000, TBF = TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=NULL) Fri May 27 15:51:38 2016 DRLCMAC <0002> tbf.cpp:625 Allocated TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=NULL): trx = 0, ul_slots = 08, dl_slots = 00 Fri May 27 15:51:38 2016 DRLCMAC <0002> ./tbf.h:291 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=NULL) changes state from NULL to FLOW Fri May 27 15:51:38 2016 DRLCMAC <0002> tbf.cpp:409 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=FLOW) starting timer 3169. Fri May 27 15:51:38 2016 DRLCMAC <0002> bts.cpp:523 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=FLOW) [UPLINK] START Fri May 27 15:51:38 2016 DRLCMAC <0002> bts.cpp:527 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=FLOW) RX: [PCU <- BTS] RACH qbit-ta=0 ra=0x7b, Fn=1607859 (28,33,19) Fri May 27 15:51:38 2016 DRLCMAC <0002> bts.cpp:529 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=FLOW) TX: START Immediate Assignment Uplink (AGCH) Fri May 27 15:51:38 2016 DRLCMAC <0002> bts.cpp:542 - TRX=0 (128) TS=3 TA=0 TSC=3 TFI=0 USF=0 Fri May 27 15:51:38 2016 DRLCMAC <0002> ./tbf.h:291 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=FLOW) changes state from FLOW to WAIT ASSIGN Fri May 27 15:51:43 2016 DRLCMAC <0002> tbf.cpp:819 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=WAIT ASSIGN) timer 3169 expired. Fri May 27 15:51:43 2016 DRLCMAC <0002> tbf.cpp:874 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=WAIT ASSIGN) T3169 timeout during transsmission Fri May 27 15:51:43 2016 DRLCMAC <0002> tbf.cpp:893 - Assignment was on CCCH Fri May 27 15:51:43 2016 DRLCMAC <0002> tbf.cpp:899 - No uplink data received yet Fri May 27 15:51:43 2016 DRLCMAC <0002> tbf.cpp:879 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=WAIT ASSIGN) will be freed due to timeout Fri May 27 15:51:43 2016 DRLCMAC <0002> tbf.cpp:334 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=WAIT ASSIGN) free
Every element of the stack is running with today's revision on master (cheers for making it happen without specific branches by the way).
With the same setup with osmo-pcu@d87e1d6ab747423d3668c74d16201a5d967accf0, I get solid data rates around 40 Kbits/s.
I haven't seen anybody on the mailing list experiencing the same behavior, making me believe that it could be specific to osmo-bts-trx / osmo-trx users.
Can somebody confirm or reproduce this attitude ?
I will try to provide more logs and a pcap capture as soon as I get hands back on the hardware.
On 27 May 2016, at 18:02, Pierre Baudry pierre.baudry@diateam.net wrote:
Hi!
The latest "good" revision I am aware of is commit d87e1d6ab747423d3668c74d16201a5d967accf0 (2015/12/14)
I tried again today with commit 2fcfc29020c81891d7888ddc7ddbcd866bcd406d (2016/05/24) and not a single handset could establish data communication
Can somebody confirm or reproduce this attitude ?
I will try to provide more logs and a pcap capture as soon as I get hands back on the hardware.
Can you bisect it?
git bisect start git bisect bad 2fcfc29020c81891d7888ddc7ddbcd866bcd406d git bisect good d87e1d6ab747423d3668c74d16201a5d967accf0 Bisecting: 77 revisions left to test after this (roughly 6 steps)
(separate commits should compile, so the above should be feasible). The commit range you have includes the move to EDGE that has changed GPRS code as well.
holger
On 27/05/2016 20:17, Holger Freyther wrote:
Can you bisect it?
git bisect start git bisect bad 2fcfc29020c81891d7888ddc7ddbcd866bcd406d git bisect good d87e1d6ab747423d3668c74d16201a5d967accf0 Bisecting: 77 revisions left to test after this (roughly 6 steps)
(separate commits should compile, so the above should be feasible). The commit range you have includes the move to EDGE that has changed GPRS code as well.
holger
Sure, here is the bisect log :
git bisect start # bad: [2fcfc29020c81891d7888ddc7ddbcd866bcd406d] add comments to describe functions git bisect bad 2fcfc29020c81891d7888ddc7ddbcd866bcd406d # good: [d87e1d6ab747423d3668c74d16201a5d967accf0] rlc: Do not raise_v_q in receive_bsn git bisect good d87e1d6ab747423d3668c74d16201a5d967accf0 # good: [0df80be95e3604656ff36024f793ef3c36455051] rlc: Add decode_gsm_ra_cap to decode Radio Access Caps git bisect good 0df80be95e3604656ff36024f793ef3c36455051 # bad: [7f28c97fcc87d2ce773a2ae91579a84b40d12539] edge: Support all coding schemes for BSSGP flow control git bisect bad 7f28c97fcc87d2ce773a2ae91579a84b40d12539 # good: [18831c3ca94b8cfcdb64a883d88d22404d7c28bd] encoding: Refactor write_immediate_assignment git bisect good 18831c3ca94b8cfcdb64a883d88d22404d7c28bd # bad: [215e18c9d45cdaa43705ae7c8f8cb43c0db28225] cs: Add GprsCodingScheme::optionalPaddingBits git bisect bad 215e18c9d45cdaa43705ae7c8f8cb43c0db28225 # good: [7c72acaa941fd7f3663b0f9b36fe30f4974f1979] ms: Add current_pacch_slots method git bisect good 7c72acaa941fd7f3663b0f9b36fe30f4974f1979 # bad: [9e8593917f3e301c3487f73430ce416a08360ce8] rlc: Support encoding of EGPRS header type 1 + 2 git bisect bad 9e8593917f3e301c3487f73430ce416a08360ce8 # bad: [f1a7b8fc6651f92a8b7f3f27b7ca05d07f4e44e0] tbf: Add state WAIT_ASSIGN git bisect bad f1a7b8fc6651f92a8b7f3f27b7ca05d07f4e44e0 # first bad commit: [f1a7b8fc6651f92a8b7f3f27b7ca05d07f4e44e0] tbf: Add state WAIT_ASSIGN
I experienced other service disruptions while bisecting this, but let's address this specific bug first :)
On 30 May 2016, at 10:49, Pierre Baudry pierre.baudry@diateam.net wrote:
# first bad commit: [f1a7b8fc6651f92a8b7f3f27b7ca05d07f4e44e0] tbf: Add state WAIT_ASSIGN
I experienced other service disruptions while bisecting this, but let's address this specific bug first :)
argh. Could you go to f1a7b8fc6651f92a8b7f3f27b7ca05d07f4e44e0 and revert it and then try again? Does it work better?
holger
On 30/05/2016 21:34, Holger Freyther wrote:
argh. Could you go to f1a7b8fc6651f92a8b7f3f27b7ca05d07f4e44e0 and revert it and then try again? Does it work better?
holger
This specific problem is gone (I can see data flows and even some packets) but data service is still practically unusable, I get a lot of "Scheduling Ack/Nack, because MS is stalled." messages.
On 02/06/2016 10:47, Max wrote:
Hi.
Could you please make a ticket in http://projects.osmocom.org/ to make sure this won't get lost in ML archives?
Yes, I just wanted to make sure that I wasn't the only one experiencing this problem.
I am now bisecting d87e1d6ab747423d3668c74d16201a5d967accf0..f1a7b8fc6651f92a8b7f3f27b7ca05d07f4e44e0 commits to identify which commit introduced these reliability problems
Dear Pierre,
I am interested in this topic, I am abroad at the moment, but in a weeks time I will be able to test it.
To point out a similar case, I had almost the same issue with OpenBTS and its "own" transceiver, which was also created by Tom Tsou but it was not the Osmo-TRX.
The behavior was pretty much the same: after the IMM_ASS the UE is sending the Packet Resource Request, but the BTS never receives it and the T3168 timeout happens. It was kinda hard to reproduce it, and we never found the root cause.There was some suspition that the USRP SDR is somehow misses to detect the PRR, as nothing was sent to the upper layers. Here you can find detailed protocol traces (wireshark and QXDM) and description of what we found:
https://github.com/RangeNetworks/openbts/issues/8
Regards, Csaba
----- Eredeti üzenet ----- Feladó: "Pierre Baudry" pierre.baudry@diateam.net Címzett: openbsc@lists.osmocom.org Elküldött üzenetek: Péntek, 2016. Május 27. 18:02:28 Tárgy: Regression in osmo-pcu with UHD trx ?
Hi all,
I have a Osmocom setup running with a Ettus B200
I noticed that since several weeks that osmo-pcu did not work as good as previously.
The latest "good" revision I am aware of is commit d87e1d6ab747423d3668c74d16201a5d967accf0 (2015/12/14)
I tried again today with commit 2fcfc29020c81891d7888ddc7ddbcd866bcd406d (2016/05/24) and not a single handset could establish data communication
Reading osmo-pcu logs show that T6169 timeouts during TBF assignment :
Fri May 27 15:51:38 2016 DRLCMAC <0002> bts.cpp:479 MS requests UL TBF on RACH, so we provide one: Fri May 27 15:51:38 2016 DRLCMAC <0002> tbf.cpp:672 ********** TBF starts here ********** Fri May 27 15:51:38 2016 DRLCMAC <0002> tbf.cpp:674 Allocating UL TBF: MS_CLASS=0/0 Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_ms.cpp:114 Creating MS object, TLLI = 0x00000000 Fri May 27 15:51:38 2016 DRLCMAC <0002> bts.cpp:407 Searching for first unallocated TFI: TRX=0 Fri May 27 15:51:38 2016 DRLCMAC <0002> bts.cpp:417 Found TFI=0. Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_rlcmac_ts_alloc.cpp:525 Slot Allocation (Algorithm B) for unknown class (assuming 12) Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_rlcmac_ts_alloc.cpp:560 - Rx=4 Tx=4 Sum Rx+Tx=5 Tta=2 Ttb=1 Tra=2 Trb=1 Type=1 Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_rlcmac_ts_alloc.cpp:157 - Skipping TS 0, because not enabled Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_rlcmac_ts_alloc.cpp:579 - Possible DL/UL slots: (TS=0)".CCCCCCC"(TS=7) Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_rlcmac_ts_alloc.cpp:940 - Selected UL slots: (TS=0)"...U...."(TS=7), single Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_rlcmac_ts_alloc.cpp:966 Using single slot at TS 3 for UL Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_rlcmac_ts_alloc.cpp:990 - Reserved DL/UL slots: (TS=0)"...C...."(TS=7) Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_rlcmac_ts_alloc.cpp:1017 - Assigning UL TS 3 Fri May 27 15:51:38 2016 DRLCMAC <0002> bts.cpp:1481 PDCH(TS 3, TRX 0): Attaching TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=NULL), 1 TBFs, USFs = 01, TFIs = 00000001. Fri May 27 15:51:38 2016 DRLCMAC <0002> tbf.cpp:385 - Setting Control TS 3 Fri May 27 15:51:38 2016 DRLCMAC <0002> gprs_ms.cpp:267 Attaching TBF to MS object, TLLI = 0x00000000, TBF = TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=NULL) Fri May 27 15:51:38 2016 DRLCMAC <0002> tbf.cpp:625 Allocated TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=NULL): trx = 0, ul_slots = 08, dl_slots = 00 Fri May 27 15:51:38 2016 DRLCMAC <0002> ./tbf.h:291 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=NULL) changes state from NULL to FLOW Fri May 27 15:51:38 2016 DRLCMAC <0002> tbf.cpp:409 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=FLOW) starting timer 3169. Fri May 27 15:51:38 2016 DRLCMAC <0002> bts.cpp:523 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=FLOW) [UPLINK] START Fri May 27 15:51:38 2016 DRLCMAC <0002> bts.cpp:527 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=FLOW) RX: [PCU <- BTS] RACH qbit-ta=0 ra=0x7b, Fn=1607859 (28,33,19) Fri May 27 15:51:38 2016 DRLCMAC <0002> bts.cpp:529 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=FLOW) TX: START Immediate Assignment Uplink (AGCH) Fri May 27 15:51:38 2016 DRLCMAC <0002> bts.cpp:542 - TRX=0 (128) TS=3 TA=0 TSC=3 TFI=0 USF=0 Fri May 27 15:51:38 2016 DRLCMAC <0002> ./tbf.h:291 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=FLOW) changes state from FLOW to WAIT ASSIGN Fri May 27 15:51:43 2016 DRLCMAC <0002> tbf.cpp:819 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=WAIT ASSIGN) timer 3169 expired. Fri May 27 15:51:43 2016 DRLCMAC <0002> tbf.cpp:874 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=WAIT ASSIGN) T3169 timeout during transsmission Fri May 27 15:51:43 2016 DRLCMAC <0002> tbf.cpp:893 - Assignment was on CCCH Fri May 27 15:51:43 2016 DRLCMAC <0002> tbf.cpp:899 - No uplink data received yet Fri May 27 15:51:43 2016 DRLCMAC <0002> tbf.cpp:879 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=WAIT ASSIGN) will be freed due to timeout Fri May 27 15:51:43 2016 DRLCMAC <0002> tbf.cpp:334 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=WAIT ASSIGN) free
Every element of the stack is running with today's revision on master (cheers for making it happen without specific branches by the way).
With the same setup with osmo-pcu@d87e1d6ab747423d3668c74d16201a5d967accf0, I get solid data rates around 40 Kbits/s.
I haven't seen anybody on the mailing list experiencing the same behavior, making me believe that it could be specific to osmo-bts-trx / osmo-trx users.
Can somebody confirm or reproduce this attitude ?
I will try to provide more logs and a pcap capture as soon as I get hands back on the hardware.
Hi.
Could you please make a ticket in http://projects.osmocom.org/ to make sure this won't get lost in ML archives?