From hfreyther at sysmocom.de Fri Dec 6 09:26:02 2013
From: hfreyther at sysmocom.de (Holger Hans Peter Freyther)
Date: Fri, 6 Dec 2013 10:26:02 +0100
Subject: Current list of failures in the pcu
Message-ID: <20131206092602.GL31478@xiaoyu.lan>

Good Morning,

we at sysmocom continued to add structure/architecture to the PCU code
and started to systematically test the code (using something as simple
as ICMP ping/reply).

Observations:

* "alloc algorithm b" has only slightly more bandwidth (3 kbit/s) than
  just using the single slot assignment with CS4. There are also some
  open Coverity issues about this multi slot code.

* Congestion is badly handled. When using a ping that creates IP
  fragments the LLC queue of the TBF gets so long that we expire half
  the IP fragments before sending them. This means the MS on the
  downlink never receives all the IP fragments of a PING. We played
  with the BSVC-FLOW-CONTROL parameters but they don't have any
  influence here. We have added counting of the queue size and
  calculation of the average queue delay to the LLC class. One option
  is to drop packets based on the PDU lifetime and the current queue
  delay.

  $ sudo ping -s 1800 -i 0.1 10.23.42.6

  (CS4, alloc-algorithm a) is enough to re-produce this issue.

* RLC V(Q) V(R) handling. It appears that after and before the
  refactoring the Received Block Bitmap (RBB) is wrongly encoded. The
  encode and decode are not compatible to each other (thanks to the
  architecture sysmocom is adding to the code this can now be unit
  tested!).

  So as soon as we don't receive a frame from the UL. The send window
  at the phone and the PCU is going out of sync.

  After making changes to the window we have also witnessed that on
  a fully received window V(Q) and V(R) were WindowSize away from
  each other.

  <0005> tbf.cpp:1579 UL DATA TFI=0 received (V(Q)=55 .. V(R)=60)
  <0005> tbf.cpp:1624 - BSN 60 storing in window (55..118)
  <0005> rlc.cpp:173 - Raising V(R) to 61
  <0005> tbf.cpp:1690 - Scheduling Ack/Nack, because 20 frames received.
  <0005> encoding.cpp:374 Encoding Ack/Nack for TBF(TFI=0 TLLI=0xfa45049d DIR=UL) (final=0) SSN=61
  <0005> encoding.cpp:408 - V(N): "RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR" R=Received N=Not-Received

  This part might be something we broke but we will try to re-produce
  it with a unit test.

We will continue to work on unit-testing the RLC window handling. Is
anyone outside sysmocom interested in getting a more stable PCU?

holger

--
- Holger Freyther http://www.sysmocom.de/
=======================================================================
* sysmocom - systems for mobile communications GmbH
* Schivelbeiner Str. 5
* 10439 Berlin, Germany
* Sitz / Registered office: Berlin, HRB 134158 B
* Geschaeftsfuehrer / Managing Directors: Holger Freyther, Harald Welte

From andreas at eversberg.eu Fri Dec 6 12:35:09 2013
From: andreas at eversberg.eu (Andreas Eversberg)
Date: Fri, 06 Dec 2013 13:35:09 +0100
Subject: Current list of failures in the pcu
In-Reply-To: <20131206092602.GL31478@xiaoyu.lan>
References: <20131206092602.GL31478@xiaoyu.lan>
Message-ID: <52A1C47D.4010202@eversberg.eu>

> * RLC V(Q) V(R) handling. It appears that after and before the
> refactoring the Received Block Bitmap (RBB) is wrongly encoded. The
> encode and decode are not compatible to each other (thanks to the
> architecture sysmocom is adding to the code this can now be unit
> tested!).

dear holger,

is it already fixed? how can i test it? running "make check" does not
cause any failure. (i use zecke/features/clean-up branch)

> So as soon as we don't receive a frame from the UL. The send window
> at the phone and the PCU is going out of sync.
>
> After making changes to the window we have also witnessed that on
> a fully received window V(Q) and V(R) were WindowSize away from
> each other.
>
> <0005> tbf.cpp:1579 UL DATA TFI=0 received (V(Q)=55 .. V(R)=60)

this means that our receive array contains blocks from 55 to 59...
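
For readers following the log, here is a minimal sketch of how V(Q) and
V(R) are expected to move in an uplink receive window. It is not the
osmo-pcu code: the struct UlWindowSketch, its members and the constants
kSns and kWs are hypothetical names, and the values 128 and 64 are
assumed GPRS defaults (they are consistent with the "window (55..118)"
line in the log). V(R) follows the highest received BSN, while V(Q)
only advances over blocks that have actually been received.

```cpp
#include <bitset>
#include <cassert>

// Simplified, hypothetical model of an RLC uplink receive window.
// It is not the osmo-pcu gprs_rlc_ul_window class.
constexpr unsigned kSns = 128; // sequence number space (assumed GPRS value)
constexpr unsigned kWs = 64;   // window size (assumed GPRS value)

struct UlWindowSketch {
    unsigned v_q = 0;           // lowest BSN not yet received
    unsigned v_r = 0;           // highest received BSN + 1 (mod kSns)
    std::bitset<kSns> received; // per-BSN 'R' flag, a stand-in for V(N)

    // Distance from b up to a, modulo the sequence number space.
    static unsigned dist(unsigned a, unsigned b)
    {
        return (a + kSns - b) % kSns;
    }

    // Store one block. Returns false if bsn is outside [V(Q), V(Q)+WS).
    bool receive(unsigned bsn)
    {
        assert(bsn < kSns);
        if (dist(bsn, v_q) >= kWs)
            return false;
        received[bsn] = true;
        // Raise V(R) when the block is at or beyond the current V(R).
        if (dist(bsn, v_q) >= dist(v_r, v_q))
            v_r = (bsn + 1) % kSns;
        // Raise V(Q) only while the block at V(Q) has been received.
        while (v_q != v_r && received[v_q]) {
            received[v_q] = false; // this BSN leaves the window
            v_q = (v_q + 1) % kSns;
        }
        return true;
    }
};
```

In this model a state of V(Q)=55 and V(R)=60 can only arise while
BSN 55 is still missing, and V(Q) stays at 55 until that block arrives.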
> <0005> tbf.cpp:1624 - BSN 60 storing in window (55..118)
> <0005> rlc.cpp:173 - Raising V(R) to 61

as we received block 60, our receive array contains blocks from 55 to
60 now. in your code i can see that
"gprs_rlc_ul_window::raise_v_q(gprs_rlc_v_n *v_n)" is called. since
there is no debugging that shows something like "- Taking block 55 out,
raising V(Q) to 56", i assume that the if-condition
"if (!v_n->is_received(v_q()))" is positive. this means that
v_n.state(55) is not 'R'...

> <0005> tbf.cpp:1690 - Scheduling Ack/Nack, because 20 frames received.
> <0005> encoding.cpp:374 Encoding Ack/Nack for TBF(TFI=0 TLLI=0xfa45049d DIR=UL) (final=0) SSN=61
> <0005> encoding.cpp:408 - V(N): "RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR" R=Received N=Not-Received

so i would expect that there is no 'R' at position last but 5 in this
debug string, but there is one.

is there a way to reproduce this? you said "So as soon as we don't
receive a frame from the UL. The send window at the phone and the PCU
is going out of sync.". does this mean that it can be reproduced by
dropping one frame on the uplink? (e.g. dropping frame with BSN 55
once.)

best regards,

andreas

From hfreyther at sysmocom.de Fri Dec 6 13:20:00 2013
From: hfreyther at sysmocom.de (Holger Hans Peter Freyther)
Date: Fri, 6 Dec 2013 14:20:00 +0100
Subject: Current list of failures in the pcu
In-Reply-To: <52A1C47D.4010202@eversberg.eu>
References: <20131206092602.GL31478@xiaoyu.lan> <52A1C47D.4010202@eversberg.eu>
Message-ID: <20131206132000.GC24984@xiaoyu.lan>

On Fri, Dec 06, 2013 at 01:35:09PM +0100, Andreas Eversberg wrote:

> is it already fixed? how can i test it? running "make check" does not
> cause any failure. (i use zecke/features/clean-up branch)

This is outdated. I will update sysmocom/master and sysmocom/clean-up
today.

> is there a way to reproduce this? you said "So as soon as we don't
> receive a frame from the UL.
> The send window at the phone and the PCU is
> going out of sync.". does this mean that it can be reproduced by
> dropping one frame on the uplink? (e.g. dropping frame with BSN 55 once.)

Exactly. Dropping UL frames should be enough to re-produce it. But have
a look at tests/types/TypesTest.cpp. The nice part of separating the
spaghetti into classes/modules is that they can be unit tested. So we
should be able to easily re-produce the issue with a testcase now.

holger

From hfreyther at sysmocom.de Wed Dec 18 12:48:52 2013
From: hfreyther at sysmocom.de (Holger Hans Peter Freyther)
Date: Wed, 18 Dec 2013 13:48:52 +0100
Subject: Status of sysmocom/master W50
Message-ID: <20131218124852.GA13933@xiaoyu.lan>

Good Afternoon,

last week at the sysmocom office Daniel and I continued to work on the
window handling. We now have a proper abstraction that allows us to
unit test the DL and UL window handling. The encode/decode of the RBB
is now compatible to itself.

While doing our manual tests (sending a big enough ICMP echo request)
we noticed that in the alloc algorithm b case we get a lot of NACKs.
In the same setup (only the alloc algorithm is changed) the A algorithm
has no NACKs.

I am now going to re-write the allocation handling. In terms of operation
on the first DL assignment (through the PCH) we can only use a singleslot
but when re-using the TBF we might go for multislot. In the case of UL
handling we could always try to use multislot assignments. Is this
understanding correct?
holger

From andreas at eversberg.eu Wed Dec 18 14:52:28 2013
From: andreas at eversberg.eu (Andreas Eversberg)
Date: Wed, 18 Dec 2013 15:52:28 +0100
Subject: Status of sysmocom/master W50
In-Reply-To: <20131218124852.GA13933@xiaoyu.lan>
References: <20131218124852.GA13933@xiaoyu.lan>
Message-ID: <52B1B6AC.6080903@eversberg.eu>

Holger Hans Peter Freyther wrote:
> I am now going to re-write the allocation handling. In terms of operation
> on the first DL assignment (through the PCH) we can only use a singleslot
> but when re-using the TBF we might go for multislot. In the case of UL
> handling we could always try to use multislot assignments. Is this
> understanding correct?

hi holger,

when i designed the multislot allocation algorithm, i considered the
following:

* for one-phase-access (UL), we don't know the multislot class, so i
  assume class 12, because it allows the maximum RX slots for
  semi-duplex operation. for two-phase-access we know the multislot
  class from the PACKET RESOURCE REQUEST. if the BTS provides 4 or more
  PDCH slots starting with TS1, the algorithm would select TS3 (iirc),
  because a later (concurrent) DL TBF with class 12 (or other classes)
  would allow to allocate TS1..TS4, so the phone can still transmit on
  TS3. the phone would schedule the following 8 bursts for one frame:
  "-tttt-r--" (t=TX, r=RX) if i would provide an UL TBF with TS1, the
  phone would be able to do the following scheduling only: "-tt-r---",
  so the algorithm can only assign TS1 and TS2 for RX. this means that
  the multislot algorithm must consider the RX slots when choosing a
  good TX slot, even with no ongoing DL TBF.

* similar to an allocated UL slot, the algorithm must choose the same
  slot for "control_ts", that is used for polling. even for DL TBF
  only, the control_ts slot must be allocated as if it would be an UL
  TS, because the phone must be able to answer, if it is polled. the
  control_ts and an assigned UL slot may be the same TS.

* in case of first DL TBF assignment (phone is in packet idle mode), i
  use multislot algorithm too, because i know the multislot class from
  the LLC-DATA message. it is no problem to assign a single slot, but
  why do you want to assign a single slot here?

best regards,

andreas

From holger at freyther.de Wed Dec 25 14:35:04 2013
From: holger at freyther.de (Holger Hans Peter Freyther)
Date: Wed, 25 Dec 2013 15:35:04 +0100
Subject: Coverity issues in gsm_rlcmac.cpp
In-Reply-To:
References: <20131111192408.GD30839@xiaoyu.lan>
Message-ID: <20131225143504.GA18757@xiaoyu.lan>

On Tue, Nov 12, 2013 at 05:44:43PM +0400, Ivan Kluchnikov wrote:

Dear Ivan,

> I will prepare patch for this issues soon.

it would be nice if we could finally close this issue. Do you have
a patch ready for this?

holger

From holger at freyther.de Wed Dec 25 17:18:09 2013
From: holger at freyther.de (Holger Hans Peter Freyther)
Date: Wed, 25 Dec 2013 18:18:09 +0100
Subject: Status of sysmocom/master W50
In-Reply-To: <52B1B6AC.6080903@eversberg.eu>
References: <20131218124852.GA13933@xiaoyu.lan> <52B1B6AC.6080903@eversberg.eu>
Message-ID: <20131225171809.GA24033@xiaoyu.lan>

On Wed, Dec 18, 2013 at 03:52:28PM +0100, Andreas Eversberg wrote:

> classes) would allow to allocate TS1..TS4, so the phone can still
> transmit on TS3. the phone would schedule the following 8 bursts for
> one frame: "-tttt-r--" (t=TX, r=RX) if i would provide an UL TBF
> with TS1, the phone would be able to do the following scheduling
> only: "-tt-r---", so the algorithm can only assign TS1 and TS2 for
> RX.
> this means that the multislot algorithm must consider the RX
> slots when choosing a good TX slot, even with no ongoing DL TBF.

Could you add TS boundaries to your ascii art? I still didn't read/
understand the constraints with TrX.

> * in case of first DL TBF assignment (phone is in packet idle mode),
> i use multislot algorithm too, because i know the multislot class
> from the LLC-DATA message. it is no problem to assign a single slot, but
> why do you want to assign a single slot here?

It is my understanding of the current code. In case we start with a DL
TBF, we will need to create an immediate assignment and send it through
the PCH. On the immediate assignment we can only announce one TS/PDCH.
(We could follow up with another control command to inform the phone
about the other DL assignments it has.)

PS: My scheduling "fair-ness" patches create another issue. The code
should make sure that a DL_ASSIGNMENT is scheduled _before_ the final
uplink ack/nack. Even if it requires a bit more memory I think it would
make sense if we have a small CTRL command queue inside the TBF.

From hfreyther at sysmocom.de Wed Dec 25 20:16:12 2013
From: hfreyther at sysmocom.de (Holger Hans Peter Freyther)
Date: Wed, 25 Dec 2013 21:16:12 +0100
Subject: Status of sysmocom/master W50
In-Reply-To: <20131225171809.GA24033@xiaoyu.lan>
References: <20131218124852.GA13933@xiaoyu.lan> <52B1B6AC.6080903@eversberg.eu> <20131225171809.GA24033@xiaoyu.lan>
Message-ID: <20131225201612.GA6756@xiaoyu.lan>

On Wed, Dec 25, 2013 at 06:18:09PM +0100, Holger Hans Peter Freyther wrote:
> On Wed, Dec 18, 2013 at 03:52:28PM +0100, Andreas Eversberg wrote:
>
> > classes) would allow to allocate TS1..TS4, so the phone can still
> > transmit on TS3. the phone would schedule the following 8 bursts for
> > one frame: "-tttt-r--" (t=TX, r=RX) if i would provide an UL TBF
> > with TS1, the phone would be able to do the following scheduling
> > only: "-tt-r---", so the algorithm can only assign TS1 and TS2 for
> > RX.
> > this means that the multislot algorithm must consider the RX
> > slots when choosing a good TX slot, even with no ongoing DL TBF.
>
> Could you add TS boundaries to your ascii art? I still didn't read/
> understand the constraints with TrX.

Okay. You are already interleaving DL/UL and the 3 timeslot offset
between DL/UL. It wasn't clear from your graph.

From hfreyther at sysmocom.de Thu Dec 26 10:03:13 2013
From: hfreyther at sysmocom.de (Holger Hans Peter Freyther)
Date: Thu, 26 Dec 2013 11:03:13 +0100
Subject: Multislot allocation failures and defects
Message-ID: <20131226100313.GA19501@xiaoyu.lan>

Dear Andreas,

I went through the multislot allocation algorithm and cleaned and
structured the code in testable parts that take over parts of the
allocation and I have added a unit test for some of the allocation
cases (UL first, then DL, e.g. due to a RACH burst, and then DL single,
UL and DL update).

This morning I added a testcase that tests all PDCH combinations and
all MS Classes and verifies that an allocation takes place and that the
DL and UL first_common_ts do match. For (xxOxOOxO) and MS_Class=5 this
assert is wrong. The alloc/AllocTest of the sysmocom/allocation-cleanups
branch should show you this condition.

While reading the code I noticed the following things. Could you please
explain why these are problems or no problems.

* select_first_ts (it exists after the refactoring). It initializes
  i and compares it but it will never increment it. This means that the
  code can look at PDCHs _outside_ of the tx_range.

* first_common_ts handling.
  When assigning the DL TBF we pick a first_common_ts but when the
  actual UL assignment happens there might not be a free USF on the
  uplink and at that point the phone might not listen on the TS we
  think.

* Sum for Rx+Tx is not used. I see that in update_rx_win_max you modify
  the window to make some room.

* select_ul_slots. "i" is not incremented in all cases which could
  potentially lead to using slots outside of the tx_range. For the
  MS Type == 1 handling you could introduce a different variable that
  counts how many slots were used?

it would be nice if we could fix that during the 30C3.

holger

From Ivan.Kluchnikov at fairwaves.ru Mon Dec 30 10:47:53 2013
From: Ivan.Kluchnikov at fairwaves.ru (Ivan Kluchnikov)
Date: Mon, 30 Dec 2013 14:47:53 +0400
Subject: Coverity issues in gsm_rlcmac.cpp
In-Reply-To: <20131225143504.GA18757@xiaoyu.lan>
References: <20131111192408.GD30839@xiaoyu.lan> <20131225143504.GA18757@xiaoyu.lan>
Message-ID:

Hi Holger,

I merged my patch to the master branch. Please check it.

2013/12/25 Holger Hans Peter Freyther :
> On Tue, Nov 12, 2013 at 05:44:43PM +0400, Ivan Kluchnikov wrote:
>
> Dear Ivan,
>
>> I will prepare patch for this issues soon.
>
> it would be nice if we could finally close this issue. Do you have
> a patch ready for this?
>
> holger

--
Regards,
Ivan Kluchnikov.
http://fairwaves.ru
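
On the select_first_ts and select_ul_slots points raised in "Multislot
allocation failures and defects": a minimal sketch of a slot scan whose
counter advances on every iteration, so it cannot look at timeslots
outside the allowed range. This is not the osmo-pcu function. The name
select_first_ts_sketch, its parameters and the bit-mask representation
of the PDCHs are assumptions made for illustration only.

```cpp
#include <cassert>

// Hypothetical helper, not the osmo-pcu select_first_ts.
// pdch_mask: bit n set means timeslot n carries an enabled PDCH.
// Returns the first enabled TS in [first_ts, first_ts + tx_range),
// or -1 if there is none.
int select_first_ts_sketch(unsigned pdch_mask, int first_ts, int tx_range)
{
    assert(first_ts >= 0 && first_ts < 8 && tx_range >= 0);
    // Both counters advance in the loop header, so no code path can
    // skip the increment and scan past the range.
    for (int i = 0, ts = first_ts; i < tx_range && ts < 8; ++i, ++ts) {
        if (pdch_mask & (1u << ts))
            return ts;
    }
    return -1;
}
```

If a second quantity has to be tracked, such as the number of slots
already used, a separate counter keeps the range check independent of
it, which is the direction suggested in the select_ul_slots remark.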