Good Morning,
we at sysmocom have continued to add structure/architecture to the PCU code and have started to test the code systematically (using something as simple as ICMP ping/reply).
Observations:
* "alloc algorithm b" gives only slightly (3 kbit/s) more bandwidth than the single-slot assignment with CS4. There are also some open Coverity issues in this multi-slot code.
* Congestion is handled badly. When using a ping that creates IP fragments, the LLC queue of the TBF gets so long that half the IP fragments expire before we send them. This means the MS on the downlink never receives all the IP fragments of a ping.
We played with the BSVC-FLOW-CONTROL parameters but they don't have any influence here. We have added counting of queue size and calculation of the average queue delay to the LLC class.
One option is to drop packets based on the PDU lifetime and the current queue delay.
$ sudo ping -s 1800 -i 0.1 10.23.42.6 (CS4, alloc-algorithm a) is enough to reproduce this issue.
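To make the "drop based on PDU lifetime and current queue delay" idea more concrete, here is a rough C++ sketch. It is not the code in the PCU: the names LlcFrameSketch and LlcQueueSketch, the millisecond timestamps passed in by the caller, and the 1/8-weighted moving average are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical sketch; none of these names are taken from osmo-pcu.
struct LlcFrameSketch {
    uint32_t enqueued_ms;  // time the frame entered the queue
    uint32_t lifetime_ms;  // PDU lifetime allowed for this frame
};

class LlcQueueSketch {
public:
    // Refuse (drop) a new frame when the current average queue delay is
    // already longer than the frame's lifetime: it would most likely
    // expire before it could be sent.
    bool enqueue(uint32_t now_ms, uint32_t lifetime_ms) {
        if (m_avg_delay_ms > lifetime_ms) {
            ++m_dropped;
            return false;
        }
        m_frames.push_back(LlcFrameSketch{now_ms, lifetime_ms});
        return true;
    }

    // Pop frames until one is found that has not expired yet.
    // Returns true if such a frame was found, false if the queue ran empty.
    bool dequeue(uint32_t now_ms) {
        while (!m_frames.empty()) {
            const LlcFrameSketch frame = m_frames.front();
            m_frames.pop_front();
            assert(now_ms >= frame.enqueued_ms);
            const uint32_t delay_ms = now_ms - frame.enqueued_ms;
            // Assumed averaging: moving average with weight 1/8.
            m_avg_delay_ms = (7 * m_avg_delay_ms + delay_ms) / 8;
            if (delay_ms <= frame.lifetime_ms)
                return true;
            ++m_dropped;  // frame expired while waiting in the queue
        }
        return false;
    }

    std::size_t size() const { return m_frames.size(); }
    uint32_t avg_delay_ms() const { return m_avg_delay_ms; }
    unsigned dropped() const { return m_dropped; }

private:
    std::deque<LlcFrameSketch> m_frames;
    uint32_t m_avg_delay_ms = 0;
    unsigned m_dropped = 0;
};
```

The point of dropping at enqueue time is that a frame which cannot make it in time never occupies the queue and so does not delay the frames behind it. Whether the threshold should be the plain average or something more conservative is an open question.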
* RLC V(Q) V(R) handling. It appears that both before and after the refactoring the Received Block Bitmap (RBB) is encoded wrongly. The encoder and the decoder are not compatible with each other (thanks to the architecture sysmocom is adding to the code, this can now be unit tested!).
So as soon as we don't receive a frame from the UL, the send window at the phone and the PCU go out of sync.
After making changes to the window we have also witnessed that on a fully received window V(Q) and V(R) were WindowSize away from each other.
<0005> tbf.cpp:1579 UL DATA TFI=0 received (V(Q)=55 .. V(R)=60)
<0005> tbf.cpp:1624 - BSN 60 storing in window (55..118)
<0005> rlc.cpp:173 - Raising V(R) to 61
<0005> tbf.cpp:1690 - Scheduling Ack/Nack, because 20 frames received.
<0005> encoding.cpp:374 Encoding Ack/Nack for TBF(TFI=0 TLLI=0xfa45049d DIR=UL) (final=0) SSN=61
<0005> encoding.cpp:408 - V(N): "RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR" R=Received N=Not-Received
This part might be something we broke, but we will try to reproduce it with a unit test.
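For reference, the V(Q)/V(R) behaviour one would expect from the receive window can be sketched as below. This is a simplified stand-alone model, not the gprs_rlc_ul_window class: the name UlWindowSketch and its methods are made up, and SNS=128 / WS=64 are the plain GPRS values (WS=64 matches the "window (55..118)" line in the log).

```cpp
#include <bitset>
#include <cassert>

// Hypothetical model of an RLC uplink receive window (GPRS: SNS=128, WS=64).
class UlWindowSketch {
public:
    static constexpr unsigned SNS = 128;  // sequence number space
    static constexpr unsigned WS = 64;    // window size

    explicit UlWindowSketch(unsigned start = 0)
        : m_v_q(start % SNS), m_v_r(start % SNS) {}

    unsigned v_q() const { return m_v_q; }
    unsigned v_r() const { return m_v_r; }
    bool is_received(unsigned bsn) const { return m_received[bsn % SNS]; }

    // True if bsn lies in [V(Q), V(Q)+WS-1], modulo SNS.
    bool in_window(unsigned bsn) const {
        return ((bsn - m_v_q) % SNS) < WS;
    }

    // Store a block. Returns false if bsn is outside the window.
    bool receive(unsigned bsn) {
        assert(bsn < SNS);
        if (!in_window(bsn))
            return false;
        m_received[bsn] = true;

        // Raise V(R) to one past the highest BSN seen so far.
        const unsigned off_bsn = (bsn - m_v_q) % SNS;
        const unsigned off_v_r = (m_v_r - m_v_q) % SNS;
        if (off_bsn >= off_v_r)
            m_v_r = (bsn + 1) % SNS;

        // Raise V(Q) over every contiguously received block.
        while (m_v_q != m_v_r && m_received[m_v_q]) {
            m_v_q = (m_v_q + 1) % SNS;
            // The BSN that just entered the top of the window is a reused
            // sequence number, so forget its old state.
            m_received[(m_v_q + WS - 1) % SNS] = false;
        }
        return true;
    }

private:
    unsigned m_v_q;  // lowest BSN not yet received
    unsigned m_v_r;  // one past the highest BSN received
    std::bitset<SNS> m_received;
};
```

In this model, receiving BSN 55 to 60 in order leaves V(Q) equal to V(R) at 61. If BSN 55 is lost, V(Q) stays at 55 while V(R) moves to 61, and V(Q) only catches up once 55 is retransmitted. A unit test along these lines against the real class should show whether the WindowSize gap comes from the window code or from somewhere else.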
We will continue to work on unit-testing the RLC window handling. Is anyone outside sysmocom interested in getting a more stable PCU?
holger
> * RLC V(Q) V(R) handling. It appears that both before and after the refactoring the Received Block Bitmap (RBB) is encoded wrongly. The encoder and the decoder are not compatible with each other (thanks to the architecture sysmocom is adding to the code, this can now be unit tested!).
dear holger,
is it already fixed? how can i test it? running "make check" does not cause any failure. (i use zecke/features/clean-up branch)
> So as soon as we don't receive a frame from the UL, the send window at the phone and the PCU go out of sync.
> After making changes to the window we have also witnessed that on a fully received window V(Q) and V(R) were WindowSize away from each other.
> <0005> tbf.cpp:1579 UL DATA TFI=0 received (V(Q)=55 .. V(R)=60)
this means that our receive array contains blocks from 55 to 59...
> <0005> tbf.cpp:1624 - BSN 60 storing in window (55..118)
> <0005> rlc.cpp:173 - Raising V(R) to 61
as we received block 60, our receive array contains blocks from 55 to 60 now. in your code i can see that "gprs_rlc_ul_window::raise_v_q(gprs_rlc_v_n *v_n)" is called. since there is no debugging that shows something like "- Taking block 55 out, raising V(Q) to 56", i assume that the if-condition "if (!v_n->is_received(v_q()))" is true. this means that v_n.state(55) is not 'R'...
> <0005> tbf.cpp:1690 - Scheduling Ack/Nack, because 20 frames received.
> <0005> encoding.cpp:374 Encoding Ack/Nack for TBF(TFI=0 TLLI=0xfa45049d DIR=UL) (final=0) SSN=61
> <0005> encoding.cpp:408 - V(N): "RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR" R=Received N=Not-Received
so i would expect that there is no 'R' at position last but 5 in this debug string, but there is one.
is there a way to reproduce this? you said "So as soon as we don't receive a frame from the UL. The send window at the phone and the PCU is going out of sync.". does this mean that it can be reproduced by dropping one frame on the uplink? (e.g. dropping frame with BSN 55 once.)
best regards,
andreas
On Fri, Dec 06, 2013 at 01:35:09PM +0100, Andreas Eversberg wrote:
> is it already fixed? how can i test it? running "make check" does not cause any failure. (i use zecke/features/clean-up branch)
This is outdated. I will update sysmocom/master and sysmocom/clean-up today.
> is there a way to reproduce this? you said "So as soon as we don't receive a frame from the UL. The send window at the phone and the PCU is going out of sync.". does this mean that it can be reproduced by dropping one frame on the uplink? (e.g. dropping frame with BSN 55 once.)
Exactly. Dropping UL frames should be enough to reproduce it. But have a look at tests/types/TypesTest.cpp. The nice part of separating the spaghetti into classes/modules is that they can be unit tested, so we should now be able to reproduce the issue easily with a test case.
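As an illustration of the kind of test case meant here, the sketch below checks that an RBB encoder and decoder agree with each other. It is stand-alone and does not use the classes in TypesTest.cpp: encode_rbb and decode_rbb are invented names, and the indexing BSN = (SSN - bit_number) mod 128 for bit_number 1..64 is my reading of the Ack/Nack description in TS 44.060, so please check it against the spec. The bit order inside the actual message is not modelled.

```cpp
#include <bitset>
#include <cassert>

constexpr unsigned RBB_SNS = 128;  // sequence number space (GPRS)
constexpr unsigned RBB_WS = 64;    // bitmap length / window size (GPRS)

// Hypothetical encoder: rbb[bit_number - 1] holds the receive state of
// BSN (ssn - bit_number) mod SNS, for bit_number = 1..64.
std::bitset<RBB_WS> encode_rbb(const std::bitset<RBB_SNS> &received,
                               unsigned ssn) {
    std::bitset<RBB_WS> rbb;
    for (unsigned bit_number = 1; bit_number <= RBB_WS; ++bit_number) {
        const unsigned bsn = (ssn + RBB_SNS - bit_number) % RBB_SNS;
        rbb[bit_number - 1] = received[bsn];
    }
    return rbb;
}

// Hypothetical decoder: the inverse mapping, returning the set of BSNs
// that the bitmap positively acknowledges.
std::bitset<RBB_SNS> decode_rbb(const std::bitset<RBB_WS> &rbb,
                                unsigned ssn) {
    std::bitset<RBB_SNS> acked;
    for (unsigned bit_number = 1; bit_number <= RBB_WS; ++bit_number) {
        const unsigned bsn = (ssn + RBB_SNS - bit_number) % RBB_SNS;
        acked[bsn] = rbb[bit_number - 1];
    }
    return acked;
}
```

A test would then mark BSN 56 to 60 as received (55 dropped), encode with SSN=61, and require that decoding gives back exactly the same set, with the bit for BSN 55 cleared. The round trip only proves that the two sides agree with each other, not that either matches the spec, so a few hand-checked bitmaps are still needed.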
holger
osmocom-net-gprs@lists.osmocom.org