Change in osmo-bsc[master]: hodec2: fix congestion oscillation bug

neels gerrit-no-reply at lists.osmocom.org
Tue Jan 5 22:28:52 UTC 2021


neels has uploaded this change for review. ( https://gerrit.osmocom.org/c/osmo-bsc/+/21989 )


Change subject: hodec2: fix congestion oscillation bug
......................................................................

hodec2: fix congestion oscillation bug

When evenly distributing congestion across cells, count the number of
occupied lchans surpassing congestion, and not the overall number of
free lchans -- which disregards congestion thresholds.

Fix the bug shown by test_congestion_no_oscillation.ho_vty added in
Idf88b4cf3d2f92f5560d73dae9e59af39d0494c0.

An example to illustrate what this is about:

Cell A has min-free-slots 2, and has 1 slot remaining free.
Cell B has min-free-slots 4, and has 2 slots remaining free.

If we decide where to place another lchan by counting congested lchans,
as implemented in this patch:
- With another lchan added, cell A ends up with a congestion count of 2
  (min-free-slots 2, 0 slots free): two more lchans in use than
  "allowed".
- Cell B ends up with a congestion count of 3 (min-free-slots 4, 1 slot
  free), which is worse than 2.

We decide that cell A should receive the additional lchan, because it
will then have the lower congestion count.  However, that makes cell A
completely occupied, while cell B has two lchans remaining free.
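
The arithmetic of this example can be sketched in C roughly as follows.
This is only an illustration: struct cell, overbooked() and
pick_by_congestion_count() are hypothetical names and not osmo-bsc code;
only the "min free minus free" subtraction mirrors the expression used
in the patch.

```c
/* Hypothetical illustration of the example above, not osmo-bsc code. */
struct cell {
	int min_free_slots; /* configured congestion threshold */
	int free_slots;     /* lchans currently free */
};

/* lchans in use beyond the congestion threshold; <= 0 means not congested */
int overbooked(const struct cell *c)
{
	return c->min_free_slots - c->free_slots;
}

/* Return 0 to place the next lchan in cell a, 1 for cell b: pick the cell
 * that has the lower congestion count after receiving the lchan. */
int pick_by_congestion_count(const struct cell *a, const struct cell *b)
{
	return (overbooked(a) + 1 <= overbooked(b) + 1) ? 0 : 1;
}
```

With cell A as { 2, 1 } and cell B as { 4, 2 }, the counts after adding
one lchan are 2 and 3, so this sketch picks cell A.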

There are two alternative fix variants under consideration:
- count the number of free lchans, but only after reaching congestion.
- calculate the percentage of load surpassing congestion.

When using percentage of remaining lchans, we would see that if cell A
receives another lchan, it would be 100% loaded above its congestion
threshold (2 of 2 remaining lchans in use), but cell B would only be 75%
loaded above its threshold (3 of 4 remaining lchans in use).  So a
percentage comparison would place the next lchan in cell B, leaving the
last lchan of cell A free.
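
A minimal sketch of the percentage variant, assuming the load is
measured as the share of the min-free-slots reserve that is in use.
reserve_load_percent() is a hypothetical helper for illustration and is
not part of this patch or of osmo-bsc.

```c
/* Hypothetical illustration of the percentage variant, not osmo-bsc code.
 * Returns the percentage of the min-free-slots reserve that is in use,
 * given the number of slots that are still free. */
int reserve_load_percent(int min_free_slots, int free_slots)
{
	int used_of_reserve = min_free_slots - free_slots;

	/* no threshold configured, or threshold not surpassed */
	if (min_free_slots <= 0 || used_of_reserve <= 0)
		return 0;

	return 100 * used_of_reserve / min_free_slots;
}
```

After receiving another lchan, cell A would be at
reserve_load_percent(2, 0) and cell B at reserve_load_percent(4, 1),
i.e. the 100% and 75% named above.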

Another option would be to count the number of remaining free lchans
(after the congestion threshold is surpassed), instead of the used ones
above the congestion threshold. But then, as soon as all cells are
congested, configuring different thresholds would no longer have an
effect. I would no longer be able to configure a particular cell to
remain more free than others: once congested, only that cell would fill
up until it reaches the same load as the other cells.
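
The drawback of this variant can be sketched as follows, under the
assumption that a cell counts as congested when it has fewer free slots
than min-free-slots. struct cell_load, is_congested() and
pick_by_free_slots() are hypothetical names, not osmo-bsc code. Once
both cells are congested, the thresholds no longer enter the comparison.

```c
/* Hypothetical illustration of the "remaining free lchans" variant,
 * not osmo-bsc code. */
struct cell_load {
	int min_free_slots; /* configured congestion threshold */
	int free_slots;     /* lchans currently free */
};

/* assumption: congested means fewer free slots than min-free-slots */
int is_congested(const struct cell_load *c)
{
	return c->free_slots < c->min_free_slots;
}

/* Return 0 to place the next lchan in cell a, 1 for cell b. Only the
 * free slots remaining after placement are compared; min_free_slots
 * is not looked at. */
int pick_by_free_slots(const struct cell_load *a, const struct cell_load *b)
{
	return (a->free_slots - 1 >= b->free_slots - 1) ? 0 : 1;
}
```

For cell A as { 2, 1 } and cell B as { 4, 2 }, both are congested and
this sketch picks cell B; raising cell B's threshold to 8 does not
change the result.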

In the field, where all cells likely have the same min-free-slots
settings, this entire consideration is moot, because congestion counts
correspond 1:1 to percentage between all cells and also 1:1 to remaining
free slots. However, when looking at distribution across TCH/F and
TCH/H, it is quite likely that min-free-slots settings differ for TCH/F
and TCH/H, so this is in fact a thing to consider even for identically
configured cells.

Related: SYS#5259
Change-Id: Icb373dc6bfc9819446db5e96f71921781fe2026d
---
M src/osmo-bsc/handover_decision_2.c
M tests/handover/test_congestion_no_oscillation.ho_vty
M tests/handover/test_congestion_no_oscillation2.ho_vty
3 files changed, 11 insertions(+), 30 deletions(-)



  git pull ssh://gerrit.osmocom.org:29418/osmo-bsc refs/changes/89/21989/1

diff --git a/src/osmo-bsc/handover_decision_2.c b/src/osmo-bsc/handover_decision_2.c
index b296bd1..64ee298 100644
--- a/src/osmo-bsc/handover_decision_2.c
+++ b/src/osmo-bsc/handover_decision_2.c
@@ -443,6 +443,7 @@
 {
 	uint8_t requirement = 0;
 	unsigned int penalty_time;
+	int current_overbooked;
 	c->requirements = 0;
 
 	/* Requirement A */
@@ -625,14 +626,17 @@
 
 	/* Requirement C */
 
-	/* the nr of free timeslots of the target cell must be >= the
-	 * free slots of the current cell _after_ handover/assignment */
+	/* the nr of lchans surpassing congestion on the target cell must be <= the lchans surpassing congestion on the
+	 * current cell _after_ handover/assignment */
+	current_overbooked = c->current.min_free_tch - c->current.free_tch;
 	if (requirement & REQUIREMENT_A_TCHF) {
-		if (c->target.free_tchf - 1 >= c->current.free_tch + 1)
+		int target_overbooked = c->target.min_free_tchf - c->target.free_tchf;
+		if (target_overbooked + 1 <= current_overbooked - 1)
 			requirement |= REQUIREMENT_C_TCHF;
 	}
 	if (requirement & REQUIREMENT_A_TCHH) {
-		if (c->target.free_tchh - 1 >= c->current.free_tch + 1)
+		int target_overbooked = c->target.min_free_tchh - c->target.free_tchh;
+		if (target_overbooked + 1 <= current_overbooked - 1)
 			requirement |= REQUIREMENT_C_TCHH;
 	}
 
diff --git a/tests/handover/test_congestion_no_oscillation.ho_vty b/tests/handover/test_congestion_no_oscillation.ho_vty
index abfaef7..a830cbe 100644
--- a/tests/handover/test_congestion_no_oscillation.ho_vty
+++ b/tests/handover/test_congestion_no_oscillation.ho_vty
@@ -1,6 +1,5 @@
 # Do not oscillate handover from TCH/F to TCH/H on a neighbor due to congestion,
 # and then back to the original cell due to RXLEV.
-# Currently this test script shows the undesired oscillation.
 
 create-bts trx-count 1 timeslots c+s4 TCH/F TCH/F TCH/F TCH/F  TCH/F  TCH/F PDCH
 network
@@ -25,24 +24,5 @@
 # measurements continue to be the same
 meas-rep lchan 1 0 5 0 rxlev 20 rxqual 0 ta 0 neighbors 40
 
-# FAIL: RXLEV oscillation back to bts 0
-expect-ho from lchan 1 0 5 0 to lchan 0 0 2 0
-expect-ts-use trx 0 0 states        *    TCH/F TCH/F -     -      -      -     *
-expect-ts-use trx 1 0 states        *    TCH/F TCH/F TCH/F TCH/F  -      -     *
-meas-rep lchan 0 0 2 0 rxlev 40 rxqual 0 ta 0 neighbors 20
+# despite the better RXLEV, congestion prevents oscillation back to bts 0
 expect-no-chan
-
-# FAIL: congestion oscillation again to bts 1
-congestion-check
-expect-ho from lchan 0 0 2 0 to lchan 1 0 5 0
-expect-ts-use trx 0 0 states        *    TCH/F -     -     -      -      -     *
-expect-ts-use trx 1 0 states        *    TCH/F TCH/F TCH/F TCH/F  TCH/H- -     *
-
-# FAIL: RXLEV oscillation back to bts 0
-meas-rep lchan 1 0 5 0 rxlev 20 rxqual 0 ta 0 neighbors 40
-expect-ho from lchan 1 0 5 0 to lchan 0 0 2 0
-meas-rep lchan 0 0 2 0 rxlev 40 rxqual 0 ta 0 neighbors 20
-
-# FAIL: congestion oscillation again to bts 1
-congestion-check
-expect-ho from lchan 0 0 2 0 to lchan 1 0 5 0
diff --git a/tests/handover/test_congestion_no_oscillation2.ho_vty b/tests/handover/test_congestion_no_oscillation2.ho_vty
index aee731d..44c4176 100644
--- a/tests/handover/test_congestion_no_oscillation2.ho_vty
+++ b/tests/handover/test_congestion_no_oscillation2.ho_vty
@@ -1,8 +1,5 @@
-# Almost identical to test_amr_oscillation.ho_vty, this has just two more TCH/H slots in BTS 1, and does not trigger the
-# oscillation bug. The number of free TCH/H in BTS 1 should be unrelated to the congestion status of BTS 0, which
-# illustrates that the even distribution of congestion is fundamentally flawed.
-# This test script shows the desired behavior, though by common sense there should be no reason why we see the bug in
-# test_amr_oscillation.ho_vty and not here.
+# Almost identical to test_amr_oscillation.ho_vty, this has just two more TCH/H slots in BTS 1, and did not trigger the
+# oscillation bug (which has since been fixed, so that both tests behave identically now).
 
 create-bts trx-count 1 timeslots c+s4 TCH/F TCH/F TCH/F TCH/F  TCH/F  TCH/F PDCH
 network

-- 
To view, visit https://gerrit.osmocom.org/c/osmo-bsc/+/21989
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings

Gerrit-Project: osmo-bsc
Gerrit-Branch: master
Gerrit-Change-Id: Icb373dc6bfc9819446db5e96f71921781fe2026d
Gerrit-Change-Number: 21989
Gerrit-PatchSet: 1
Gerrit-Owner: neels <nhofmeyr at sysmocom.de>
Gerrit-CC: Jenkins Builder
Gerrit-MessageType: newchange