On 26. Jun 2018, at 15:27, Harald Welte <laforge(a)gnumonks.org> wrote:
Hey!
In that specific case only 9/20 MS completed an Update Location within
60s. Maybe I am missing a state transition in the Lua code and don't
report the success. I will need to understand this better.
That's of course odd. But it should be rather easy to investigate by
looking at an Abis or A-interface trace? I think we recently had an
issue that we'd only use SDCCH for LU and never fall back to "bigger"
channels like TCH. Maybe that's limiting your LU rate, particularly if
you only have an SDCCH/4?
Sure. But that is what we want to test as well? Immediate Assignment
Rejects and back-off handling. What happens is funnier, though. While
looking at the RSL trace I noticed that in "CHANNEL REQUIRED" the T1,
T2, T3 of consecutive requests were the same.
All MS started in the _same_ second will pick the same random numbers.
These are used for the RACH request and also the back-off handling. The
obvious change is to go from random() to osmo_get_rand_id() and indeed
things are a lot better. This makes a big difference but has uncovered
another issue. We have a big spread between when an LU starts and when
it finishes.
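
To illustrate the effect, here is a minimal Python sketch (not the
actual MS code). It assumes the generator ends up seeded from the
wall-clock second, which would explain the identical values in the
trace; the function names, the seed value and the slot count are made up
for the example.

```python
import random
import secrets

def rach_delays(seed, count=5, max_slots=50):
    """Draw `count` back-off slots from a PRNG seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.randrange(max_slots) for _ in range(count)]

# Hypothetical: two MS started within the same second share one seed.
start_second = 1530019620
ms_a = rach_delays(start_second)
ms_b = rach_delays(start_second)
# ms_a and ms_b are identical, so both MS pick the same RACH slots
# and the same back-off times.

def rach_delays_os_entropy(count=5, max_slots=50):
    """Same draw, but from the OS entropy source, so no shared seed."""
    return [secrets.randbelow(max_slots) for _ in range(count)]
```

With the second variant, the sequences of two MS are independent of
their start time, which is the property the switch away from random()
is after.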
One example is the Immediate Assignment Reject handling. IIRC Jacob
implemented grouping multiple rejects into a single reject message but
now we give the same "Wait Indication" in all four entries. We need to
do a bit better than that. Is there a "master" bug for things to improve
with many MS?
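
One possible way to do "a bit better", sketched in Python under my own
assumptions: give each of the four entries a different Wait Indication
so the MS do not all come back at once. The helper name and the
base/step values are hypothetical; the cap at 255 assumes the value is
carried in a single octet of seconds.

```python
def staggered_wait_indications(base_s, step_s, entries=4, max_s=255):
    """Return one wait value (seconds) per reject entry, spread by step_s."""
    return [min(base_s + i * step_s, max_s) for i in range(entries)]

# Example: four entries, starting at 10s, 5s apart.
waits = staggered_wait_indications(10, 5)
```

A random spread instead of a fixed step would work too; the point is
only that the four MS should not retry in the same window.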
The other thing I wondered about: maybe we keep a small ring buffer of
bits we have seen in channel required messages? If we see requests in
nearby GSM frames there is little point in answering _any_ of them. We
might just create radio interference between them...
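
A rough Python sketch of the ring buffer idea, to make it concrete. It
only stores the frame number of each request (an assumption, the real
thing might keep more of the request reference), and the class name,
buffer size and window are invented for the example.

```python
from collections import deque

GSM_HYPERFRAME = 26 * 51 * 2048  # frame numbers wrap at 2715648

class RecentRachLog:
    """Remember the frame numbers of the last few channel requests."""

    def __init__(self, size=8, window=4):
        self._frames = deque(maxlen=size)  # oldest entry drops out
        self._window = window

    @staticmethod
    def _distance(a, b):
        """Distance between two frame numbers, honouring wrap-around."""
        d = (a - b) % GSM_HYPERFRAME
        return min(d, GSM_HYPERFRAME - d)

    def observe(self, fn):
        """Record fn; return True if a remembered request is within window."""
        crowded = any(self._distance(fn, old) <= self._window
                      for old in self._frames)
        self._frames.append(fn)
        return crowded
```

A caller could skip or delay the assignment whenever observe() returns
True. Note that the earlier request of a crowded pair has already been
accepted by then, so answering _none_ of them would need a short
holding delay before responding.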
cheers
holger