On 04 May 2016, at 14:01, gitosis@osmocom.org wrote:
Hi
> msc subscr: add paging timeout
> In NITB, the paging timeout would be handled from the BSC side. In IuCS, we need to invalidate the paging request from libmsc alone, so add a paging timer to gsm_subscriber.
> Possibly, the HNB-GW should respond with a paging failure and libmsc could trigger on that, nevertheless libmsc should not rely on a failure message to expire pending pagings.
I don't think this belongs in gsm_subscriber. If you compare this with the architectural change that removed the "subscriber queue", we are about to repeat the same mistake. The timeout belongs to the logical "operation" and not to the subscriber (the object of the logical operation).
cheers holger
On Wed, May 04, 2016 at 02:37:26PM +0200, Holger Freyther wrote:
> I don't think this belongs in gsm_subscriber. If you compare this with the architectural change that removed the "subscriber queue", we are about to repeat the same mistake. The timeout belongs to the logical "operation" and not to the subscriber (the object of the logical operation).
I must admit that I haven't fully grokked the master branch aka NITB paging semantics. I've just taken a closer look. It's quite complex...
Here is how the paging_timeout ended up in gsm_subscriber:
First I wanted to add the paging timeout to a gsm_subscriber_connection, which obviously doesn't make sense. There is no conn yet when we want to page.
The next thought would be to add the timeout to struct subscr_request. But taking a look at subscr_request_conn(), I see that gsm_subscriber has an is_paging flag, and when a request comes along while we're already paging, it is just added to the list without launching another paging [1]. So it seems there is a paging semantic that is separate from the request, that's why is_paging is in gsm_subscriber I presume, and that's why I put the paging timeout next to that, but read on.
[1] see subscr_request_conn() on sysmocom/iu, or subscr_request_channel() on master
Now I understand that the libbsc paging is much more than just one paging request. There's the bts->paging.work_timer that keeps retrying to page as long as requests exist. It rotates every first request to the back and does another paging; I don't really understand the part where the "available slots" of 20 are reduced to zero, where the paging_give_credit() is queued with a separate timeout, which sets the slots back to 20 (20 credits??). Nevertheless, I understand now that is_paging is not about a single paging, but remembers whether we can skip kicking off the cascade of paging retries. Then there's the separate req->T3113 timer which separately removes each request from the list of requests waiting for a paging.
I could re-implement this kind of algorithm in libmsc:
* Upon first request, start paging
* Retry paging "indefinitely" while requests are queued ("paging worker")
* Upon second or later request, just add to the queue
* Time out each request separately
* Stop paging once the queue of requests is empty
The current status quo is somewhat simpler:
* Upon first request, send out a single paging request
* Upon second or later request, add to the queue
* If the single paging timed out, discard all requests
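
For comparison, the simpler status quo can be sketched like this, again in Python with hypothetical names (SubscriberPaging, send_paging, tick). The paging_response() path, which serves all queued requests on success, is an assumption and is not spelled out in the list above:

```python
class SubscriberPaging:
    """One paging and one timer per subscriber; requests share both."""

    def __init__(self, send_paging, timeout):
        self.send_paging = send_paging   # hypothetical hook that emits one paging
        self.timeout = timeout
        self.is_paging = False
        self.deadline = None
        self.requests = []               # callbacks taking a single bool (ok)

    def request(self, now, callback):
        self.requests.append(callback)
        if not self.is_paging:
            # First request: send a single paging and arm the one timer.
            self.is_paging = True
            self.deadline = now + self.timeout
            self.send_paging()
        # Second or later request: it was only added to the queue.

    def _finish(self, ok):
        pending, self.requests = self.requests, []
        self.is_paging = False
        self.deadline = None
        for cb in pending:
            cb(ok)

    def paging_response(self):
        # Assumption: a response serves every queued request.
        if self.is_paging:
            self._finish(True)

    def tick(self, now):
        # The single paging timed out: discard all requests.
        if self.is_paging and now >= self.deadline:
            self._finish(False)
```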
If I re-implement the more complex algorithm like in libbsc, I would need a paging worker in the gsm_subscriber struct, and one timeout per subscr_request.
OTOH, if the BSC level already has the paging-retries semantics built-in, then it makes sense to send out only a single paging request from the MSC level.
It's not as clear cut. osmo-bsc receives a paging request from the MSC, and does numerous paging retries. But it doesn't store the requests in detail, it even has the request callbacks set to NULL. All that happens during a paging response is that a connection to the MSC is initiated. So the knowledge for various detailed requests is in the MSC, while the paging retries still are in osmo-bsc.
So it's not trivial to figure out how the various levels should behave. The MSC should not have the same paging resend logic as the BSC, or if it does then with a much longer timeout.
It would be interesting to find out whether RNCs resend pagings like our osmo-bsc.
I would appreciate opinions on how standalone MSC level paging should behave in detail.
Also, could someone please explain the 20 slots and "give credit" semantics I mentioned above?
~Neels
On 06 May 2016, at 20:12, Neels Hofmeyr nhofmeyr@sysmocom.de wrote:
> On Wed, May 04, 2016 at 02:37:26PM +0200, Holger Freyther wrote:
Hi Neels,
not sure what to respond to the rest as it is long and goes into another area and I am very busy right now.
> Also, could someone please explain the 20 slots and "give credit" semantics I mentioned above?
the ip.access nanoBTS crashed with too much paging; it crashed before it could send out the watermark information that would have allowed us to throttle/stop paging. So the "slots" are a way to make proprietary equipment work.
On Sat, May 07, 2016 at 07:26:03AM +0200, Holger Freyther wrote:
> not sure what to respond to the rest as it is long and goes into another area and I am very busy right now.
Ok -- I'll keep it simple until we see a need for more elaborate paging timeout semantics. So, I will actually keep the timeout on gsm_subscriber level for now, but will gladly change that when reasons to do so come up.
> > Also, could someone please explain the 20 slots and "give credit" semantics I mentioned above?
> the ip.access nanoBTS crashed with too much paging; it crashed before it could send out the watermark information that would have allowed us to throttle/stop paging. So the "slots" are a way to make proprietary equipment work.
Ok, so it does at most 20 pagings and then waits 5 seconds before firing the next 20.
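
A rough sketch of that throttling, in Python with hypothetical names (PagingThrottle, try_page). Only the numbers 20 and 5 seconds come from this thread; arming the refill timer at the moment the slots reach zero follows my reading of the code and is an assumption:

```python
class PagingThrottle:
    """Allow a burst of pagings, then pause before refilling the slots."""

    def __init__(self, slots=20, refill_after=5.0):
        self.max_slots = slots
        self.refill_after = refill_after
        self.available = slots
        self.refill_at = None            # time at which the slots are topped up

    def _maybe_refill(self, now):
        if self.refill_at is not None and now >= self.refill_at:
            self.available = self.max_slots
            self.refill_at = None

    def try_page(self, now):
        """Return True if a paging may be sent at time `now`."""
        self._maybe_refill(now)
        if self.available == 0:
            return False                 # throttled until the refill time
        self.available -= 1
        if self.available == 0:
            # Last slot used: schedule the refill ("give credit").
            self.refill_at = now + self.refill_after
        return True
```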
BTW, re give_credit(): the meaning of "to give credit to someone" is more like acknowledging that someone has done something well (Anerkennung geben). That's why I was unsure about its purpose.
~Neels
On 08 May 2016, at 17:02, Neels Hofmeyr nhofmeyr@sysmocom.de wrote:
> On Sat, May 07, 2016 at 07:26:03AM +0200, Holger Freyther wrote:
> > not sure what to respond to the rest as it is long and goes into another area and I am very busy right now.
> Ok -- I'll keep it simple until we see a need for more elaborate paging timeout semantics. So, I will actually keep the timeout on gsm_subscriber level for now, but will gladly change that when reasons to do so come up.
I think I understand your point of view now. Without the GSM paging code there is no component that will inform the SMS/Call code about timeouts. From this point of view the GSM subscriber is a good place to put this timeout.
Have you considered removing the "timeout" from the lower GSM paging code? The interface would be start, cancel and only success would be signaled? Both the MSC paging and BSC paging timeout could be fed from the same timeout value?
> Ok, so it does at most 20 pagings and then waits 5 seconds before firing the next 20.
> BTW, re give_credit(): the meaning of "to give credit to someone" is more like acknowledging that someone has done something well (Anerkennung geben). That's why I was unsure about its purpose.
top-up? refill? resume paging?
On Sun, May 08, 2016 at 05:13:51PM +0200, Holger Freyther wrote:
> Have you considered removing the "timeout" from the lower GSM paging code? The interface would be start, cancel and only success would be signaled? Both the MSC paging and BSC paging timeout could be fed from the same timeout value?
So the MSC tells the BSC to start and stop paging, such that the T3113 is in the MSC? I'd expect the network design to be intended otherwise, to reduce the amount of work towards the CN (handle as many details as possible in BSC).
And we want a way that works both with 2G and 3G. Can we tell an RNC to start and cancel paging? Not that I know of ...?
Another thing: I guess that osmo-bsc should not change behavior with respect to 3rd party MSCs, right?
> top-up? refill? resume paging?
/me favors 'resume_paging' but whether we should spend time changing that is on another page.
~Neels
On 09 May 2016, at 11:49, Neels Hofmeyr nhofmeyr@sysmocom.de wrote:
> On Sun, May 08, 2016 at 05:13:51PM +0200, Holger Freyther wrote:
> > Have you considered removing the "timeout" from the lower GSM paging code? The interface would be start, cancel and only success would be signaled? Both the MSC paging and BSC paging timeout could be fed from the same timeout value?
> So the MSC tells the BSC to start and stop paging, such that the T3113 is in the MSC? I'd expect the network design to be intended otherwise, to reduce the amount of work towards the CN (handle as many details as possible in BSC).
No, the MSC just commands the paging. But it has an internal timer; when that timer expires the MSC gives up, and even if a paging response arrives afterwards it will not handle it anymore.
> And we want a way that works both with 2G and 3G. Can we tell an RNC to start and cancel paging? Not that I know of ...?
Neither. So it makes sense to have one timeout in the lower layers and one in the MSC/subscriber layer. Only the subscriber layer's timeout should trigger the callback. That is my opinion for the long-run architecture.
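
To make the two-timer idea concrete, here is a minimal sketch in Python. The names (LowerPaging, MscPaging, tick, paging_response) and the timeout values are hypothetical. The lower layer silently stops paging when its own timer runs out, and only the MSC/subscriber-level timer invokes the callback. A paging response that arrives after the MSC timer has expired is ignored:

```python
class LowerPaging:
    """Lower layer: pages until its own timeout, never reports failure."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.deadline = None

    def start(self, now):
        self.deadline = now + self.timeout

    def active(self, now):
        return self.deadline is not None and now < self.deadline


class MscPaging:
    """MSC/subscriber layer: owns the timer that triggers the callback."""

    def __init__(self, lower, timeout, callback):
        self.lower = lower
        self.timeout = timeout
        self.callback = callback         # called with "success" or "expired"
        self.deadline = None             # None means no paging is pending

    def start(self, now):
        self.deadline = now + self.timeout
        self.lower.start(now)            # just command the paging

    def tick(self, now):
        if self.deadline is not None and now >= self.deadline:
            self.deadline = None
            self.callback("expired")

    def paging_response(self, now):
        self.tick(now)                   # a response at or after the deadline is late
        if self.deadline is None:
            return False                 # gave up already: ignore the response
        self.deadline = None
        self.callback("success")
        return True
```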
> Another thing: I guess that osmo-bsc should not change behavior with respect to 3rd party MSCs, right?
right.
> > top-up? refill? resume paging?
> /me favors 'resume_paging' but whether we should spend time changing that is on another page.
true.