Hi!
There is a more or less common pattern running through a lot of the problems that we've been debugging recently, e.g.:
* ip.access BTS OML initialization sometimes doesn't complete
* BS-11 initialization problems (no BCCH content)
When I started to write OpenBSC (at that time still called bs11-abis), I didn't know much about GSM 12.21 and was simply starting. As there is no explicit information in 12.21 or 08.59 on whether or not there can be multiple outstanding OML requests on one OML link, I simply assumed that I could send a number of commands without having to wait for their ACK.
This made it easy to make quick progress early on in the project.
However, now we are facing some problems, e.g. we send RSL messages before OML has completed its initialization. I suspect the BTSs don't like that in a number of cases.
Possible solutions to this:
1) Always store the last OML command that we have issued and wait for an explicit ACK/NACK before sending the next one. This means all msgb's sent down by abis_nm_sendmsg() will end up in a queue which is dequeued by ACK/NACK responses before sending the next command.
It's simple to implement and probably solves some of the ordering issues. However, the caller has no idea when the queued operation will actually complete (or whether it will complete at all).
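A minimal sketch of option 1, assuming one queue per OML link. All names here (struct oml_queue, oml_queue_send(), oml_queue_on_response(), oml_tx_record()) are hypothetical and do not exist in OpenBSC, and a plain message id stands in for a msgb. At most one command is on the wire at any time; the next one is only transmitted once an ACK or NACK has arrived for the previous one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OML_QUEUE_MAX 16

/* Hypothetical stand-in for a msgb: just a message id. */
struct oml_cmd {
	int id;
};

/* Hypothetical per-link queue; not an OpenBSC structure. */
struct oml_queue {
	struct oml_cmd pending[OML_QUEUE_MAX]; /* FIFO ring buffer */
	size_t head;      /* index of the oldest queued command */
	size_t count;     /* number of queued commands */
	bool in_flight;   /* true while waiting for an ACK/NACK */
	void (*tx)(const struct oml_cmd *cmd); /* actual transmit hook */
};

/* Example transmit hook that only records what went on the wire. */
static int oml_tx_log[64];
static size_t oml_tx_count;

static void oml_tx_record(const struct oml_cmd *cmd)
{
	if (oml_tx_count < 64)
		oml_tx_log[oml_tx_count++] = cmd->id;
}

static void oml_queue_init(struct oml_queue *q,
			   void (*tx)(const struct oml_cmd *cmd))
{
	q->head = 0;
	q->count = 0;
	q->in_flight = false;
	q->tx = tx;
}

/* Transmit the oldest queued command if nothing is outstanding. */
static void oml_queue_kick(struct oml_queue *q)
{
	if (q->in_flight || q->count == 0)
		return;
	q->in_flight = true;
	q->tx(&q->pending[q->head]);
}

/* Called instead of transmitting directly. Returns false if the queue is full. */
static bool oml_queue_send(struct oml_queue *q, struct oml_cmd cmd)
{
	if (q->count == OML_QUEUE_MAX)
		return false;
	q->pending[(q->head + q->count) % OML_QUEUE_MAX] = cmd;
	q->count++;
	oml_queue_kick(q);
	return true;
}

/* Called when an ACK or NACK arrives for the outstanding command. */
static void oml_queue_on_response(struct oml_queue *q)
{
	if (!q->in_flight)
		return; /* unsolicited response, nothing to dequeue */
	assert(q->count > 0);
	q->head = (q->head + 1) % OML_QUEUE_MAX;
	q->count--;
	q->in_flight = false;
	oml_queue_kick(q);
}
```

The sketch also shows the drawback mentioned above: oml_queue_send() returns immediately, so the caller learns nothing about the outcome unless something like a per-command completion callback is added.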
2) Implement some kind of blocking OML layer that will block the caller of abis_nm_sendmsg() until the ACK/NACK has been received. This means that writing the OML code will be much more natural, i.e. if the BTS returns an error, the OML code can deal with it at exactly the message that has caused the error.
However, this would imply that OML bringup would have to spawn one thread for each BTS that is about to be brought up, as we cannot block the full BSC just because one of the BTSs is reinitialized.
This would be a big difference from the existing non-blocking asynchronous single-process + single-thread model that we have, and there is probably a bit of thinking required about how this would affect concurrent accesses to our data structures. As OML is fairly independent from everything else, I don't think it will be much of an issue, though.
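A rough sketch of what option 2 could look like with POSIX threads, assuming one bring-up thread per BTS that rendezvouses with the main loop through a mutex and a condition variable. All names (struct oml_link, oml_send_blocking(), oml_deliver_result(), bts_bringup_thread()) are made up for illustration, command ids stand in for real OML messages, and nothing is actually transmitted.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

enum oml_result { OML_ACK, OML_NACK };

/* Hypothetical rendezvous between a per-BTS thread and the main loop. */
struct oml_link {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int outstanding;   /* id of the command awaiting a response, 0 if none */
	bool have_result;  /* true once the response has been delivered */
	enum oml_result result;
};

static void oml_link_init(struct oml_link *l)
{
	pthread_mutex_init(&l->lock, NULL);
	pthread_cond_init(&l->cond, NULL);
	l->outstanding = 0;
	l->have_result = false;
	l->result = OML_ACK;
}

/* Per-BTS thread: "send" cmd_id and block until its ACK/NACK arrives. */
static enum oml_result oml_send_blocking(struct oml_link *l, int cmd_id)
{
	enum oml_result r;

	assert(cmd_id != 0);
	pthread_mutex_lock(&l->lock);
	l->outstanding = cmd_id; /* a real version would transmit here */
	l->have_result = false;
	pthread_cond_broadcast(&l->cond);
	while (!l->have_result)
		pthread_cond_wait(&l->cond, &l->lock);
	r = l->result;
	l->outstanding = 0;
	l->have_result = false;
	pthread_mutex_unlock(&l->lock);
	return r;
}

/* Demo helper for the main side: wait for a command and return its id.
 * A select()-based main loop would not block like this. */
static int oml_wait_outstanding(struct oml_link *l)
{
	int id;

	pthread_mutex_lock(&l->lock);
	while (l->outstanding == 0 || l->have_result)
		pthread_cond_wait(&l->cond, &l->lock);
	id = l->outstanding;
	pthread_mutex_unlock(&l->lock);
	return id;
}

/* Main loop: hand the BTS's response to the blocked thread. */
static void oml_deliver_result(struct oml_link *l, enum oml_result r)
{
	pthread_mutex_lock(&l->lock);
	l->result = r;
	l->have_result = true;
	pthread_cond_broadcast(&l->cond);
	pthread_mutex_unlock(&l->lock);
}

struct bringup {
	struct oml_link *link;
	int acked;       /* number of commands that were ACKed */
	int failed_cmd;  /* id of the NACKed command, 0 if none */
};

/* Bring-up reads top to bottom; the error is handled at the failing command. */
static void *bts_bringup_thread(void *arg)
{
	struct bringup *b = arg;

	for (int cmd = 1; cmd <= 3; cmd++) {
		if (oml_send_blocking(b->link, cmd) == OML_NACK) {
			b->failed_cmd = cmd;
			return NULL;
		}
		b->acked++;
	}
	return NULL;
}
```

The bring-up code deals with a NACK exactly where it happens, which is the appeal of this option. The cost is that every structure shared with the main loop needs a lock like the one in struct oml_link.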
At the moment I'm slightly more inclined to actually go for '2', since it is a cleaner solution from my point of view.
What do you think?
Regards, Harald
On 06/21/2010 11:54 PM, Harald Welte wrote:
> At the moment I'm slightly more inclined to actually go for '2', since it is a cleaner solution from my point of view.
> What do you think?
The consequences of threading are big. Since we would be doing OML while the BTS might go away (bsc_unregister_fd), we need locking in quite a few places, including:
- msgb_enqueue/msgb_dequeue (or shortly before)
- bsc_unregister_fd (combined with thread cancellation for the OML threads)
And we would always have an OML thread per BTS? And an OML msg with 0xff, 0xff, 0xff would go to the BTS holding the BCCH?
I see how the blocking semantics of an opstart and such are very appealing: we do not need to worry about the queue, as the kernel will queue messages for us.
On 06/22/2010 09:35 AM, Holger Hans Peter Freyther wrote:
> On 06/21/2010 11:54 PM, Harald Welte wrote:
>> At the moment I'm slightly more inclined to actually go for '2', since it is a cleaner solution from my point of view.
>> What do you think?
> I see how the blocking semantics of an opstart and such are very appealing: we do not need to worry about the queue, as the kernel will queue messages for us.
Today I was searching for coroutines and tasklets for C again, and I wonder whether we could use them. We should not rely on makecontext() and such, as it is not available on ARM. Another option would be to use fork() and have a socketpair between the OpenBSC OML process and OpenBSC proper, forwarding OML messages both ways.
We could kill the process whenever the BTS is gone, and create it once it is up. This would also mean config changes would be handled on every BTS reconnect.
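The fork-plus-socketpair idea could be sketched roughly as follows. The function names (oml_helper_spawn(), oml_helper_main(), oml_helper_kill()) are invented for this example, and the child simply echoes each message back where a real helper would run the blocking OML bring-up. SOCK_DGRAM is an assumption made here so that message boundaries are preserved across the socketpair.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child process: may block freely, since it only serves a single BTS.
 * Placeholder logic: echo every message back to the parent. */
static void oml_helper_main(int fd)
{
	unsigned char buf[256];
	ssize_t n;

	while ((n = read(fd, buf, sizeof(buf))) > 0) {
		if (write(fd, buf, (size_t)n) != n)
			break;
	}
	_exit(0);
}

/* Called when the BTS comes up. Returns the child's pid (or -1 on error)
 * and stores the parent's end of the socketpair in *fd_out. */
static pid_t oml_helper_spawn(int *fd_out)
{
	int sv[2];
	pid_t pid;

	assert(fd_out != NULL);
	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (pid == 0) {
		close(sv[0]);
		oml_helper_main(sv[1]); /* never returns */
	}

	close(sv[1]);
	*fd_out = sv[0];
	return pid;
}

/* Called when the BTS is gone: tear the helper down and reap it. */
static void oml_helper_kill(pid_t pid, int fd)
{
	close(fd);
	kill(pid, SIGTERM);
	waitpid(pid, NULL, 0);
}
```

The parent's end of the socketpair is an ordinary file descriptor, so it could be registered with the existing select loop like any other fd, and no locking is needed because the two processes share no memory.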
And I felt lazy and created zecke/nm-long-queue which appears to work for BTS bringup and ipaccess-config -r -o IP BTS...
Hi Sylvain,
On Tue, Jun 22, 2010 at 07:39:01AM +0200, Sylvain Munaut wrote:
>> At the moment I'm slightly more inclined to actually go for '2', since it is a cleaner solution from my point of view.
>> What do you think?
> What about things such as protothreads? (just a thought, I never actually used them or any similar libs).
I once did some investigation on them, but at that time decided that they wouldn't really be useful for the kind of problems we're facing in both OpenBSC and OsmocomBB. However, I don't recall the exact reason, so let me review this again...
> I once did some investigation on them, but at that time decided that they wouldn't really be useful for the kind of problems we're facing in both OpenBSC and OsmocomBB. However, I don't recall the exact reason, so let me review this again...
Protothreads is just an example because I didn't recall the other names, but as Holger pointed out, it's the whole tasklet / coroutines stuff that might be worth considering.
One area where I could see them being useful, if we can make them 'transparent' enough, would be the HLR, where all the HLR calls could still appear as blocking in the gsm48 state machine even though they wouldn't really be.
Sylvain