BSC / MSC volatile state / restart handling


Harald Welte laforge at gnumonks.org
Fri Oct 12 06:55:03 UTC 2018


Hi Keith,

On Thu, Oct 11, 2018 at 09:59:08PM +0200, Keith wrote:
> The issue is about restarting MSC or BSC (or both).

That's something that classic telecom doesn't typically consider very well.

The functional specs assume that restarting a network element,
particularly a core network element, is a super rare occasion, and
hence nothing that needs to be considered as a general problem.

> I still really haven't looked at the split setup enough yet, 

OsmoNITB shouldn't really be any different.  Is it?

> Is there a plan to implement some kind of non-volatile state?

What you're referring to is basically the loss of VLR information. The 3GPP
specs explicitly consider it volatile and non-persistent.  Holger and I
were brainstorming about this some time ago, and we came up with the idea
of using a System V shared memory segment for the actual VLR data.  SysV
SHM has the nice property that it's regular mapped memory, but it is independent
of processes, i.e. you can restart a process while keeping whatever is in
that shared memory segment.
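
To illustrate the property described above, here is a minimal sketch of
attaching to a SysV SHM segment that outlives the process.  It is not taken
from any Osmocom code: struct demo_vlr_state, the demo_vlr_* functions and
the key value used below are made up for illustration, and the zero-fill of
a new segment is the behaviour documented for Linux.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>

/* Hypothetical stand-in for VLR data; not the real OsmoMSC structures. */
struct demo_vlr_state {
	unsigned int attach_count;    /* bumped on every attach */
	unsigned int num_subscribers; /* example payload */
};

/* Attach to the segment for 'key', creating it if it does not exist yet.
 * On Linux a newly created SysV segment is zero-filled by the kernel.
 * Returns NULL on error; *created is 1 for a new segment, 0 for a reused one. */
static struct demo_vlr_state *demo_vlr_attach(key_t key, int *created)
{
	assert(created != NULL);
	int shmid = shmget(key, sizeof(struct demo_vlr_state),
			   IPC_CREAT | IPC_EXCL | 0600);
	*created = (shmid >= 0);
	if (shmid < 0) {
		if (errno != EEXIST)
			return NULL;
		/* Segment left over from an earlier process: reuse it. */
		shmid = shmget(key, sizeof(struct demo_vlr_state), 0600);
		if (shmid < 0)
			return NULL;
	}
	void *p = shmat(shmid, NULL, 0);
	if (p == (void *) -1)
		return NULL;
	struct demo_vlr_state *st = p;
	st->attach_count++;
	return st;
}

/* Detach only; the segment and its contents stay alive in the kernel. */
static int demo_vlr_detach(struct demo_vlr_state *st)
{
	return shmdt(st);
}

/* Explicitly discard the data by marking the segment for removal. */
static int demo_vlr_destroy(key_t key)
{
	int shmid = shmget(key, 0, 0600);
	if (shmid < 0)
		return -1;
	return shmctl(shmid, IPC_RMID, NULL);
}
```

A process would call demo_vlr_attach() at startup with some fixed key.  If
'created' comes back as 0, the data written by the previous process is still
there.  Nothing is removed until demo_vlr_destroy() (or ipcrm) is used, or
the machine reboots.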

The devil is of course in the detail:

1) you may not be able to map it to the same address after each restart?
   Needs further investigation and might require some pointer fix-up, or
   the use of relative/offset addressing rather than absolute pointers.

2) what do you do if you're actually upgrading the software and the VLR
   related structures have been modified?  There either needs to be an
   explicit conversion function, or at least a mechanism to detect this
   safely and discard all the data in such situations.

3) if the restart is a crash: What if some corrupted VLR data was actually
   the cause of the crash?  In that case you'd end up in a re-start loop.
   You'd have persistent crashes, rather than a one-off crash with recovery.
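
One possible way to approach the three points above, sketched under my own
assumptions: links are stored as offsets from the start of the region rather
than as pointers (1), a header carries a magic, a layout version and the
structure size so stale data can be detected and discarded (2), and a counter
of starts without a clean shutdown limits how often possibly corrupted data
is reused (3).  All demo_* names, the constants and the "discard after two
unclean starts" policy are hypothetical.  The code works on any memory
region, for example one returned by shmat().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAGIC       0x564c5231u /* hypothetical marker value */
#define DEMO_VERSION     3u          /* bump on every layout change */
#define DEMO_MAX_SUBS    64
#define DEMO_MAX_UNCLEAN 2           /* reuse data after at most this many unclean starts */

struct demo_sub {
	char imsi[16];
	uint32_t next_off; /* offset from region start to next entry, 0 = end */
};

struct demo_hdr {
	uint32_t magic, version, layout_size;
	uint32_t unclean_starts; /* starts since the last clean shutdown */
	uint32_t first_off;      /* offset of first list entry, 0 = empty */
	uint32_t num_used;
};

struct demo_region {
	struct demo_hdr hdr;
	struct demo_sub subs[DEMO_MAX_SUBS];
};

enum demo_open_result { DEMO_FRESH, DEMO_REUSED, DEMO_DISCARDED };

static void demo_reset(struct demo_region *r)
{
	memset(r, 0, sizeof(*r));
	r->hdr.magic = DEMO_MAGIC;
	r->hdr.version = DEMO_VERSION;
	r->hdr.layout_size = sizeof(*r);
}

/* Call once at startup: decide whether the existing data can be trusted. */
static enum demo_open_result demo_open(struct demo_region *r)
{
	enum demo_open_result res = DEMO_REUSED;
	if (r->hdr.magic == 0) {                        /* zero-filled: new */
		demo_reset(r);
		res = DEMO_FRESH;
	} else if (r->hdr.magic != DEMO_MAGIC ||
		   r->hdr.version != DEMO_VERSION ||
		   r->hdr.layout_size != sizeof(*r) ||  /* point 2 */
		   r->hdr.unclean_starts >= DEMO_MAX_UNCLEAN) { /* point 3 */
		demo_reset(r);
		res = DEMO_DISCARDED;
	}
	r->hdr.unclean_starts++; /* cleared again by demo_close() */
	return res;
}

/* Call on orderly shutdown only; a crash leaves the counter raised. */
static void demo_close(struct demo_region *r)
{
	r->hdr.unclean_starts = 0;
}

/* Point 1: turn a stored offset into a pointer valid in this mapping. */
static struct demo_sub *demo_sub_at(struct demo_region *r, uint32_t off)
{
	size_t start = offsetof(struct demo_region, subs);
	if (off < start || off + sizeof(struct demo_sub) > sizeof(*r) ||
	    (off - start) % sizeof(struct demo_sub) != 0)
		return NULL; /* also covers off == 0, the end-of-list marker */
	return (struct demo_sub *)((char *)r + off);
}

/* Take the next unused slot and link it at the head of the list. */
static struct demo_sub *demo_sub_add(struct demo_region *r, const char *imsi)
{
	if (r->hdr.num_used >= DEMO_MAX_SUBS)
		return NULL;
	struct demo_sub *s = &r->subs[r->hdr.num_used++];
	strncpy(s->imsi, imsi, sizeof(s->imsi) - 1);
	s->imsi[sizeof(s->imsi) - 1] = '\0';
	s->next_off = r->hdr.first_off;
	r->hdr.first_off = (uint32_t)((char *)s - (char *)r);
	return s;
}

/* Walk the offset-linked list; the bound guards against corrupted cycles. */
static unsigned int demo_count(struct demo_region *r)
{
	unsigned int n = 0;
	struct demo_sub *s = demo_sub_at(r, r->hdr.first_off);
	while (s && n < DEMO_MAX_SUBS) {
		n++;
		s = demo_sub_at(r, s->next_off);
	}
	return n;
}
```

Because only offsets are stored, the list stays valid when the region ends
up at a different address.  With this particular policy the data survives a
single crash, but is thrown away once two starts in a row were not preceded
by demo_close().  A checksum over the entries would be a further safeguard
and is not shown here.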

I did some research on the web at that time (maybe 2 years ago) but
unfortunately couldn't find any library/tool/infrastructure for having
persistent data in SysV SHM, and also no other FOSS programs that did
so.  Maybe I didn't look closely enough?  To me, it seems like the most
obvious solution to persist state across crashes/restarts of C programs
on unix-type systems.

We explicitly don't want to use some kind of database system, as the VLR
data needs to be accessed all over the code
directly/synchronously/non-blockingly.  We cannot wait for it to be
retrieved from somewhere, the way it is done with HLR data.

> In the case of restarting the MSC with two phones connected, of course
> the phones don't notice anything. One now needs to attempt a call setup
> at least 3 times, including a call setup attempt from the "callee" phone
> before we can even call it.

The alternative is to wait for any location update, either due to the
periodic location update timer expiring, or due to the phones actually
moving geographic location.

> I tried restarting both BSC and MSC, in various orders, and I did not
> get a satisfactory result. Of course, restarting the BSC restarts
> osmo-bts which causes (some) phones to notice the temporary loss of
> BCCH, but of course no LUR or anything as LAC doesn't change, so there's
> some state that is not getting set right someplace on restart, as I
> said. both phones need to interact before one of them can be called.

A BSC restart will always lose all active connections/channels/calls,
and I think there's nothing wrong with that.  Persisting that state
makes little sense, as the phones will all have closed their radio
channels at the time your BSC recovers.

-- 
- Harald Welte <laforge at gnumonks.org>           http://laforge.gnumonks.org/
============================================================================
"Privacy in residential applications is a desirable marketing option."
                                                  (ETSI EN 300 175-7 Ch. A6)


