On 18 Aug 2016, at 00:28, Keith <keith(a)rhizomatica.org> wrote:
Good Morning Keith,
* List
libosmocore/libosmo-abis/../OpenBSC/osmo-bts git commit (+patches if it makes sense)
Unknown, as I didn't build it, but we could dispense with this, and I
can just compile from a known good/recent commit and go from there.
Well, dpkg -l and the changelogs should give you an idea of what Rhizomatica has deployed.
But without knowing which versions are in place, it is extremely difficult to investigate.
The same goes for the BTS and firmware version (opkg list_installed). E.g. it wouldn't
make sense to chase issues in old firmware releases.
> * Attach both BTS and NITB config
Missing. This is a very crucial part of the puzzle. My assumptions right now:
* Do you use depends-on-bts to add a dependency between BTS #1 and BTS #2 of the
sysmoBTS 2050?
* PCAP file
(if it doesn't include personal data, otherwise try to pseudonymize it or share out of
band).
I think specifically a capture of the point at which this kind of thing
starts to happen would be the interesting part, although I have to sit
around for a while to get it. Not sure how to then filter the resulting
pcap down to just that part, but I'll figure it out.
A good starting point would be to filter for "bts=0, trx=0, ts=1, ss=0" in the RSL packets.
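The same selection can also be applied to the NITB log text, since every RSL log entry carries a "(bts=...,trx=...,ts=...,ss=...)" tag. Below is a minimal sketch in C, assuming the log is available as a plain text file with one entry per line. The names line_matches_lchan and filter_lchan are made up for illustration and are not part of OpenBSC.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the log line carries the tag
 * "(bts=B,trx=T,ts=S,ss=X)" for the given channel, 0 otherwise. */
int line_matches_lchan(const char *line, int bts, int trx, int ts, int ss)
{
	char tag[64];

	assert(line != NULL);
	snprintf(tag, sizeof(tag), "(bts=%d,trx=%d,ts=%d,ss=%d)", bts, trx, ts, ss);
	return strstr(line, tag) != NULL;
}

/* Hypothetical helper: copies all matching lines from 'in' to 'out'
 * and returns how many lines matched. Assumes lines shorter than
 * 1024 bytes; longer lines are processed in pieces. */
size_t filter_lchan(FILE *in, FILE *out, int bts, int trx, int ts, int ss)
{
	char line[1024];
	size_t matched = 0;

	assert(in != NULL && out != NULL);
	while (fgets(line, sizeof(line), in) != NULL) {
		if (line_matches_lchan(line, bts, trx, ts, ss)) {
			fputs(line, out);
			matched++;
		}
	}
	return matched;
}
```

Calling filter_lchan(stdin, stdout, 0, 0, 1, 0) from a small main() would print only the entries for the channel in question. If the log was re-wrapped by a mail client, continuation lines carry no tag and are not matched.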
<0004> abis_rsl.c:630 (bts=1,trx=0,ts=1,ss=0) is back in operation.
<0004> abis_rsl.c:615 (bts=0,trx=0,ts=1,ss=0) DEACTivate SACCH CMD
<0004> abis_rsl.c:1102 (bts=0,trx=0,ts=1,ss=0): MEAS RES for inactive channel
<0004> abis_rsl.c:1330 (bts=0,trx=0,ts=1,ss=0) SACCH deactivation timeout.
<0004> abis_rsl.c:661 (bts=0,trx=0,ts=1,ss=0) RF Channel Release CMD due error 1
<0004> abis_rsl.c:1102 (bts=0,trx=0,ts=1,ss=0): MEAS RES for inactive channel
<0004> abis_rsl.c:223 (bts=0,trx=0,ts=1,ss=0) Timeout during deactivation! Marked as broken.
So the SACCH didn't detect a signal, which led to error handling and a channel release,
and then there was no answer. If you enable timestamps in the log output, we would see how
much time has passed. But it must be at least four seconds.
Now, just as soon as I had said in a previous mail that I had not
seen it, those RSL log entries are followed by:
<0004> abis_rsl.c:717 (bts=0,trx=0,ts=1,ss=0) RF CHANNEL RELEASE ACK
<0004> abis_rsl.c:735 (bts=0,trx=0,ts=1,ss=0) CHAN REL ACK for broken channel. Releasing it.
So here it "repaired it":
do_lchan_free()
...
	} else {
		rsl_lchan_set_state(lchan, LCHAN_S_NONE);
	}
	lchan_free(lchan);
However, this follows in the log, and these channels remain
broken "forever":
<0004> abis_rsl.c:1198 (bts=0,trx=0,ts=1,ss=1) CHANNEL ACTIVATE ACK
<0004> abis_rsl.c:949 (bts=0,trx=0,ts=1,ss=1) CHAN ACT ACK for broken channel.
We are missing the activation in this log. But only non-broken channels should be
allocated. So it looks like the BTS (firmware) had enough? We can add the same "freeing"
to the channel_act_ack as well. But you should check whether:
* the delay is added by the network
* the delay is added by the BTS being too busy
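To make the idea of reusing the "freeing" for a late CHAN ACT ACK more concrete, here is a rough sketch in C. It is a toy model and not the OpenBSC code: all sketch_ names, the reduced set of states and the late_acks counter are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the lchan state handling. */
enum sketch_lchan_state {
	SKETCH_LCHAN_NONE,	/* free, may be allocated again */
	SKETCH_LCHAN_ACT_REQ,	/* activation sent, waiting for the ACK */
	SKETCH_LCHAN_ACTIVE,	/* activation acknowledged */
	SKETCH_LCHAN_REL_REQ,	/* release sent, waiting for the ACK */
	SKETCH_LCHAN_BROKEN	/* timed out, excluded from allocation */
};

struct sketch_lchan {
	enum sketch_lchan_state state;
	unsigned int late_acks;	/* number of recoveries from the broken state */
};

/* Shared recovery step: a late ACK shows that the BTS is still
 * responding, so the channel is made available again. */
static void sketch_recover_broken(struct sketch_lchan *lchan)
{
	lchan->state = SKETCH_LCHAN_NONE;
	lchan->late_acks++;
}

/* Models the existing behaviour: a release ACK for a broken channel frees it. */
void sketch_rx_rf_chan_rel_ack(struct sketch_lchan *lchan)
{
	assert(lchan != NULL);
	if (lchan->state == SKETCH_LCHAN_BROKEN) {
		sketch_recover_broken(lchan);
		return;
	}
	lchan->state = SKETCH_LCHAN_NONE;
}

/* Models the proposed change: an activation ACK for a broken channel
 * goes through the same recovery instead of being ignored. */
void sketch_rx_chan_act_ack(struct sketch_lchan *lchan)
{
	assert(lchan != NULL);
	if (lchan->state == SKETCH_LCHAN_BROKEN) {
		sketch_recover_broken(lchan);
		return;
	}
	if (lchan->state == SKETCH_LCHAN_ACT_REQ)
		lchan->state = SKETCH_LCHAN_ACTIVE;
}
```

In this model a broken channel that receives a late CHAN ACT ACK ends up in the free state, just like after a late release ACK. A real change would probably also have to release the channel towards the BTS so that both sides agree on its state. That part is left out here.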
holger
PS: With TCP it is not possible for something that has been written into the socket to
arrive out-of-order. It might be that the BTS has not written into it (but that
doesn't seem to be the case either).