Gentlemen,

While upgrading to osmo-bts-trx_0.8.1.199.5c93_armhf.deb solved most of the problem, the main
issue seems to remain. The latest version reduced the failure rate from 1-3 failures per hour
to one failure after 18 hours, but did not eliminate the problem, i.e. a spinning trx-uhd.
I enclose a snippet from the console:

Thu Dec 27 18:53:00 2018 DMAIN <0000> Transceiver.cpp:1005 [tid=3007431760] new latency: 7:33599 (underrun 1:1683315 vs 6:1683313)
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DMAIN <0000> Transceiver.cpp:1005 [tid=3007431760] new latency: 7:33600 (underrun 6:1683326 vs 1:1683325)
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DMAIN <0000> Transceiver.cpp:1005 [tid=3007431760] new latency: 7:33601 (underrun 5:1683338 vs 6:1683336)
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DDEV <0002> UHDDevice.cpp:1319 [tid=2997609552] Packet loss between host and device at 103454 sec.
Thu Dec 27 18:53:00 2018 DMAIN <0000> Transceiver.cpp:1005 [tid=3007431760] new latency: 7:33602 (underrun 6:1683350 vs 5:1683348)

As you can see, this is very similar to what happened when running osmo-bts-trx_0.8.1.194.8564_armhf.deb,
but this time it only occurred after 18 hours of seemingly stable operation.
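To compare such console snippets between versions, the pattern can be summarized by counting how many "Packet loss" lines appear before each "new latency" line. The following is only a minimal sketch: the function name summarize is my own invention, and it assumes the log lines have exactly the format shown in the snippet above.

```python
import re

# Matches the "new latency: 7:33599" part of a Transceiver.cpp log line.
LATENCY_RE = re.compile(r"new latency: (\d+):(\d+)")
# Text of the UHDDevice.cpp packet-loss message, as it appears in the snippet.
LOSS_MARKER = "Packet loss between host and device"


def summarize(lines):
    """Return a list of (latency, packet_loss_count) tuples.

    Each tuple pairs a "new latency" value with the number of packet-loss
    lines that were logged since the previous "new latency" line.
    Hypothetical helper, not part of osmo-trx.
    """
    summary = []
    losses = 0
    for line in lines:
        if LOSS_MARKER in line:
            losses += 1
            continue
        match = LATENCY_RE.search(line)
        if match:
            summary.append((f"{match.group(1)}:{match.group(2)}", losses))
            losses = 0
    return summary
```

It could be fed a saved console log, e.g. summarize(open("trx-console.log")), where the file name is just a placeholder. A steadily incrementing latency with a handful of packet-loss lines in between would match the spinning behaviour described here.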

syslog says:

Dec 27 18:53:37 localhost osmo-bts-trx[2166]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
Dec 27 18:53:39 localhost osmo-bts-trx[2166]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
Dec 27 18:53:41 localhost osmo-bts-trx[2166]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
Dec 27 18:53:43 localhost osmo-bts-trx[2166]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)

So bts-trx and trx-uhd have lost sync with each other.

Just restarting trx-uhd resolved the situation. If bts-trx had timed out, it would have restarted,
and I assume the whole thing would have recovered.

Next time it stops, I will kill -9 bts-trx. It should restart automatically, and I will see whether
it matters WHICH process is restarted. I earlier believed the problem was in trx-uhd, but replacing
bts-trx was the factor that got us to "almost stable".

I will report my findings. If you have any more intelligent suggestions on how to narrow this down, please go ahead.

Regards,

Gullik



On 2018-12-27 10:15, Gullik Webjorn wrote:
I have had 100% stability overnight. However, osmo-bts-trx logs the following every 5-10 minutes.

There has been no "activity" on the GSM network during the night.

Gullik

Dec 27 09:08:57 localhost osmo-bts-trx[20109]: #033[0;m#033[1;35m<0000> rsl.c:741 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN ACT ACK
Dec 27 09:09:00 localhost osmo-bts-trx[20109]: #033[0;m#033[1;35m<0000> rsl.c:720 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN REL ACK
Dec 27 09:10:12 localhost osmo-bts-trx[20109]: #033[0;m#033[1;35m<0000> rsl.c:741 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN ACT ACK
Dec 27 09:10:15 localhost osmo-bts-trx[20109]: #033[0;m#033[1;35m<0000> rsl.c:720 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN REL ACK
Dec 27 09:12:03 localhost osmo-bts-trx[20109]: #033[0;m#033[1;35m<0000> rsl.c:741 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN ACT ACK
Dec 27 09:12:03 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:920 Store content res. (dl=0xb67ec498)
Dec 27 09:12:04 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 09:12:04 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 09:12:05 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 09:12:06 localhost osmo-bts-trx[20109]: #033[0;m#033[1;35m<0000> rsl.c:720 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN REL ACK

Dec 27 09:17:27 localhost osmo-bts-trx[20109]: #033[0;m#033[1;35m<0000> rsl.c:741 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN ACT ACK
Dec 27 09:17:27 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:920 Store content res. (dl=0xb67ec498)
Dec 27 09:17:28 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 09:17:29 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 09:17:29 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 09:17:30 localhost osmo-bts-trx[20109]: #033[0;m#033[1;35m<0000> rsl.c:720 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN REL ACK

Dec 27 09:32:46 localhost osmo-bts-trx[20109]: #033[0;m#033[1;35m<0000> rsl.c:741 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN ACT ACK
Dec 27 09:32:46 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:920 Store content res. (dl=0xb67ec498)
Dec 27 09:32:48 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 09:32:48 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 09:32:48 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 09:32:49 localhost osmo-bts-trx[20109]: #033[0;m#033[1;35m<0000> rsl.c:720 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN REL ACK

Dec 27 09:42:04 localhost osmo-bts-trx[20109]: #033[0;m#033[1;35m<0000> rsl.c:741 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN ACT ACK
Dec 27 09:42:05 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:920 Store content res. (dl=0xb67ec498)
Dec 27 09:42:06 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 09:42:06 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 09:42:06 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 09:42:08 localhost osmo-bts-trx[20109]: #033[0;m#033[1;35m<0000> rsl.c:720 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN REL ACK

Dec 27 09:47:32 localhost osmo-bts-trx[20109]: #033[0;m#033[1;35m<0000> rsl.c:741 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN ACT ACK
Dec 27 09:47:32 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:920 Store content res. (dl=0xb67ec498)
Dec 27 09:47:33 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 09:47:34 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 09:47:34 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 09:47:35 localhost osmo-bts-trx[20109]: #033[0;m#033[1;35m<0000> rsl.c:720 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN REL ACK

Dec 27 10:02:51 localhost osmo-bts-trx[20109]: #033[0;m#033[1;35m<0000> rsl.c:741 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN ACT ACK
Dec 27 10:02:51 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:920 Store content res. (dl=0xb67ec498)
Dec 27 10:02:52 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 10:02:53 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 10:02:53 localhost osmo-bts-trx[20109]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb67ec498 state LAPD_STATE_MF_EST)
Dec 27 10:02:54 localhost osmo-bts-trx[20109]: #033[0;m#033[1;35m<0000> rsl.c:720 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN REL ACK
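
To see at a glance which channel activations are affected, the syslog excerpt can be grouped into ACT/REL pairs with a count of the N(S) sequence errors in between. This is only a rough sketch: count_sequence_errors is a made-up helper, and it assumes the traditional 15-character syslog timestamp at the start of each line and a single SDCCH activation at a time, as in the excerpt above.

```python
def count_sequence_errors(lines):
    """Group syslog lines into CHAN ACT ACK .. CHAN REL ACK sessions.

    Returns a list of dicts with the start and end timestamps of each
    session and the number of "N(S) sequence error" lines seen in between.
    Hypothetical helper for illustration, not part of osmo-bts.
    """
    sessions = []
    current = None
    for line in lines:
        if "Tx CHAN ACT ACK" in line:
            # The first 15 characters are the syslog timestamp, e.g. "Dec 27 09:12:03".
            current = {"start": line[:15], "errors": 0}
        elif "Tx CHAN REL ACK" in line and current is not None:
            current["end"] = line[:15]
            sessions.append(current)
            current = None
        elif "N(S) sequence error" in line and current is not None:
            current["errors"] += 1
    return sessions
```

Applied to the excerpt above, the first two activations (09:08:57 and 09:10:12) contain no sequence errors, while each of the six later ones contains three.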