osmo-uhd-trx


Tomcsányi, Domonkos domi at tomcsanyi.net
Sat Dec 22 12:32:39 UTC 2018


Hello Gullik,

If possible, please set up some kind of monitoring of key system parameters (load, memory etc.) so we can determine whether any irregularities occur before the issue.
It would also help us to see whether there is any regularity in the intervals at which the issue occurs.
I use Zabbix for this on my own servers, but I think it would be far too complicated for this purpose.
So any simple monitoring system that retains historic data (for the last, let’s say 24 hours) would be good.
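
A minimal sketch of such a simple monitor, assuming a Linux system where /proc/loadavg and /proc/meminfo (with a MemAvailable field) are available. The file name, the sampling interval and all function names are hypothetical and only for illustration:

```python
#!/usr/bin/env python3
"""Illustrative load/memory sampler that keeps about 24 hours of history."""
import csv
import os
import time

LOG_PATH = "sysmon.csv"    # hypothetical output file
INTERVAL_S = 10            # hypothetical sampling period in seconds
RETAIN_S = 24 * 3600       # keep roughly the last 24 hours


def parse_loadavg(text):
    """Return the 1-minute load average from /proc/loadavg content."""
    return float(text.split()[0])


def parse_mem_available_kb(text):
    """Return MemAvailable in kB from /proc/meminfo content, or None."""
    for line in text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    return None


def prune(rows, now, retain_s=RETAIN_S):
    """Drop rows whose timestamp (first column) is older than retain_s."""
    return [r for r in rows if now - float(r[0]) <= retain_s]


def sample():
    """Read one sample as [unix_time, load1, mem_available_kb] strings."""
    with open("/proc/loadavg") as f:
        load1 = parse_loadavg(f.read())
    with open("/proc/meminfo") as f:
        mem_kb = parse_mem_available_kb(f.read())
    return [f"{time.time():.0f}", f"{load1:.2f}", str(mem_kb)]


def append_and_prune(path, row, now):
    """Append one row and rewrite the file without expired rows."""
    rows = []
    if os.path.exists(path):
        with open(path, newline="") as f:
            rows = list(csv.reader(f))
    rows = prune(rows + [row], now)
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)


def run(path=LOG_PATH, interval_s=INTERVAL_S):
    """Sample forever; stop with Ctrl-C."""
    while True:
        append_and_prune(path, sample(), time.time())
        time.sleep(interval_s)
```

Calling run() in a spare terminal leaves a CSV of timestamp, 1-minute load and available memory that can be compared against the syslog timestamps of the next stop. The file is rewritten on every sample to keep the sketch short; on an SD card a longer interval may be preferable.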

Just my 2 cents...

Domi


On 22 Dec 2018, at 12:25, Gullik Webjorn <gullik.webjorn at corevalue.se> wrote:

> Hello Harald, long time.....
> 
> This is syslog for the time...
> 
> Dec 22 03:15:01 localhost CRON[25895]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> Dec 22 03:16:22 localhost osmo-bts-trx[18200]: #033[0;m<0007> l1sap.c:510 1338262/1009/16/22/46 Invalid condition detected: Frame difference is 1338262-1338205=57 > 1!
> Dec 22 03:17:01 localhost CRON[25911]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
> 
> Here trx uhd has logged lots of lost packets at 03:19:36, so 2 seconds before the next syslog entry, all within the same second, filling the terminal buffer, so I cannot see the loggings immediately before.
> 
> 
> 
> Dec 22 03:19:38 localhost osmo-bts-trx[18200]: #033[0;m#033[1;36m<0001> bts.c:250 Shutting down BTS 0, Reason No clock from osmo-trx
> 
> Dec 22 03:19:41 localhost osmo-bts-trx[18200]: #033[0;mShutdown timer expired
> Dec 22 03:19:41 localhost osmo-bts-trx[18200]: ((*))
> Dec 22 03:19:41 localhost osmo-bts-trx[18200]:   |
> Dec 22 03:19:41 localhost osmo-bts-trx[18200]:  / \ OsmoBTS
> Dec 22 03:19:41 localhost systemd[1]: osmo-bts-trx.service: Main process exited, code=exited, status=42/n/a
> Dec 22 03:19:41 localhost systemd[1]: osmo-bts-trx.service: Unit entered failed state.
> Dec 22 03:19:41 localhost systemd[1]: osmo-bts-trx.service: Failed with result 'exit-code'.
> Dec 22 03:19:43 localhost systemd[1]: osmo-bts-trx.service: Service hold-off time over, scheduling restart.
> Dec 22 03:19:43 localhost systemd[1]: Stopped Osmocom osmo-bts for osmo-trx.
> Dec 22 03:19:43 localhost systemd[1]: Started Osmocom osmo-bts for osmo-trx.
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: <0017> control_if.c:911 CTRL at 127.0.0.1 4238
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: #033[0;m<0010> telnet_interface.c:104 Available via telnet 127.0.0.1 4241
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: #033[0;m<0012> input/ipaccess.c:882 enabling ipaccess BTS mode, OML connecting to 127.0.0.1:3002
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: #033[0;m#033[1;33m<000b> trx_if.c:754 Open transceiver for phy0.0
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: #033[0;m<0012> input/ipa.c:128 127.0.0.1:3002 connection done
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: #033[0;m<0012> input/ipaccess.c:705 received ID get from 1801/0/0
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: #033[0;m#033[1;36m<0001> oml.c:229 O&M Get Attributes [0], Manufacturer Dependent State is unsupported by TRX.
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: #033[0;m#033[1;36m<0001> oml.c:681 Ignoring T200[0] (150 ms) as sent by BSC due to suspected LAPDm bug!
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: #033[0;m#033[1;36m<0001> oml.c:681 Ignoring T200[1] (180 ms) as sent by BSC due to suspected LAPDm bug!
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: #033[0;m#033[1;36m<0001> oml.c:681 Ignoring T200[2] (180 ms) as sent by BSC due to suspected LAPDm bug!
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: #033[0;m#033[1;36m<0001> oml.c:681 Ignoring T200[3] (1680 ms) as sent by BSC due to suspected LAPDm bug!
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: #033[0;m#033[1;36m<0001> oml.c:681 Ignoring T200[4] (520 ms) as sent by BSC due to suspected LAPDm bug!
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: #033[0;m#033[1;36m<0001> oml.c:681 Ignoring T200[5] (165 ms) as sent by BSC due to suspected LAPDm bug!
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: #033[0;m#033[1;36m<0001> oml.c:681 Ignoring T200[6] (1680 ms) as sent by BSC due to suspected LAPDm bug!
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: #033[0;m#033[1;36m<0001> oml.c:1055 ADM state already was Unlocked
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: #033[0;m<0012> input/ipa.c:128 127.0.0.1:3003 connection done
> Dec 22 03:19:43 localhost osmo-bts-trx[25938]: #033[0;m<0012> input/ipaccess.c:705 received ID get from 1801/0/0
> Dec 22 03:19:45 localhost osmo-bts-trx[25938]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 03:19:47 localhost osmo-bts-trx[25938]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 03:19:49 localhost osmo-bts-trx[25938]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> 
> 
> 
> I just had the same stop again; here is a snippet from the trx terminal:
> 
> Sat Dec 22 10:21:16 2018 DMAIN <0000> Transceiver.cpp:1016 [tid=3007554640] reduced latency: 3:9
> Sat Dec 22 10:21:17 2018 DMAIN <0000> Transceiver.cpp:1039 [tid=3007521872] ClockInterface: sending IND CLOCK 292365
> Sat Dec 22 10:21:17 2018 DMAIN <0000> Transceiver.cpp:1005 [tid=3007554640] new latency: 3:10 (underrun 3:292376 vs 1:292229)
> Sat Dec 22 10:21:18 2018 DMAIN <0000> Transceiver.cpp:1039 [tid=3007521872] ClockInterface: sending IND CLOCK 292581
> Sat Dec 22 10:21:18 2018 DMAIN <0000> Transceiver.cpp:1016 [tid=3007554640] reduced latency: 2:10
> Sat Dec 22 10:21:19 2018 DMAIN <0000> Transceiver.cpp:1039 [tid=3007521872] ClockInterface: sending IND CLOCK 292797
> Sat Dec 22 10:21:19 2018 DMAIN <0000> Transceiver.cpp:1016 [tid=3007554640] reduced latency: 1:10
> Sat Dec 22 10:21:19 2018 DDEV <0002> UHDDevice.cpp:861 [tid=3007521872] No packet received, implementation timed-out
> Sat Dec 22 10:21:19 2018 DDEV <0002> UHDDevice.cpp:865 [tid=3007521872] UHD: Receive timed out
> Sat Dec 22 10:21:19 2018 DMAIN <0000> radioInterfaceResamp.cpp:178 [tid=3007521872] Receive error 0
> Sat Dec 22 10:21:19 2018 DMAIN <0000> Transceiver.cpp:908 [tid=3007521872] radio Interface receive failed, requesting stop.
> Sat Dec 22 10:21:19 2018 DMAIN <0000> Transceiver.cpp:1005 [tid=3007554640] new latency: 1:11 (underrun 0:292830 vs 0:292819)
> Sat Dec 22 10:21:19 2018 DDEV <0002> UHDDevice.cpp:861 [tid=3007521872] An internal receive buffer has filled at 3113.55 sec.
> Sat Dec 22 10:21:20 2018 DDEV <0002> UHDDevice.cpp:1485 [tid=3007521872] Skipping buffer data: timestamp=1245558064 time_end=1245418309
> Sat Dec 22 10:21:20 2018 DMAIN <0000> Transceiver.cpp:1005 [tid=3007554640] new latency: 1:12 (underrun 3:292843 vs 2:292840)
> Sat Dec 22 10:21:20 2018 DMAIN <0000> Transceiver.cpp:1005 [tid=3007554640] new latency: 1:13 (underrun 3:292854 vs 4:292853)
> Sat Dec 22 10:21:20 2018 DMAIN <0000> osmo-trx.cpp:435 [tid=3025046000] Shutting down transceiver...
> Sat Dec 22 10:21:20 2018 DMAIN <0000> Transceiver.cpp:307 [tid=3025046000] Stopping the transceiver
> Sat Dec 22 10:21:20 2018 DMAIN <0000> Transceiver.cpp:320 [tid=3025046000] Stopping the device
> Sat Dec 22 10:21:20 2018 DMAIN <0000> Transceiver.cpp:333 [tid=3025046000] Transceiver stopped
> 
> This is syslog from right before this; I cut a few lines before the first osmo-bts-trx logging at 10:00:26.
> 
> It complains a few times within those few seconds, then at 10:11:08 and 10:11:11, then stops at 10:21:21.
> 
> Dec 22 09:57:12 localhost dbus[779]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
> Dec 22 09:57:12 localhost systemd[1]: Started Network Manager Script Dispatcher Service.
> Dec 22 09:57:12 localhost nm-dispatcher: req:1 'dhcp4-change' [eth0]: new request (2 scripts)
> Dec 22 09:57:12 localhost nm-dispatcher: req:1 'dhcp4-change' [eth0]: start running ordered scripts...
> Dec 22 10:00:01 localhost CRON[29739]: (root) CMD (/usr/lib/armbian/armbian-truncate-logs)
> Dec 22 10:00:26 localhost osmo-bts-trx[25938]: #033[0;m#033[1;35m<0000> rsl.c:741 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN ACT ACK
> Dec 22 10:00:26 localhost osmo-bts-trx[25938]: #033[0;m<0011> lapd_core.c:920 Store content res. (dl=0xb681b498)
> Dec 22 10:00:27 localhost osmo-bts-trx[25938]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb681b498 state LAPD_STATE_MF_EST)
> Dec 22 10:00:27 localhost osmo-bts-trx[25938]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb681b498 state LAPD_STATE_MF_EST)
> Dec 22 10:00:28 localhost osmo-bts-trx[25938]: #033[0;m<0011> lapd_core.c:1556 N(S) sequence error: N(S)=0, V(R)=1 (dl=0xb681b498 state LAPD_STATE_MF_EST)
> Dec 22 10:00:29 localhost osmo-bts-trx[25938]: #033[0;m#033[1;35m<0000> rsl.c:720 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN REL ACK
> Dec 22 10:05:01 localhost CRON[29783]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> Dec 22 10:11:08 localhost osmo-bts-trx[25938]: #033[0;m#033[1;35m<0000> rsl.c:741 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN ACT ACK
> Dec 22 10:11:11 localhost osmo-bts-trx[25938]: #033[0;m#033[1;35m<0000> rsl.c:720 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN REL ACK
> Dec 22 10:15:01 localhost CRON[29858]: (root) CMD (/usr/lib/armbian/armbian-truncate-logs)
> Dec 22 10:15:02 localhost CRON[29859]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> Dec 22 10:17:01 localhost CRON[29877]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
> Dec 22 10:21:21 localhost osmo-bts-trx[25938]: #033[0;m#033[1;36m<0001> bts.c:250 Shutting down BTS 0, Reason No clock from osmo-trx
> Dec 22 10:21:24 localhost osmo-bts-trx[25938]: #033[0;mShutdown timer expired
> Dec 22 10:21:24 localhost osmo-bts-trx[25938]: ((*))
> Dec 22 10:21:24 localhost osmo-bts-trx[25938]:   |
> Dec 22 10:21:24 localhost osmo-bts-trx[25938]:  / \ OsmoBTS
> Dec 22 10:21:24 localhost systemd[1]: osmo-bts-trx.service: Main process exited, code=exited, status=42/n/a
> Dec 22 10:21:24 localhost systemd[1]: osmo-bts-trx.service: Unit entered failed state.
> Dec 22 10:21:24 localhost systemd[1]: osmo-bts-trx.service: Failed with result 'exit-code'.
> Dec 22 10:21:26 localhost systemd[1]: osmo-bts-trx.service: Service hold-off time over, scheduling restart.
> Dec 22 10:21:26 localhost systemd[1]: Stopped Osmocom osmo-bts for osmo-trx.
> Dec 22 10:21:26 localhost systemd[1]: Started Osmocom osmo-bts for osmo-trx.
> Dec 22 10:21:26 localhost osmo-bts-trx[29913]: <0017> control_if.c:911 CTRL at 127.0.0.1 4238
> Dec 22 10:21:26 localhost osmo-bts-trx[29913]: #033[0;m<0010> telnet_interface.c:104 Available via telnet 127.0.0.1 4241
> Dec 22 10:21:26 localhost osmo-bts-trx[29913]: #033[0;m<0012> input/ipaccess.c:882 enabling ipaccess BTS mode, OML connecting to 127.0.0.1:3002
> Dec 22 10:21:26 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:754 Open transceiver for phy0.0
> Dec 22 10:21:26 localhost osmo-bts-trx[29913]: #033[0;m<0012> input/ipa.c:128 127.0.0.1:3002 connection done
> Dec 22 10:21:26 localhost osmo-bts-trx[29913]: #033[0;m<0012> input/ipaccess.c:705 received ID get from 1801/0/0
> Dec 22 10:21:26 localhost osmo-bts-trx[29913]: #033[0;m#033[1;36m<0001> oml.c:229 O&M Get Attributes [0], Manufacturer Dependent State is unsupported by TRX.
> Dec 22 10:21:26 localhost osmo-bts-trx[29913]: #033[0;m#033[1;36m<0001> oml.c:681 Ignoring T200[0] (150 ms) as sent by BSC due to suspected LAPDm bug!
> Dec 22 10:21:26 localhost osmo-bts-trx[29913]: #033[0;m#033[1;36m<0001> oml.c:681 Ignoring T200[1] (180 ms) as sent by BSC due to suspected LAPDm bug!
> Dec 22 10:21:26 localhost osmo-bts-trx[29913]: #033[0;m#033[1;36m<0001> oml.c:681 Ignoring T200[2] (180 ms) as sent by BSC due to suspected LAPDm bug!
> Dec 22 10:21:26 localhost osmo-bts-trx[29913]: #033[0;m#033[1;36m<0001> oml.c:681 Ignoring T200[3] (1680 ms) as sent by BSC due to suspected LAPDm bug!
> Dec 22 10:21:26 localhost osmo-bts-trx[29913]: #033[0;m#033[1;36m<0001> oml.c:681 Ignoring T200[4] (520 ms) as sent by BSC due to suspected LAPDm bug!
> Dec 22 10:21:26 localhost osmo-bts-trx[29913]: #033[0;m#033[1;36m<0001> oml.c:681 Ignoring T200[5] (165 ms) as sent by BSC due to suspected LAPDm bug!
> Dec 22 10:21:26 localhost osmo-bts-trx[29913]: #033[0;m#033[1;36m<0001> oml.c:681 Ignoring T200[6] (1680 ms) as sent by BSC due to suspected LAPDm bug!
> Dec 22 10:21:26 localhost osmo-bts-trx[29913]: #033[0;m#033[1;36m<0001> oml.c:1055 ADM state already was Unlocked
> Dec 22 10:21:27 localhost osmo-bts-trx[29913]: #033[0;m<0012> input/ipa.c:128 127.0.0.1:3003 connection done
> Dec 22 10:21:27 localhost osmo-bts-trx[29913]: #033[0;m<0012> input/ipaccess.c:705 received ID get from 1801/0/0
> Dec 22 10:21:28 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:21:30 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:21:32 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:21:34 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:21:36 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:21:38 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:21:40 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:21:42 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:21:44 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:21:46 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:21:48 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:21:50 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:21:52 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:21:54 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:21:56 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:21:58 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:22:00 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:22:02 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:22:04 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:22:06 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:22:08 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:22:10 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:22:11 localhost dhclient[5395]: DHCPREQUEST of 192.168.1.170 on eth0 to 192.168.1.1 port 67
> Dec 22 10:22:11 localhost dhclient[5395]: DHCPACK of 192.168.1.170 from 192.168.1.1
> Dec 22 10:22:11 localhost NetworkManager[789]: <info>  [1545470531.2789] dhcp4 (eth0):   address 192.168.1.170
> Dec 22 10:22:11 localhost NetworkManager[789]: <info>  [1545470531.2791] dhcp4 (eth0):   plen 24 (255.255.255.0)
> Dec 22 10:22:11 localhost NetworkManager[789]: <info>  [1545470531.2792] dhcp4 (eth0):   gateway 192.168.1.1
> Dec 22 10:22:11 localhost NetworkManager[789]: <info>  [1545470531.2793] dhcp4 (eth0):   server identifier 192.168.1.1
> Dec 22 10:22:11 localhost NetworkManager[789]: <info>  [1545470531.2793] dhcp4 (eth0):   lease time 3600
> Dec 22 10:22:11 localhost NetworkManager[789]: <info>  [1545470531.2794] dhcp4 (eth0):   hostname 'orangepizero'
> Dec 22 10:22:11 localhost NetworkManager[789]: <info>  [1545470531.2795] dhcp4 (eth0):   nameserver '192.168.1.1'
> Dec 22 10:22:11 localhost dbus[779]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
> Dec 22 10:22:11 localhost NetworkManager[789]: <info>  [1545470531.2796] dhcp4 (eth0):   domain name 'lan'
> Dec 22 10:22:11 localhost NetworkManager[789]: <info>  [1545470531.2797] dhcp4 (eth0): state changed bound -> bound
> Dec 22 10:22:11 localhost systemd[1]: Starting Network Manager Script Dispatcher Service...
> Dec 22 10:22:11 localhost dhclient[5395]: bound to 192.168.1.170 -- renewal in 1598 seconds.
> Dec 22 10:22:11 localhost dbus[779]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
> Dec 22 10:22:11 localhost systemd[1]: Started Network Manager Script Dispatcher Service.
> Dec 22 10:22:11 localhost nm-dispatcher: req:1 'dhcp4-change' [eth0]: new request (2 scripts)
> Dec 22 10:22:11 localhost nm-dispatcher: req:1 'dhcp4-change' [eth0]: start running ordered scripts...
> Dec 22 10:22:12 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:22:14 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:22:16 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:22:18 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:22:20 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> Dec 22 10:22:22 localhost osmo-bts-trx[29913]: #033[0;m#033[1;33m<000b> trx_if.c:178 No satisfactory response from transceiver for phy0.0 (CMD POWEROFF)
> 
> 
> Then, when I restart trx-uhd again (by keyboard up-arrow <cr>), the carrier comes back and the phone registers within a few seconds.
> 
> 
> 
> I see nothing in syslog that could explain a CPU outage. Besides, this is a 4-core Orange Pi Zero. I do not know, however, what quirks in Armbian Linux could lock out processing when total load is 95% or so out of 400% (i.e. about 75% idle).
> 
> If there is anything I could do to bring clarity, please suggest....
> 
> Regards,
> 
> Gullik
> 
> 
> 
>> On 2018-12-22 10:42, Harald Welte wrote:
>>> On Sat, Dec 22, 2018 at 10:00:50AM +0100, Gullik Webjorn wrote:
>>> At 3 am the trx stopped again. This time it exited (itself) after logging
>>> large amounts of packet loss,
>> The interesting question is: Was there some kind of cron job or other activity
>> running at 3am on that system, which could cause a system load high enough to
>> make the flow between B100, kernel USB stack, libusb, UHD and osmo-trx-uhd
>> interrupt?
> I will investigate as best I can.
>> 
>> Something like this is likely the root cause of the problem.
>> 
>> Sure, osmo-trx could "plaster around" it by having a more elegant recovery
>> mechanism, but failing fast due to exit and letting osmo-trx-uhd respawn
>> (normally executed via systemd) isn't actually all too bad.
>> 
>> What's definitely a real problem that needs immediate fixing is if we
>> somehow get stuck with osmo-trx continuing to run, but failing to
>> transmit a valid signal without exit + respawn.
> That HAS happened for me, with the screen filled with logging of packet loss, pointing at line 1319 in uhd_device::recv_async_msg()
> 
>> 
>> Regards,
>> 	Harald