Closing this thread.

On 2019-02-05 19:03, Harald Welte wrote:
> It may be some particularly nasty driver on your platform.  It may be
> thermal throttling due to insufficient cooling of the CPU, ...
>
> Regards,
> 	Harald
Well, now the puzzle is solved for this platform. The Orange Pi Zero in conjunction
with Armbian has a CPU temperature governor. With the maximum CPU speed reduced
to 816 MHz, the CPU utilization goes up to 200% (out of 400%), and the
temperature drops by approx. 10 °C. When the CPU temperature exceeds some value, the
governor downshifts the clock and the execution of Transceiver is disturbed. I am now convinced
this is the cause of the problem on this platform.

It does not make complete sense from a mathematical point of view, but I do not have
insight into the inner workings of the "governor". By reducing the maximum rate and increasing
the minimum rate I have reached stability overnight. A heat sink has also been added to the CPU.
According to armbianmonitor -m, the critical temperature seems to be approx. 65 °C.
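
For anyone who wants to watch for this condition from software, here is a minimal sketch of reading the CPU temperature and checking how close it is to the throttle point. It is only an illustration under assumptions: the sysfs path is the usual Linux location for the first thermal zone and may differ per board or kernel, the 65 °C default is just the value I observed above, the 5 °C margin is arbitrary, and all function names are made up for this example.

```cpp
#include <cassert>
#include <exception>
#include <fstream>
#include <string>

// Parse a sysfs thermal reading into degrees Celsius.
// Assumption: values above 1000 are millidegrees (e.g. "65000"), smaller
// values are whole degrees (some vendor kernels report "65").
bool parse_temp_celsius(const std::string &raw, double &out_c)
{
    try {
        const long v = std::stol(raw);
        out_c = (v > 1000) ? v / 1000.0 : static_cast<double>(v);
        return true;
    } catch (const std::exception &) {
        return false; // empty or non-numeric input
    }
}

// Read the temperature from a sysfs file. The default path is an assumption.
bool read_cpu_temp_celsius(double &out_c,
                           const std::string &path = "/sys/class/thermal/thermal_zone0/temp")
{
    std::ifstream f(path);
    std::string raw;
    if (!f || !std::getline(f, raw))
        return false;
    return parse_temp_celsius(raw, out_c);
}

// True when the reading is within margin_c of the (observed) throttle point.
bool near_throttle(double temp_c, double throttle_c = 65.0, double margin_c = 5.0)
{
    assert(margin_c >= 0.0);
    return temp_c >= throttle_c - margin_c;
}
```

A monitoring loop could call read_cpu_temp_celsius() every few seconds and log a warning once near_throttle() returns true, so that a clock downshift can be correlated with Transceiver trouble.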

My earlier observation was that the bug was hit "randomly", sometimes after 30 seconds, sometimes
not for several hours. This randomness is most probably due to room temperature, drafts from people
entering or exiting the lab, and air flow around the system.

With regard to the osmo-bts software, the changes I propose are:

1) An alarm log (ALERT level) in

void LMSDevice::update_stream_stats(size_t chan, bool * underrun, bool * overrun)
{
    lms_stream_status_t status;
    if (LMS_GetStreamStatus(&m_lms_stream_rx[chan], &status) == 0) {
        if (status.underrun > m_last_rx_underruns[chan])
            *underrun = true;
        m_last_rx_underruns[chan] = status.underrun;

        if (status.overrun > m_last_rx_overruns[chan])
            *overrun = true;
        m_last_rx_overruns[chan] = status.overrun;
        // New (this comment plus the 4 lines below): if the radio drops packets it is
        // good to know, since this is a FATAL condition for stable operation of Transceiver.
        // 'dropped' is a running total, assumed to be declared elsewhere (e.g. as a class member).
        if (status.droppedPackets != 0) {
            dropped += status.droppedPackets;
            LOGC(DDEV, ALERT) << "Dropped " << status.droppedPackets << " Total dropped " << dropped;
        }
    }
}

2) Possibly making this condition trigger either a Transceiver exit or a controlled restart (which happens anyway, but many seconds later).
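
To make proposal 2) a bit more concrete, here is a rough sketch of the kind of policy I have in mind. It is not existing osmo-trx code: DropMonitor and its threshold are hypothetical, and it assumes (as the patch above does) that each status poll reports the packets dropped since the previous poll.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: accumulates dropped packets and tells the caller
// when the total exceeds a tolerated limit, so that Transceiver can exit
// or restart at once instead of many seconds later.
class DropMonitor {
public:
    // max_total = 0 means "any dropped packet is fatal".
    explicit DropMonitor(uint64_t max_total) : m_max_total(max_total), m_total(0) {}

    // Feed the droppedPackets value from one status poll.
    // Returns true when a stop/restart should be requested.
    bool update(uint32_t dropped_now)
    {
        assert(m_total + dropped_now >= m_total); // 64-bit total must not wrap
        m_total += dropped_now;
        return m_total > m_max_total;
    }

    uint64_t total() const { return m_total; }

private:
    uint64_t m_max_total;
    uint64_t m_total;
};
```

In update_stream_stats() the caller would pass status.droppedPackets to update() and, on a true return, log the ALERT and request the exit or restart through whatever mechanism Transceiver already uses for fatal errors.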

Now on to "balancing LimeSDR up/down links"......

Regards,

Gullik