Findings so far:
I added timing to the RXlower thread, measuring the time from exiting
the readSamples function in Transceiver to the point just before the
call into LimeSuite. These are the last 32 values, in microseconds:
$1 = {236, 190, 267, 4226, 4173, 5822, 3881, 56079, 7598, 7085, 4959,
7832, 7808, 7772, 5079, 222, 199, 222, 202, 201, 191, 209, 210, 198,
211, 188, 203, 175, 239, 177, 215,
215} ( the index is 15 )
The first 3 samples and the last 17 show that it is typically possible
to execute all code in RXlower ( not including recvStream ) in about
200 us.
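To make the dump easier to read, here is a small Python sketch that unrolls the ring buffer into time order and splits it into fast and slow passes. It assumes that "the index" is the next slot to be written (so slot 15 holds the oldest value) and uses an arbitrary 1000 us threshold; neither is taken from the Transceiver code.

```python
from statistics import median

# Loop times in microseconds, copied from the dump above (array order).
samples_us = [236, 190, 267, 4226, 4173, 5822, 3881, 56079, 7598, 7085, 4959,
              7832, 7808, 7772, 5079, 222, 199, 222, 202, 201, 191, 209, 210,
              198, 211, 188, 203, 175, 239, 177, 215, 215]

# Assumption: the index is the next slot to be overwritten, so it points
# at the oldest value in the ring buffer.
write_index = 15

# Unroll the ring buffer: oldest sample first, newest last.
chronological = samples_us[write_index:] + samples_us[:write_index]

# Assumption: anything at or above 1000 us counts as an interrupted pass.
THRESHOLD_US = 1000
fast = [t for t in chronological if t < THRESHOLD_US]
slow = [t for t in chronological if t >= THRESHOLD_US]

print("fast passes:", len(fast), "median us:", median(fast))
print("slow passes:", len(slow))
print("slow passes are the newest ones:", chronological[-len(slow):] == slow)
```

Under that reading of the index, the slow passes are the most recent entries, which fits a capture taken right after the error.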
There are 12 consecutive values, beginning with 4.226 ms and including
56.079 ms, where the "nonblocking" code is interrupted by other
activity on the Orange Pi Zero / Armbian 9. Whatever this is, it runs
at a higher priority than -19, and causes the FIFO on board the
LimeSDR mini to fill until we get a report of "dropped by HW".
From this point the error is not recoverable, and can only be cleared
by restarting Transceiver ( with new timestamps ).
The conclusion is that there is no problem with Transceiver ( except
for detection of this condition ), and as long as the Transceiver
process gets enough CPU, all works as expected. However, "something"
runs for 122 ms in total ( the sum of the 12 slow values above ),
severely degrading the latency. There IS processing time available for
Transceiver, but the reduction in available CPU cycles is too severe
for proper operation.
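The 122 ms figure can be checked by adding up the 12 slow passes. A minimal sketch, with the values copied by hand from the dump:

```python
# The 12 consecutive slow passes, in microseconds, copied from the dump.
stall_us = [4226, 4173, 5822, 3881, 56079, 7598, 7085, 4959,
            7832, 7808, 7772, 5079]

total_ms = sum(stall_us) / 1000.0   # total time spent in slow passes
worst_ms = max(stall_us) / 1000.0   # longest single pass

print(f"stall total: {total_ms:.1f} ms over {len(stall_us)} passes")
print(f"worst single pass: {worst_ms:.1f} ms")
```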
The 2500 bytes read correspond to 625 symbols, i.e. 4 slots, for a
total of 2.307 ms, which is the rate we must keep on average to keep
the LimeSDR happy. We can tolerate several "misses", since we can
complete the loop in about 200 us, but if the condition persists too
long, the LimeSDR will overwrite its FIFO, log dropped packets, and
Transceiver will hit error #3339 or exit.
IF Transceiver *always* got about 200 us of running time every 2.3 ms,
we would be fine.
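For reference, the arithmetic behind the 2.307 ms figure, as a rough Python sketch. It assumes 4 bytes per sample (16-bit I plus 16-bit Q) and one sample per symbol, neither of which is stated above; the GSM symbol rate (13 MHz / 48) and slot length (156.25 symbol periods) are the standard values.

```python
from fractions import Fraction

BYTES_PER_READ = 2500                      # from the measurements
BYTES_PER_SAMPLE = 4                       # assumption: 16-bit I + 16-bit Q
SYMBOL_RATE_HZ = Fraction(13_000_000, 48)  # GSM symbol rate, about 270.833 kHz
SYMBOLS_PER_SLOT = Fraction(625, 4)        # 156.25 symbol periods per slot
LOOP_US = 200                              # typical RXlower pass
STALL_US = 122_000                         # total length of the observed stall

# Assumption: one sample per symbol, so samples and symbols are the same count.
samples_per_read = BYTES_PER_READ // BYTES_PER_SAMPLE
slots_per_read = samples_per_read / SYMBOLS_PER_SLOT
period_us = float(samples_per_read / SYMBOL_RATE_HZ * 1_000_000)

cpu_share = LOOP_US / period_us      # fraction of each period the loop needs
reads_behind = STALL_US / period_us  # read periods that elapse during the stall

print(f"{samples_per_read} symbols = {slots_per_read} slots = {period_us:.1f} us")
print(f"loop needs about {cpu_share:.1%} of each period")
print(f"a 122 ms stall spans about {reads_behind:.0f} read periods")
```

So the loop needs well under a tenth of each period, but the observed stall is equivalent to several dozen reads of backlog, which has to fit in whatever buffering sits between the radio and Transceiver.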
Now on to finding the culprit... or just change HW / OS... this combo
has lousy RT characteristics.
Regards,
Gullik
On 2019-01-29 12:20, Gullik Webjorn wrote:
I have tried to investigate *where* the type #3339 bug occurs. I was
thinking in terms of either the Lime hw / fw / driver dropping packets,
or something screwing up the timestamp, leading to the perception of
lost data.
I have added some debug printouts where LMS_GetStreamStatus is called,
and as the error occurs, I also get dropped packets. The explanation
from the API is that
droppedPackets = "Number of dropped packets by HW."
To me this indicates that the program / usb driver / usb is not
emptying the on-board FIFO frequently enough, and that this is the
cause of the problem ( as Harald indicated he suspected ).
At least I have convinced myself that:
1 Packets are lost within the Lime board ( since it reports them )
and not in the higher levels of code.
2 It is indeed "low-level", i.e. a Lime driver or usb driver bug,
or just the fact that we do not get time to process quickly enough,
which I feel we should have plenty of CPU to compensate for.
Gullik
And yes, we should not confuse bits and symbols :-)