Hi Gullik,
On Wed, Jan 23, 2019 at 12:21:49AM +0100, Gullik Webjorn wrote:
> I have followed your suggestions, and rebuilt --with-neon-vfpv4, and I have
> enabled debugging.
Are you running the osmo-trx process with real-time priority (SCHED_RR)?
What is the CPU load? Please note on a multi-core system the
interesting bit is not the average load over all CPU cores, but the
maximum load of any one of the cores.
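A rough sketch of how to look at the busiest core rather than the average: the snippet below takes two snapshots of /proc/stat and computes the busy fraction of each core between them. The function names are made up for illustration, and it assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal).

```python
import time


def parse_proc_stat(text):
    """Return {cpu_name: (busy_jiffies, total_jiffies)} for each per-core line."""
    cores = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-core lines start with "cpu0", "cpu1", ...; bare "cpu" is the aggregate.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        values = [int(v) for v in fields[1:9]]
        total = sum(values)
        # idle + iowait count as "not busy"
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        cores[fields[0]] = (total - idle, total)
    return cores


def per_core_load(before, after):
    """Busy fraction (0.0 to 1.0) of each core between two snapshots."""
    loads = {}
    for name, (busy1, total1) in after.items():
        if name not in before:
            continue
        busy0, total0 = before[name]
        elapsed = total1 - total0
        if elapsed > 0:
            loads[name] = (busy1 - busy0) / elapsed
    return loads


def busiest_core(interval=1.0, path="/proc/stat"):
    """Sample /proc/stat twice and return (core_name, load) of the busiest core."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    loads = per_core_load(before, after)
    return max(loads.items(), key=lambda item: item[1])
```

Calling busiest_core() while osmo-trx is running gives the number that matters here. If a single core sits close to 1.0, the process is likely CPU-bound on that core even when the average looks harmless.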
> The length seems always correct, since I do not get that log entry,
> but rather that the time has "slipped", i.e. the LMS api has not
> delivered anything for "diff" time, or the timestamp received has
> "jumped" in the Lime.
[...]
> This indicates to me that this specific arm cpu in combination with limesdr
> mini and the software "drops data". I will gladly try to debug or narrow
> this down, but ask for some suggestions on how to proceed.
Correct. This is a problem we've been observing on a variety of
platforms for quite some time. Some samples are lost.
* maybe the polling interval (bInterval) in the endpoint descriptors is
set too low?
* maybe the number / size of bulk-in USB transfers (URBs) is
insufficient and/or they are not re-submitted fast enough?
* maybe there's some other process using too much cpu / pre-empting
osmo-trx?
* maybe there's some [buggy?] driver used on this system that
disables/masks interrupts or otherwise causes high scheduler
latencies, by disabling pre-emption or the like?
* maybe there's some bios/firmware/management-mode code that can
interrupt normal OS processing without the OS even knowing about it.
* maybe there's some power management (cpu speed throttling, thermal throttling, ...)
interfering?
I'm not familiar with the inner workings of LimeSuite, but any program
that would expect to achieve high performance on libusb should (IMHO) be
using the asynchronous API of libusb, and it should make sure there are
always multiple URBs submitted at any given point in time, so that the
kernel can handle data from the USB device without interruption.
If I read LimeSuite correctly, they are submitting 16 URBs for read
(USB_MAX_CONTEXTS, returned by GetBuffersCount() and used by
ReceivePacketsLoop(), which in turn calls the somewhat interestingly
named method dataPort->BeginDataReading() for each of the buffers).
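To put a rough number on why the count of in-flight URBs matters: while userspace is stalled, the kernel can only keep filling transfers that have already been submitted, so the tolerable stall is roughly the number of URBs times the amount of signal each one holds. The sketch below is plain arithmetic. The 16 comes from the reading of LimeSuite above; the samples-per-URB and sample-rate values are placeholders chosen for illustration, not values taken from LimeSuite, and any buffering inside the device itself is ignored.

```python
def max_tolerable_stall(urbs_in_flight, samples_per_urb, sample_rate_hz):
    """Approximate time in seconds that userspace may fail to re-submit
    transfers before the kernel has no submitted URB left to fill."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    return urbs_in_flight * samples_per_urb / sample_rate_hz


# Placeholder numbers (assumptions): 1000 samples per URB at 1 Msps.
stall_s = max_tolerable_stall(urbs_in_flight=16,
                              samples_per_urb=1000,
                              sample_rate_hz=1_000_000)
print("tolerable stall: %.1f ms" % (stall_s * 1e3))
```

With real buffer sizes plugged in, this gives an upper bound on the scheduling latency the setup can absorb, which can then be compared against the latencies actually measured on the system.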
https://elinux.org/images/c/c8/Debugging_Methodologies_for_Realtime_Issues_…
is a good introduction, though it may be a bit dated.
> One thought is to save a timestamp each time through readSamples and
> compare to some constant to determine that the problem is NOT that we
> are unable to read as fast as required. Reading 2500 samples would
> take 2.3 ms if I understand this, so we need to call readSamples at
> about this rate....
The data can be lost (at least)
* between the USB device and the USB host, if the bus is overloaded or
somehow the kernel / hardware cannot handle/schedule transfers fast enough, or
* between kernel and userspace
Your test seems to be looking at the second part. You can use a
CLOCK_MONOTONIC time source to take timestamps, as you indicated.
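A minimal sketch of such a check, written in Python only to show the idea (in osmo-trx itself it would be a few lines of C++ around the read call). It assumes 4 samples per GSM symbol, which is what makes 2500 samples come out at about 2.3 ms; the class and function names are hypothetical. On Linux, Python's time.monotonic() is backed by CLOCK_MONOTONIC.

```python
import time

GSM_SYMBOL_RATE_HZ = 13e6 / 48  # about 270.833 ksym/s


def expected_period(samples_per_read, sample_rate_hz):
    """Seconds of signal covered by one read of samples_per_read samples."""
    return samples_per_read / sample_rate_hz


class ReadGapDetector:
    """Flags intervals between consecutive reads that exceed
    tolerance * expected_s."""

    def __init__(self, expected_s, tolerance=2.0, clock=time.monotonic):
        self.expected_s = expected_s
        self.tolerance = tolerance
        self._clock = clock
        self._last = None

    def mark(self):
        """Call once per read. Returns the gap in seconds if it was
        too long, otherwise None."""
        now = self._clock()
        last, self._last = self._last, now
        if last is None:
            return None  # first call, nothing to compare against
        gap = now - last
        return gap if gap > self.expected_s * self.tolerance else None


# Assumption: 4 samples per symbol, i.e. about 1.0833 Msps.
period = expected_period(2500, 4 * GSM_SYMBOL_RATE_HZ)
print("expected read period: %.3f ms" % (period * 1e3))
```

If mark() never reports a long gap but timestamps still jump, that points away from userspace scheduling and towards the path between the device and the kernel.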
> Possible causes could be something *else* locking out the program.
Yes. That's why in the normal mode of operation you usually start
with running osmo-trx with SCHED_RR / realtime priority. This way
normal tasks that run with regular priority are not going to interfere
anymore. But then, that leaves tons of kernel/driver code, and
hardware/bios/firmware...
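To verify that a process really ended up with a realtime policy, the scheduling class can be queried from the kernel. A small sketch using Python's os module follows (Linux only; the helper names are invented for this example). Note that on Linux the query applies to a single thread, so for a multi-threaded program a check of the main PID says nothing about the other threads.

```python
import os

_POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER",
    os.SCHED_FIFO: "SCHED_FIFO",
    os.SCHED_RR: "SCHED_RR",
}
# The kernel may OR this flag into the reported policy; mask it out.
_RESET_ON_FORK = getattr(os, "SCHED_RESET_ON_FORK", 0)


def policy_name(policy):
    """Human-readable name for a scheduling policy constant."""
    return _POLICY_NAMES.get(policy, "unknown(%d)" % policy)


def describe_scheduling(pid=0):
    """Return (policy_name, priority) for the given PID/TID (0 = caller)."""
    policy = os.sched_getscheduler(pid) & ~_RESET_ON_FORK
    priority = os.sched_getparam(pid).sched_priority
    return policy_name(policy), priority


def is_realtime(pid=0):
    """True if the given PID/TID runs under SCHED_FIFO or SCHED_RR."""
    policy = os.sched_getscheduler(pid) & ~_RESET_ON_FORK
    return policy in (os.SCHED_FIFO, os.SCHED_RR)
```

describe_scheduling(pid) with the PID of the running osmo-trx should report SCHED_RR and a non-zero priority if the realtime setup took effect; "chrt -p <pid>" from util-linux shows the same information.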
It would be great to shed more light on this, but it likely needs very thorough
analysis across all layers of a system.
--
- Harald Welte <laforge(a)gnumonks.org>
http://laforge.gnumonks.org/
============================================================================
"Privacy in residential applications is a desirable marketing option."
(ETSI EN 300 175-7 Ch. A6)