This is a historical archive covering the years 2008-2021, before the migration to Mailman 3.
A maintained and still updated list archive can be found at https://lists.osmocom.org/hyperkitty/list/OpenBSC@lists.osmocom.org/.
Gullik Webjorn gullik.webjorn at corevalue.se

Ok, some more experiments. I have made a small table that logs Linux time
diffs in nanoseconds each time LMSDevice is called (a sketch of this
instrumentation follows at the end of this mail). From my first test this
indicates that on this particular platform, within the last 8 calls the
time can be as low as 170 us, i.e. a value of roughly 170000. But I also
get times up to 44625499 ns, i.e. 44.6 ms, and the values in the table
can either look like:

    $2 = {32345376, 28481917, 16771791, 15794875, 16805792, 17252958,
          44625499, 33037584}

indicating that several calls after one another had long times, or like:

    $4 = {198750, 179625, 33702624, 16127416, 27990666, 16007875,
          13552168, 16100124}

where the latest and second-latest are low, but follow a sequence of long
times. Thus, we are not dealing with a single interruption of short
latency, but an extended period of long latency / interference. Once the
condition occurs, I get hundreds of "time mismatch" logs, so it does not
recover.

Right now I am wondering about fault recovery, i.e. what should the trx
do once it has detected missing data? Whatever it does has a low chance
of fixing the situation; once triggered, the condition persists. This is
also indicated by the fact that the logged "diff" value is the *same*
value in subsequent loggings, i.e. the trx does not recover / rewind /
adjust its timing to get back to normal.

> Are you running the osmo-trx process with real-time priority (SCHED_RR)?

I tried that with no obvious effect (a sketch of how to set it follows at
the end of this mail).

> What is the CPU load? Please note on a multi-core system the
> interesting bit is not the average load over all CPU cores, but the
> maximum load of any one of the cores.

"Normal" load is the trx process taking 80-100% of one CPU spread over 4
CPUs, i.e. htop shows 4 CPUs each with 20-25% load. trx seems to spread
its threads over all CPUs.

> Correct. This is a problem we've been observing on a variety of
> platforms for quite some time. Some samples are lost.
>
> * maybe the polling interval (bInterval) in the endpoint descriptors is
>   set too low?

Hmm, my crude measurements indicate that the trx's retrieval is the
cause, not a lack of data.

> * maybe the number / size of bulk-in USB transfers (URBs) is
>   insufficient and/or they are not re-submitted fast enough.
> * maybe there's some other process using too much CPU / pre-empting
>   osmo-trx?

Yes, it looks like that.

> Your test seems to be looking at the second part. You can use a
> CLOCK_MONOTONIC time source to take timestamps, as you indicated.

I used

    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start_time);

Maybe I should refine my test (the last sketch below contrasts the two
clocks).

Thanx for your comments,

Gullik
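
A minimal sketch of the diff-logging instrumentation described above. The
hook point and the function name log_call_diff() are hypothetical; the
idea is simply to record the CLOCK_MONOTONIC delta between successive
calls in a small ring buffer that can then be dumped from gdb, which is
where $-style value dumps like $2 and $4 above come from:

    /* Record the time between successive calls into an 8-entry ring
     * buffer; dump it from gdb with "print diff_ns". */
    #include <stdint.h>
    #include <time.h>

    #define NDIFFS 8

    static int64_t diff_ns[NDIFFS]; /* last 8 inter-call times, in ns */
    static unsigned diff_idx;
    static struct timespec prev;

    void log_call_diff(void)
    {
            struct timespec now;

            clock_gettime(CLOCK_MONOTONIC, &now);
            if (prev.tv_sec || prev.tv_nsec) {
                    diff_ns[diff_idx++ % NDIFFS] =
                            (int64_t)(now.tv_sec - prev.tv_sec) * 1000000000LL
                            + (now.tv_nsec - prev.tv_nsec);
            }
            prev = now;
    }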
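For the SCHED_RR question, a sketch of how a process can give itself
real-time round-robin priority, assuming CAP_SYS_NICE (or root, or a
suitable rtprio rlimit) is available; the priority value 18 is an
arbitrary example:

    #include <sched.h>
    #include <stdio.h>

    /* Switch the calling process (pid 0) to SCHED_RR; returns -1 on
     * failure, typically EPERM when lacking CAP_SYS_NICE. */
    static int set_rt_priority(void)
    {
            struct sched_param sp = { .sched_priority = 18 };

            if (sched_setscheduler(0, SCHED_RR, &sp) < 0) {
                    perror("sched_setscheduler");
                    return -1;
            }
            return 0;
    }

The same effect can be had externally, without touching the code, via
util-linux: chrt -r 18 osmo-trx <args>.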
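Finally, a small self-contained sketch of why the clock choice matters
for this test: CLOCK_PROCESS_CPUTIME_ID only advances while the process
itself is running on a CPU, so time spent pre-empted by other processes
is invisible to it, whereas CLOCK_MONOTONIC measures elapsed wall time
and therefore captures exactly such gaps (usleep() stands in for a
pre-emption here):

    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>

    static int64_t ts_diff_ns(struct timespec a, struct timespec b)
    {
            return (int64_t)(b.tv_sec - a.tv_sec) * 1000000000LL
                   + (b.tv_nsec - a.tv_nsec);
    }

    int main(void)
    {
            struct timespec m0, m1, c0, c1;

            clock_gettime(CLOCK_MONOTONIC, &m0);
            clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &c0);
            usleep(10000); /* stand-in for being pre-empted for ~10 ms */
            clock_gettime(CLOCK_MONOTONIC, &m1);
            clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &c1);

            /* monotonic shows ~10 ms, cputime shows nearly nothing */
            printf("monotonic: %lld ns, cputime: %lld ns\n",
                   (long long)ts_diff_ns(m0, m1),
                   (long long)ts_diff_ns(c0, c1));
            return 0;
    }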