This is merely a historical archive of years 2008-2021, before the migration to mailman3.
A maintained and still updated list archive can be found at https://lists.osmocom.org/hyperkitty/list/OpenBSC@lists.osmocom.org/.
Harald Welte laforge at gnumonks.org

Hi Gullik,

On Wed, Jan 23, 2019 at 12:21:49AM +0100, Gullik Webjorn wrote:
> I have followed your suggestions, and rebuilt --with-neon-vfpv4 , and I have
> enabled debugging.

Are you running the osmo-trx process with real-time priority (SCHED_RR)?
What is the CPU load? Please note on a multi-core system the interesting
bit is not the average load over all CPU cores, but the maximum load of
any one of the cores.

> The length seems always correct, since I do not get that log entry,
> but rather that the time has "slipped", i.e. the LMS api has not
> delivered anything for "diff" time, or the timestamp received has
> "jumped" in the Lime.
> [...]
> This indicates to me that this specific arm cpu in combination with limesdr
> mini and the software "drops data". I will gladly try to debug or narrow
> this down, but ask for some suggestions on how to proceed.

Correct. This is a problem we've been observing on a variety of platforms
for quite some time. Some samples are lost.

* maybe the polling interval (bInterval) in the endpoint descriptors is
  set too low?
* maybe the number / size of bulk-in USB transfers (URBs) is insufficient
  and/or they are not re-submitted fast enough.
* maybe there's some other process using too much cpu / pre-empting
  osmo-trx?
* maybe there's some [buggy?] driver used on this system that
  disables/masks interrupts or otherwise causes high scheduler latencies,
  by disabling pre-emption or the like?
* maybe there's some bios/firmware/management-mode code that can interrupt
  normal OS processing without the OS even knowing about it.
* maybe there's some power management (cpu speed throttling, thermal
  throttling, ...) interfering?
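
The two questions at the top (scheduling policy, and the load of the busiest
single core) can be checked from outside osmo-trx. What follows is only a
minimal sketch, assuming a Linux system with /proc/stat and Python 3. The
function names (parse_proc_stat, max_core_load, sample_busiest_core,
runs_with_sched_rr) are made up for this illustration and are not part of
osmo-trx or LimeSuite. Note that the policy check only looks at the main
thread of the given PID.

```python
import os
import time


def parse_proc_stat(text):
    """Return {core_name: (busy_jiffies, total_jiffies)} for each per-core line."""
    cores = {}
    for line in text.splitlines():
        parts = line.split()
        # Per-core lines start with "cpu0", "cpu1", ...; skip the aggregate "cpu" line.
        if not parts or not parts[0].startswith("cpu") or parts[0] == "cpu":
            continue
        # user, nice, system, idle, iowait, irq, softirq, steal
        # (guest time is already accounted for in user, so it is left out).
        fields = [int(v) for v in parts[1:9]]
        idle = fields[3] + (fields[4] if len(fields) > 4 else 0)  # idle + iowait
        total = sum(fields)
        cores[parts[0]] = (total - idle, total)
    return cores


def max_core_load(before, after):
    """Return (core_name, load in 0..1) of the busiest core between two snapshots."""
    worst = (None, 0.0)
    for name, (busy1, total1) in after.items():
        busy0, total0 = before.get(name, (0, 0))
        elapsed = total1 - total0
        if elapsed <= 0:
            continue
        load = (busy1 - busy0) / elapsed
        if load > worst[1]:
            worst = (name, load)
    return worst


def sample_busiest_core(interval=1.0, path="/proc/stat"):
    """Take two snapshots 'interval' seconds apart and report the busiest core."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    return max_core_load(before, after)


def runs_with_sched_rr(pid):
    """True if the main thread of 'pid' uses the SCHED_RR policy (Linux only)."""
    reset_flag = getattr(os, "SCHED_RESET_ON_FORK", 0)
    return (os.sched_getscheduler(pid) & ~reset_flag) == os.SCHED_RR
```

Calling sample_busiest_core() while osmo-trx is running gives the core name
and its load as a fraction; a value close to 1.0 on any one core is the
interesting case, even if the average over all cores looks harmless.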

I'm not familiar with the inner workings of LimeSuite, but any program
that would expect to achieve high performance on libusb should (IMHO) be
using the asynchronous API of libusb, and it should make sure there are
always multiple URBs submitted at any given point in time, so that the
kernel can handle data from the USB device without interruption.

If I read LimeSuite correctly, they are submitting 16 URBs for read
(USB_MAX_CONTEXTS returned by GetBuffersCount() used by
ReceivePacketsLoop(), which in turn calls the somewhat interestingly named
method dataPort->BeginDataReading() for each of the buffers).

https://elinux.org/images/c/c8/Debugging_Methodologies_for_Realtime_Issues_in_Linux_Systems.pdf
is a good introduction, though it may be a bit dated.

> One thought is to save a timestamp each time through readSamples and
> compare to some constant to determine that the problem is NOT that we
> are unable to read as fast as required. reading 2500 samples would
> take 2.3 mS if I understand this, so we need to call readsamples at about this rate....

The data can be lost (at least)
* between the USB device and the USB host, if the bus is overloaded or
  somehow the kernel / hardware cannot handle/schedule transfers fast
  enough, or
* between kernel and userspace

Your test seems to be looking at the second part. You can use a
CLOCK_MONOTONIC time source to take timestamps, as you indicated.

> Possible causes could be something *else* locking out the program.

Yes. That's why in the normal mode of operation you usually start with
running osmo-trx with SCHED_RR / realtime priority. This way normal tasks
that run with regular priority are not going to interfere anymore. But
then, that leaves tons of kernel/driver code, and hardware/bios/firmware...

It would be great to shed more light on this, but it likely needs very
thorough analysis across all layers of a system.
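
The timestamp idea discussed above can be sketched as follows. This is an
illustration of the logic only, not code from osmo-trx: the real check would
sit in the C++ readSamples path and use clock_gettime(CLOCK_MONOTONIC). The
names ReadGapMonitor and expected_period_ns are hypothetical. The sample rate
is an assumption, too: 4 samples per GSM symbol (about 1.083 Msps), which is
what the quoted 2.3 ms for 2500 samples implies. The slack factor of 1.5 is an
arbitrary starting point.

```python
import time

GSM_SYMBOL_RATE = 13e6 / 48      # about 270.833 ksym/s
SAMPLES_PER_SYMBOL = 4           # assumption, implied by "2500 samples = 2.3 ms"
SAMPLE_RATE = GSM_SYMBOL_RATE * SAMPLES_PER_SYMBOL


def expected_period_ns(num_samples, sample_rate=SAMPLE_RATE):
    """Time in nanoseconds that num_samples represent at the given sample rate."""
    return int(round(num_samples / sample_rate * 1e9))


class ReadGapMonitor:
    """Record the time between successive reads and remember the late ones."""

    def __init__(self, num_samples=2500, slack=1.5, clock=time.monotonic_ns):
        # A read counts as late if the gap exceeds the nominal period times 'slack'.
        self.limit_ns = int(expected_period_ns(num_samples) * slack)
        self.clock = clock       # monotonic clock, replaceable for testing
        self.last_ns = None
        self.late = []

    def tick(self):
        """Call once per read; returns the gap since the previous call (or None)."""
        now = self.clock()
        gap = None if self.last_ns is None else now - self.last_ns
        self.last_ns = now
        if gap is not None and gap > self.limit_ns:
            self.late.append(gap)
        return gap
```

Calling tick() once per read and inspecting the late list afterwards shows
whether userspace was scheduled too late. If no gaps are late but the device
timestamps still jump, the loss is more likely on the USB device / host side.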

-- 
- Harald Welte <laforge at gnumonks.org>           http://laforge.gnumonks.org/
============================================================================
"Privacy in residential applications is a desirable marketing option."
                                                  (ETSI EN 300 175-7 Ch. A6)