This is merely a historical archive of years 2008-2021, before the migration to mailman3.
A maintained and still updated list archive can be found at https://lists.osmocom.org/hyperkitty/list/OpenBSC@lists.osmocom.org/.
Harald Welte laforge at gnumonks.org

Hi Neels,

On Fri, Jun 23, 2017 at 04:51:07AM +0200, Neels Hofmeyr wrote:
> We're still having massive stability problems with osmo-bts-trx on the
> osmo-gsm-tester.

I'm sorry, but I have to ask for more specifics: What exactly is a
'massive stability problem'? How does it manifest itself in detail at
the lowest possible interface (i.e. log output of osmo-trx,
osmo-bts-trx, ...)?

> I have run a tcpdump on the ntp port for the past days, and nothing is doing
> ntp besides the actual ntp service.

And that service was presumably disabled (before your test described in
the next paragraph)?

> Today I started ntp while an osmo-bts-trx run was active and what do you know,
> the osmo-bts-trx process exits immediately. I think this is bad, osmo-bts-trx
> shouldn't use wall clock time for precise timing needs.

Yes, I think it's a sign of very poor design if we cannot even sync the
local wall clock to an NTP or GPS reference. CLOCK_MONOTONIC_RAW should
be used on Linux for use cases like the one in osmo-bts-trx, having to
schedule bursts at specific time intervals.

In fact, I think the entire TRX<->BTS interface is not all that good an
idea to begin with. In OsmoTRX, we have the ADC/DAC sample clock that
is driving transmission of samples. Normally, the entire PHY layer runs
synchronous to that, and it would drive the "clock" of L2 by means of
PH-RTS.ind, so the L2 knows whenever it wants to transmit something.

However, the OsmoTRX <-> osmo-bts-trx interface is not at the PHY<->L2
boundary, but it is at an inner boundary between the radio modem
(OsmoTRX) and the L1 (in osmo-bts-trx). And those are two separate
processes, without any way to synchronously trigger some action based
on the ADC/DAC master sample clock.

As a result, osmo-bts-trx needs to keep its own clock, based on
whatever clock source is available in the operating system / hardware,
and make sure it sends bursts at the right speed to OsmoTRX.
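To illustrate what "keeping its own clock" on a monotonic source could
look like, here is a minimal sketch. It is Python rather than the C of
osmo-bts-trx, and FrameClock, monotonic_ns and their parameters are
hypothetical names made up for this example, not taken from the Osmocom
code. It relies only on the GSM frame timing (26 TDMA frames last
exactly 120 ms) and on a clock that an NTP step cannot move.

```python
import time

GSM_HYPERFRAME = 2048 * 26 * 51   # frame numbers wrap after 2715648 frames
NS_PER_26_FRAMES = 120_000_000    # 26 TDMA frames last exactly 120 ms


def monotonic_ns():
    """Read a clock that is not stepped when the wall clock is set.

    CLOCK_MONOTONIC_RAW is preferred where the platform offers it;
    otherwise fall back to Python's generic monotonic clock.
    """
    clk = getattr(time, "CLOCK_MONOTONIC_RAW", None)
    if clk is not None:
        return time.clock_gettime_ns(clk)
    return time.monotonic_ns()


class FrameClock:
    """Hypothetical helper: maps elapsed monotonic time to frame numbers."""

    def __init__(self, start_fn, now_ns=None):
        self.start_fn = start_fn
        self.start_ns = monotonic_ns() if now_ns is None else now_ns

    def frame_at(self, now_ns):
        """Frame number that is on air at monotonic time now_ns."""
        elapsed = now_ns - self.start_ns
        frames = elapsed * 26 // NS_PER_26_FRAMES
        return (self.start_fn + frames) % GSM_HYPERFRAME

    def deadline_ns(self, fn_offset):
        """Monotonic time at which the frame fn_offset after start begins."""
        # Ceiling division, so the deadline never lands before the frame.
        return self.start_ns + -(-fn_offset * NS_PER_26_FRAMES // 26)
```

Computing each deadline from the start time, instead of adding a
rounded per-frame interval over and over, keeps rounding errors from
accumulating. It does nothing about the drift between the PC clock and
the radio's sample clock, which is a separate problem.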
So OsmoTRX and osmo-bts-trx actually run asynchronously, at something
that is specified/designed to be a synchronous interface in the GSM
architecture.

But then, I guess we don't have the luxury of changing all of this, so
we are left with migrating to something like CLOCK_MONOTONIC_RAW or
CLOCK_MONOTONIC. Instead of osmocom timers, using
timer_create(CLOCK_MONOTONIC, ...) sounds like a good idea, or even
timerfd_create(), which would integrate with our select() loop.

The only problem is that those are periodic timers. While we do want
periodicity (once every burst period of 577us), the local clock of the
Linux system is >= 1000 times less accurate than the clock of the GSM
transmitting hardware, i.e. we need to adjust the expiration of our
timer based on clock information provided by osmo-trx.

> Besides that, I have no idea what could cause the clock skews, except maybe
> that the CPU or the USB are not fast enough??

Where is the evidence of that?

* do we get underruns / overruns in reading/writing from/to the SDR?
** if this is not properly logged yet, we should make sure all such
   instances are properly logged, and that we have a counter that
   counts such events since the process start. Printing of related
   counters could be done at time of sending a signal to the process,
   or in periodic intervals (every 10s?) on stdout
* do we see indications of packet loss between TRX and BTS?
** each UDP packet on the per-TRX data interface contains frame number
   and timeslot index in its header, so detecting missing frames is
   easy, whether or not this is currently already implemented.

Regards,
	Harald
-- 
- Harald Welte <laforge at gnumonks.org>           http://laforge.gnumonks.org/
============================================================================
"Privacy in residential applications is a desirable marketing option."
                                                  (ETSI EN 300 175-7 Ch. A6)
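The missing-frame check from the last bullet could be sketched roughly
as follows. This is Python for illustration only, and BurstGapCounter
with its methods is a hypothetical name, not something that exists in
osmo-bts-trx or osmo-trx. The sketch assumes that the frame number and
timeslot index have already been parsed out of the datagram header, and
that an active timeslot produces one datagram per TDMA frame; both are
assumptions of the example.

```python
GSM_HYPERFRAME = 2048 * 26 * 51   # frame numbers wrap after 2715648 frames


class BurstGapCounter:
    """Hypothetical per-TRX counter of bursts that never arrived."""

    def __init__(self):
        self.last_fn = {}       # timeslot index -> last frame number seen
        self.missing = 0        # bursts presumed lost since process start
        self.out_of_order = 0   # late or duplicate bursts since process start

    def observe(self, fn, tn):
        """Record one received burst; return how many were skipped before it."""
        prev = self.last_fn.get(tn)
        if prev is None:
            # First burst on this timeslot: nothing to compare against.
            self.last_fn[tn] = fn
            return 0
        step = (fn - prev) % GSM_HYPERFRAME
        if step == 0 or step > GSM_HYPERFRAME // 2:
            # Same or older frame number: count it, keep the newer reference.
            self.out_of_order += 1
            return 0
        self.last_fn[tn] = fn
        self.missing += step - 1
        return step - 1

    def report(self):
        """One-line summary, e.g. for periodic printing on stdout."""
        return f"missing={self.missing} out_of_order={self.out_of_order}"
```

Tracking each timeslot separately means that inactive timeslots do not
show up as loss, and the modulo arithmetic keeps the frame number wrap
from being counted as a gap. Calling report() every 10 seconds or so,
or from a signal handler, would give the kind of counter output
suggested above.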