Fixing power off in osmo-trx

This is merely a historical archive of years 2008-2021, before the migration to mailman3.

A maintained and still updated list archive can be found at https://lists.osmocom.org/hyperkitty/list/OpenBSC@lists.osmocom.org/.

Alexander Chemeris alexander.chemeris at gmail.com
Sun Jul 14 20:01:08 UTC 2013


Hi Thomas,

I've made some progress in making osmo-trx able to power off/on
without restart. At this moment it's almost there - the only big
remaining issue I see is some bug with UHD device start/stop. It seems
we do not re-initialize it properly and assert (num_rd == OUTCHUNK)
fails. I hope you could fix that.

I completely re-wrote Thread class to make starting/stopping threads a
synchronous reliable operation. I also introduced shutdown() function
to FIFOServiceLoopThread, ControlServiceLoopThread and
TransmitPriorityQueueServiceLoopThread. This method takes care of
un-blocking a thread from a blocking read from a queue or a socket.
Few bugs in the other code is fixed as well. Details are below, please
review the patches.

Patch set is checked into https://github.com/fairwaves/osmo-trx under
achemeris/stable_threads and achemeris/poweroff_hack branches.

> commit c5da6607b4d17706380de72e0814e28f03624c51
> Author: Alexander Chemeris <Alexander.Chemeris at gmail.com>
> Date:   Sun Jul 14 14:57:14 2013 +0400
>
>     Transceiver52M: Fix crash in uhd_device destructor due to deleting statically allocated memory.

This is a genuine bug, should be fixed upstream.

> commit dbd27a60b6ed99fd6fd2339ffafccc0d759fa2fc
> Author: Alexander Chemeris <Alexander.Chemeris at gmail.com>
> Date:   Sat Jul 13 21:26:39 2013 +0400
>
>     CommonLibs: Fix compile time warnings.

Fixes annoying warning.

> commit 8eaa40dd6c2a783306d809aec98c2937e15068f3
> Author: Alexander Chemeris <Alexander.Chemeris at gmail.com>
> Date:   Sun Jul 14 16:23:32 2013 +0400
>
>     Transceiver52M: Remove unused thread mAlignRadioServiceLoopThread;

This actually leads to crash when Thread class is abstract.

> commit b8646946526903d2bbfa657690936b9bcba80ca5
> Author: Alexander Chemeris <Alexander.Chemeris at gmail.com>
> Date:   Sat Jul 13 22:29:12 2013 +0400
>
>     CommonLibs: DEBUG: Mark interthread classes which are not used in OsmoTRX.

I've added this comments just to simplify debugging. I.e. I don't want
to debug interthread primitives which we do not use.

> commit 69b6a6dfcd882df180e9d5e9856225f4b1d2eeed
> Author: Alexander Chemeris <Alexander.Chemeris at gmail.com>
> Date:   Sun Jul 14 01:43:04 2013 +0400
>
>     CommonLibs: Allow NULLs to be retrieved from InterthreadQueue.
>
>     We need a way to stop InterthreadQueue blocking read to be able to shutdown
>     a thread. The easiest way to do that is to push NULL to the queue, but the
>     original implementation will just ignore that and continue blocking. After
>     the change the blocking read() will exit with NULL result which is perfectly
>     fine with us.
>
>     Ideally we should change all methods of InterthreadQueue to return a status
>     value to indicate normal exits, timeouts, etc. Right now the only way to
>     indicate an error is returning NULL, which could be a valid operation.
>
> commit d4a8c1360e5d54188b648766dde199fa37f2ed27
> Author: Alexander Chemeris <Alexander.Chemeris at gmail.com>
> Date:   Sun Jul 14 02:00:00 2013 +0400
>
>     CommonLibs: Add shutdown() method to the DatagramSocket class.
>
>     This method is useful to abort a blocking operations on a socket during shutdown.
>
> commit 302a9198df85e4f771368f24d8445dec7c8b2203
> Author: Alexander Chemeris <Alexander.Chemeris at gmail.com>
> Date:   Sun Jul 14 12:04:44 2013 +0400
>
>     CommonLibs: Signal::wait() should return the value which pthread_cond_timedwait() returns.
>
> commit e279124660a2add686e6c9266d18a9bb0cf43258
> Author: Alexander Chemeris <Alexander.Chemeris at gmail.com>
> Date:   Sun Jul 14 12:38:25 2013 +0400
>
>     CommonLibs: Rewrite Threads class to be predictable during start/stop.
>
>     Current implementation strictly synchronize thread startup and shutdown
>     to make sure a thread is actually started on startThread() exit and
>     that it's stopped on stopThread() exit (or an error is returned).
>
>     I also changed the Thread class to be used as a parent class for
>     an actual thread. This way we could provide much better logical
>     separation between variables used by different threads. Which will
>     (hopefully) reduce number of issues with inter-thread communications.
>
> commit de202e343564913e0591de10850e617d7de55b6b
> Author: Alexander Chemeris <Alexander.Chemeris at gmail.com>
> Date:   Sun Jul 14 12:42:52 2013 +0400
>
>     Transceiver52M: Port all threads to the new Thread interface.
>
>     Also note that we introduced shutdown() method for Transceiver threads which
>     implement proper shutdown of threads when they are in blocking read state.
>     This involves using shutdown() on sockets and pushing NULL to queues.
>
>     With this change we should be able to start/stop transceiver channels at
>     arbitrary moments.
>
> commit 129ad76b15f1fdf0588d5a24fe2d0f7ef3df274e
> Author: Alexander Chemeris <Alexander.Chemeris at gmail.com>
> Date:   Sun Jul 14 12:48:37 2013 +0400
>
>     Transceiver52M: Delay is no longer needed at the transceiver exit, threads should shut down cleanly.
>
> commit 604f65e69f2317dd5dc1aeb8faf3c107cd8a2231
> Author: Alexander Chemeris <Alexander.Chemeris at gmail.com>
> Date:   Sun Jul 14 13:22:31 2013 +0400
>
>     CommonLibs: Port InterthreadTest and SocketsTest to the new Thread API.
>
> commit c4038ef54c806c8c00f7fa28bc2b8b200bc3356f
> Author: Alexander Chemeris <Alexander.Chemeris at gmail.com>
> Date:   Sun Jul 14 13:24:16 2013 +0400
>
>     CommonLibs: Adding a new ThreadsTest testsuite.
>
>     It's very basic at this moment. We should add a stress-test for thread
>     start/stop at least.

This is the meat of the branch.
At this stage everything works fine, threads are shutdown properly,
but power off command still doesn't power off the transceiver.

> commit c0f0da4ec8e0068a562f427a89a899bf190bc74a
> Author: Alexander Chemeris <Alexander.Chemeris at gmail.com>
> Date:   Sun Jul 14 23:07:31 2013 +0400
>
>     Transceiver52M: HACK: Exit transceiver on POWEROFF command.
>
>     This hack allows OsmoBTS to correctly restart itself with different parameters.
>     We assume that transceiver is run as a service and will be restarted on exit.
>
> commit 806a64b1df40871543e9ae34386463d08a564147
> Author: Alexander Chemeris <Alexander.Chemeris at gmail.com>
> Date:   Sun Jul 14 22:58:55 2013 +0400
>
>     Transceiver52M: Start and stop all threads on POWERON/POWEROFF.
>
>     This change reveals some issue with the UHD start/stop code. Specifically,
>     we get a failed assert in RadioInterface::pullBuffer() (num_rd == OUTCHUNK
>     with num_rd=0) with the POWERON/POWEROFF/POWERON sequence.

These patches implements actual power off command. They are split into
a separate branch, because they lead to assert failure I mentioned
above. In my setup it's 100% reproducible. I hope you could fix that.

> commit 4c39dd4b492b439742911aa70623d41ad634d7f4
> Author: Alexander Chemeris <Alexander.Chemeris at gmail.com>
> Date:   Sun Jul 14 22:45:40 2013 +0400
>
>     Transceiver52M: Introduce usage counting for DriveLoop and RadioInterface.
>
>     The reason is to allow them to be started from any TRX. I.e. they are started
>     when the first TRX starts and stopped when the last TRX stops.

This hack is a workaround for the assert failure above - it shuts down
the complete transceiver in a hope that it will be restarted by
external means. Meant as to be used in production while we don't have
the assert failure fixed.

--
Regards,
Alexander Chemeris.
CEO, Fairwaves LLC / ООО УмРадио
http://fairwaves.ru




More information about the OpenBSC mailing list