Fwd: DSP optimization


Alexander Chemeris alexander.chemeris at gmail.com
Tue Jul 9 14:46:16 UTC 2013


Hi all,

I'm moving this conversation from private to public. I think Sylvain
and Andreas might be interested in participating.

---------- Forwarded message ----------
From: Thomas Tsou <tom at tsou.cc>
Date: Tue, Jul 9, 2013 at 2:51 AM
Subject: Re: DSP optimization
To: Alexander Chemeris <alexander.chemeris at gmail.com>


On Mon, Jul 8, 2013 at 6:52 AM, Alexander Chemeris
<alexander.chemeris at gmail.com> wrote:
> Wow, a 5-16x speedup for Viterbi is huge indeed. I wonder what the
> results would be for our Atoms.

Without SSE, with just the C-only butterfly, the improvement is around
4x. SSE3 (Atom) forces a small change to the normalization (not
separated out yet), but the results weren't far off from SSE 4.1 when
I tested on a Core 2 Duo.
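
For readers unfamiliar with the term, a C-only butterfly can be
sketched roughly as below. This is a minimal illustration, not the
code from the attached tarball. It assumes path metrics are maximized
and that every generator polynomial taps both the newest and the
oldest register bit, so the four branch metrics of one butterfly
reduce to +bm and -bm. The name acs_butterfly and its signature are
hypothetical.

```c
/*
 * One add-compare-select (ACS) butterfly.
 *
 * Old states i ("lo") and i + N/2 ("hi") both feed new states 2i
 * ("even") and 2i + 1 ("odd"). Under the symmetry assumption above,
 * the branches lo->even and hi->odd carry +bm, while lo->odd and
 * hi->even carry -bm.
 *
 * Returns the decision bits: bit 0 is set if "hi" survived into the
 * even state, bit 1 is set if "hi" survived into the odd state.
 */
unsigned acs_butterfly(int m_lo, int m_hi, int bm,
                       int *new_even, int *new_odd)
{
    int a = m_lo + bm;  /* lo -> even */
    int b = m_hi - bm;  /* hi -> even */
    int c = m_lo - bm;  /* lo -> odd  */
    int d = m_hi + bm;  /* hi -> odd  */
    unsigned dec = 0;

    if (b > a) {
        *new_even = b;
        dec |= 1u;
    } else {
        *new_even = a;
    }

    if (d > c) {
        *new_odd = d;
        dec |= 2u;
    } else {
        *new_odd = c;
    }

    return dec;
}
```

A full decoder would run N/2 of these per received symbol and store
the decision bits for the traceback. An SSE version evaluates several
butterflies in parallel on packed 16-bit metrics.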

I might try to manipulate the interface to read in the state tables
instead of the generator polynomials. That would really help with
testing and integration, but I'm not sure yet. There are many ways to
go here.
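
To make the distinction concrete, here is a rough sketch of deriving
a state (output) table from generator polynomials. The function name
gen_output_table, the table layout, and the bit conventions are
assumptions chosen for illustration, not the libosmocore interface.

```c
#include <stdint.h>

/* Parity (XOR of all bits) of v. */
static unsigned parity(unsigned v)
{
    unsigned p = 0;

    while (v) {
        p ^= v & 1u;
        v >>= 1;
    }
    return p;
}

/*
 * Fill table[state * 2 + bit] with the n coded output bits produced
 * when `bit` is shifted into `state`.
 *
 * Conventions (assumptions of this sketch):
 *  - `state` holds the previous k - 1 input bits, newest in the LSB;
 *  - the full register is (state << 1) | bit;
 *  - bit j of a polynomial taps the input delayed by j (D^j);
 *  - outputs are packed with polys[0] in the most significant
 *    position, so n must be at most 8.
 *
 * `table` must have room for 2^k entries.
 */
void gen_output_table(const unsigned *polys, unsigned n, unsigned k,
                      uint8_t *table)
{
    unsigned num_states = 1u << (k - 1);
    unsigned state, bit, p;

    for (state = 0; state < num_states; state++) {
        for (bit = 0; bit < 2; bit++) {
            unsigned reg = (state << 1) | bit;
            unsigned out = 0;

            for (p = 0; p < n; p++)
                out = (out << 1) | parity(reg & polys[p]);

            table[state * 2 + bit] = (uint8_t)out;
        }
    }
}
```

With an interface that takes such a table directly, a test harness
could feed identical tables to two decoder implementations and
compare the outputs, without both having to agree on a polynomial
bit ordering.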

> What is problematic with the runtime detection? CPU autodetection on
> Linux should be as easy as reading /proc/cpuinfo. The issue I do see
> is setting up the build system correctly to generate all versions in
> the same run. I think we could leave CPU autodetection for the
> "everything else" milestone, using compile-time selection for now.

I think compile-time detection is more appropriate. For GSM / LTE
we're almost always dealing with fixed-size vectors and not odd
calculations (e.g. a 1023-point FFT), so it's unlikely that the
results will change on repeated runs.

/proc/cpuinfo parsing scripts I've seen have been prone to breakage.
If you have a really good one, let me know. I usually prefer to run
configure checks against the actual instruction, but that can get
messy with a lot of checks. Anyhow, I'm not worrying about this now.
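
As an illustration of the two approaches without parsing
/proc/cpuinfo, the sketch below selects a SIMD level at compile time
from the compiler's predefined macros and, separately, queries the
CPU at run time. It assumes GCC (4.8 or later) or a compatible
compiler on x86 for the run-time part, where __builtin_cpu_supports()
is available. The enum and function names are hypothetical.

```c
enum simd_level {
    SIMD_NONE   = 0,  /* plain C fallback */
    SIMD_SSE3   = 1,
    SIMD_SSE4_1 = 2
};

/*
 * Compile-time selection: reports which path the compiler flags
 * (e.g. -msse3, -msse4.1, -march=...) enabled for this build.
 */
enum simd_level simd_compiled(void)
{
#if defined(__SSE4_1__)
    return SIMD_SSE4_1;
#elif defined(__SSE3__)
    return SIMD_SSE3;
#else
    return SIMD_NONE;
#endif
}

/*
 * Run-time detection: asks the CPU itself via the compiler built-ins.
 * On other compilers or architectures this sketch simply reports
 * SIMD_NONE.
 */
enum simd_level simd_runtime(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.1"))
        return SIMD_SSE4_1;
    if (__builtin_cpu_supports("sse3"))
        return SIMD_SSE3;
#endif
    return SIMD_NONE;
}
```

Run-time dispatch would additionally require each variant to be built
in its own translation unit with its own flags, which is the build
system complication mentioned above.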

> What repository will you push to? We need to have at least master
> branch and dual-channel branch working with the optimizations. And I
> believe everyone would be happy to see optimizations in the
> libosmocore for the benefit of other projects as well. I don't foresee
> any issues with a slight change in the API of libosmocore if it is
> justified - just send an RFC/patch to the OpenBSC mailing list and it
> will be reviewed.

Non-Viterbi changes are sigProc.cpp changes only, so they are not
branch-specific - they will probably merge into the oldest available
OpenBTS releases. The Viterbi changes merge into Andreas's branch,
which is a very large change. For now, somebody needs to write it,
which is why I'm considering making the interfaces match.

Attached are the standalone unit test cases for SSE 4.2. As previously
mentioned, Atom needs SSE3 only. I'll add the ifdefs for those
shortly. I don't know if there's an appropriate repository for these
right now - linking libosmocore from the transceiver for comparison
purposes only seems silly. I just generated a temporary tarball for
the time being.

  Thomas


--
Regards,
Alexander Chemeris.
CEO, Fairwaves LLC / ООО УмРадио
http://fairwaves.ru
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sse-tests-0.1.tar.gz
Type: application/x-gzip
Size: 94422 bytes
Desc: not available
URL: <http://lists.osmocom.org/pipermail/openbsc/attachments/20130709/0aff2837/attachment.bin>

