[PATCH 1/4] core/conv: Add optimized Viterbi decoding

historical

On Wed, May 7, 2014 at 12:25 PM, Sylvain Munaut <246tnt at gmail.com> wrote:
> Some quick things I saw from a quick glance though:
>  - Some things that should be 'static' are not and they pollute the
> global namespace

Indeed. Internal calls in viterbi_gen.c should be static with leading
underscores stripped since that seems to be the preferred style. If
you were referring to others, please let me know.

>  - Anything that is in the global namespace needs the osmo_ (or even
> osmo_conv_ here) prefix

Ok. I purposely kept the internal struct definitions out of the public
interface, but prefixing and exposing is fine.

>  - Did you check (and test in make test) that it works for all codes
> that matches
>    "if ((code->N <= 4) && ((code->K == 5) || (code->K == 7)))" ?

Almost all codes used by osmo-bts and a few others are covered by make
check. Rates N=5 are skipped and TCH/HR passes with the posted
puncturing fix for osmo-bts. There is an additional set of out-of-tree
tests for Guassian noise, benchmarks, and some other things that
probably don't apply to libosmocore.

https://github.com/ttsou/osmo-conv-test

>    Also, to avoid exporting those internal symbols, I'm wondering if
> #including the viterbi_gen.c from viterbi.c instead of compiling
> separately wouldn't be better. But that's just a random idea ...

I see three approaches:

1. Single file for generic and architecture specific implementations
with large ifdef blocks.
2. Separate files with exported symbols (current patch).
3. Separate files with restricted visibility with
__attribute__((__visibility__("hidden"))) or other combinations with
the gcc -fvisibility flag.

>> A beneficial change would be to modify the API to allow persistent
>> decoder objects instead of performing an allocation and tear-down on
>> every decoding call. From a performance standpoint, the current
>> implementation is unnecessarily restricted by memory access and page
>> faults for that reason.
>
> You can't change the existing API without breaking existing stuff, but you can
> add a new all-in-one function that allows to give a persistent decoder object.

I was just noting performance observations. For example, the SSE K=5
case on my machine is memory bound and not limited by compute
resources. Given the effort to squeeze the current version behind the
existing API, I won't be making breaking changes anytime soon.

 -TT