On Wed, May 7, 2014 at 12:25 PM, Sylvain Munaut 246tnt@gmail.com wrote:
Some quick things I saw from a quick glance though:
- Some things that should be 'static' are not and they pollute the
global namespace
Indeed. Internal calls in viterbi_gen.c should be static with leading underscores stripped since that seems to be the preferred style. If you were referring to others, please let me know.
- Anything that is in the global namespace needs the osmo_ (or even
osmo_conv_ here) prefix
Ok. I purposely kept the internal struct definitions out of the public interface, but prefixing and exposing is fine.
- Did you check (and test in make test) that it works for all codes
that matches "if ((code->N <= 4) && ((code->K == 5) || (code->K == 7)))" ?
Almost all codes used by osmo-bts and a few others are covered by make check. Rates N=5 are skipped and TCH/HR passes with the posted puncturing fix for osmo-bts. There is an additional set of out-of-tree tests for Guassian noise, benchmarks, and some other things that probably don't apply to libosmocore.
https://github.com/ttsou/osmo-conv-test
Also, to avoid exporting those internal symbols, I'm wondering if #including the viterbi_gen.c from viterbi.c instead of compiling separately wouldn't be better. But that's just a random idea ...
I see three approaches:
1. Single file for generic and architecture specific implementations with large ifdef blocks. 2. Separate files with exported symbols (current patch). 3. Separate files with restricted visibility with __attribute__((__visibility__("hidden"))) or other combinations with the gcc -fvisibility flag.
A beneficial change would be to modify the API to allow persistent decoder objects instead of performing an allocation and tear-down on every decoding call. From a performance standpoint, the current implementation is unnecessarily restricted by memory access and page faults for that reason.
You can't change the existing API without breaking existing stuff, but you can add a new all-in-one function that allows to give a persistent decoder object.
I was just noting performance observations. For example, the SSE K=5 case on my machine is memory bound and not limited by compute resources. Given the effort to squeeze the current version behind the existing API, I won't be making breaking changes anytime soon.
-TT