Hi Tom,
- Some things
that should be 'static' are not and they pollute the
global namespace
Indeed. Internal calls in viterbi_gen.c should be static with leading
underscores stripped since that seems to be the preferred style. If
you were referring to others, please let me know.
I don't mind the leading underscore honestly, especially in static, but if
Holger prefers it that way, it's fine with me too.
When I do a full review of the patch, I will look in more details into
each call.
- Anything
that is in the global namespace needs the osmo_ (or even
osmo_conv_ here) prefix
Ok. I purposely kept the internal struct definitions out of the public
interface, but prefixing and exposing is fine.
Not what I meant. 'structs' don't end up in the global namespace as long
as they're not used in global function signatures, so you can keep them as
is. I was thinking about functions.
Among all those not marked 'static' and don't have an osmo_ prefix, I
didn't
yet do an exhaustive check to see what was an oversight and what's really
needed to be global.
- Did you
check (and test in make test) that it works for all codes
that matches
"if ((code->N <= 4) && ((code->K == 5) || (code->K ==
7)))" ?
Almost all codes used by osmo-bts and a few others are covered by make
check. Rates N=5 are skipped and TCH/HR passes with the posted
puncturing fix for osmo-bts.
I was more thinking about random codes. Like if someone invents a random
code with N=4 and K=5, will it work ? When implementing it, you didnt see
any other limitations that those right ?
Also, to
avoid exporting those internal symbols, I'm wondering if
#including the viterbi_gen.c from viterbi.c instead of compiling
separately wouldn't be better. But that's just a random idea ...
I see three approaches:
1. Single file for generic and architecture specific implementations
with large ifdef blocks.
2. Separate files with exported symbols (current patch).
3. Separate files with restricted visibility with
__attribute__((__visibility__("hidden"))) or other combinations with
the gcc -fvisibility flag.
I think we'll need to play with 3. Still probably need a prefix (not
the full "osmo_conv_", but something less conflicty thatn gen_)
I was just noting performance observations. For
example, the SSE K=5
case on my machine is memory bound and not limited by compute
resources. Given the effort to squeeze the current version behind the
existing API, I won't be making breaking changes anytime soon.
When I designed the API, I did do benchmark and just repeated them, and
there is virtually no difference between the two versions. ( < 1% )
But I guess that's because in the current version, the 'decoder' struct
stores virtually nothing that can be re-used between different bursts.
(and malloc itself is pretty fast nowadays)
In your case there is the treillis derived from the 'code' struct. And I
guess this is where the performance hits comes from. An option would be
to have an internal 'cache' of codes that have been used in the past
so those data would be re-used.
Basic flow of a decoder_init would be:
{
decoder = calloc(sizeof(decoder));
treillis = get_from_cache(code)
if (!treillis)
treillis = put_in_cache(gen_treillis(code));
decoder->treillis = treillis;
}
Since the number of different codes used in a single software is not
big, this should be
fairly easy to implement.
Cheers,
Sylvain