On Wed, Jul 10, 2013 at 3:46 PM, Sylvain Munaut <246tnt(a)gmail.com> wrote:
> Ideally the code should just end up in libosmocore (being compiled if
> supported), and just "work" with any code that previously worked,
> possibly falling back to the reference impl at runtime if something
> weird is not supported.
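For illustration, the runtime fallback described above could look roughly like the sketch below. It is only a sketch under assumptions: the kernel signature and the names `metric_fn`, `metric_ref`, `metric_opt`, and `select_metric` are hypothetical, and the "optimized" kernel is a plain C stand-in rather than real SSE code. The CPU check uses the GCC/Clang builtin `__builtin_cpu_supports` on x86 and reports "not supported" everywhere else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel signature: correlate soft bits with expected symbols. */
typedef int32_t (*metric_fn)(const int8_t *soft, const int8_t *expect, size_t n);

/* Reference implementation: plain C, always available. */
static int32_t metric_ref(const int8_t *soft, const int8_t *expect, size_t n)
{
	int32_t sum = 0;
	for (size_t i = 0; i < n; i++)
		sum += (int32_t)soft[i] * expect[i];
	return sum;
}

/* Stand-in for an optimized kernel (a real one would use SSE intrinsics).
 * It must return exactly what the reference returns. */
static int32_t metric_opt(const int8_t *soft, const int8_t *expect, size_t n)
{
	int32_t sum = 0;
	size_t i = 0;
	for (; i + 4 <= n; i += 4)
		sum += soft[i] * expect[i] + soft[i + 1] * expect[i + 1]
		     + soft[i + 2] * expect[i + 2] + soft[i + 3] * expect[i + 3];
	for (; i < n; i++)
		sum += soft[i] * expect[i];
	return sum;
}

/* Nonzero if the CPU advertises SSE3 (GCC/Clang builtin, x86 only). */
static int cpu_has_sse3(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
	__builtin_cpu_init();
	return __builtin_cpu_supports("sse3") != 0;
#else
	return 0;	/* unknown platform: use the reference path */
#endif
}

/* Pick a kernel once; callers never need to know which one they got. */
static metric_fn select_metric(void)
{
	metric_fn fn = cpu_has_sse3() ? metric_opt : metric_ref;
	assert(fn != NULL);
	return fn;
}
```

A caller would store the result of `select_metric()` once at init time and call through the pointer afterwards, so existing code keeps working whether or not the optimized path is available.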
The optimized portions, the path metric unit (PMU) and branch metric
unit (BMU), can stand independently, and my preferred approach would be
to integrate those pieces with the
existing trellis parts intact. That assumes that the accumulated path
metrics, branch metrics, and path decisions are each represented by
contiguously allocated 16-bit values. The downside of such an approach
is that it would probably take far more time to integrate than the
time to actually create the optimizations.
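To make the layout assumption concrete, here is a scalar sketch of one add-compare-select step over contiguous 16-bit arrays. It is not the SSE code itself. It assumes a hypothetical 4-state code with the usual butterfly symmetry (previous states 2i and 2i+1 feed next states i and i + N/2, with branch metrics +bm[i] and -bm[i]), stores decisions as 0 or 1, and leaves out metric normalization. The names `NUM_STATES` and `acs_step` are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_STATES 4	/* hypothetical: K = 3 code, 2^(K-1) states */

/* One trellis step. All buffers are contiguous int16_t arrays:
 *   bm[NUM_STATES / 2]  branch metrics, one per butterfly
 *   pm_in[NUM_STATES]   accumulated path metrics before the step
 *   pm_out[NUM_STATES]  accumulated path metrics after the step
 *   dec[NUM_STATES]     decisions: 0 = came from state 2i, 1 = from 2i+1
 * Larger metric wins. No normalization, so long runs would overflow. */
static void acs_step(const int16_t *bm, const int16_t *pm_in,
		     int16_t *pm_out, int16_t *dec)
{
	const int half = NUM_STATES / 2;

	assert(pm_in != pm_out);	/* not an in-place update */

	for (int i = 0; i < half; i++) {
		/* Two candidates for next state i */
		int m0 = pm_in[2 * i] + bm[i];
		int m1 = pm_in[2 * i + 1] - bm[i];
		/* Two candidates for next state i + half */
		int m2 = pm_in[2 * i] - bm[i];
		int m3 = pm_in[2 * i + 1] + bm[i];

		dec[i] = (int16_t)(m1 > m0);
		pm_out[i] = (int16_t)(dec[i] ? m1 : m0);

		dec[i + half] = (int16_t)(m3 > m2);
		pm_out[i + half] = (int16_t)(dec[i + half] ? m3 : m2);
	}
}
```

Because every input and output is a flat run of 16-bit values, the loop body maps directly onto packed 16-bit add, subtract, and max operations, which is why the layout matters for integration.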
> The trellis scan is very similar in all cases. The only thing to
> support is the puncturing during the scan (I wanted to avoid having
> to de-puncture first).
What is the benefit of this? I explicitly separated puncturing because
puncturing is inherently not part of the Viterbi algorithm.
Admittedly, though, I'm probably biased because for LTE I can remove
the puncturing unit entirely and directly replace it with a
rate-matching unit.
The SSE optimization operates vertically through the trellis, so
in-scan puncturing can be accommodated.
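As a sketch of what keeping puncturing outside the decoder means, a separate de-puncturing pass could look like the following. It assumes signed 8-bit soft bits where 0 means "no information", and a hypothetical periodic keep-mask `punc` (1 = transmitted, 0 = punctured). The function name and signature are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand a punctured soft-bit stream to the mother-code length n_out.
 * Punctured positions are filled with 0 so they add nothing to any
 * branch metric. Returns the number of input soft bits consumed. */
static size_t depuncture(const int8_t *in, size_t n_in,
			 int8_t *out, size_t n_out,
			 const uint8_t *punc, size_t period)
{
	size_t used = 0;

	assert(period > 0);

	for (size_t i = 0; i < n_out; i++) {
		if (punc[i % period]) {
			assert(used < n_in);	/* caller supplied too few bits */
			out[i] = in[used++];
		} else {
			out[i] = 0;		/* erasure */
		}
	}
	return used;
}
```

The decoder then only ever sees full-rate input. An in-scan variant would apply the same mask inside the branch metric computation instead of materializing `out`.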
> It also currently supports progressive decoding (i.e. you can start
> decoding a message before you have all of it if you want, you just
> feed it chunk by chunk and it can resume where it left off).
If not enough bits exist to decode a message, why attempt to partially
decode it instead of waiting for the other bits? I could add support
for this; it's just not a use case that I ever encountered.
> Most of the changes between variations come from the way the
> termination is handled but since that's always only a few symbols,
> that could be left alone unoptimized (assuming the internal state is
> compatible enough).
Right. I don't think termination cases (or tail cases in general) are
worth handling with any optimized code.
Thomas