On Tue, Apr 29, 2014 at 12:12:09AM -0400, Thomas Tsou wrote:
Hi,
Add a separate, faster convolutional decoding implementation for rates up to N=4 and constraint lengths of K=5 and K=7, which covers most GSM code uses. The decoding algorithm exploits the symmetric structure of the Viterbi add-compare-select (ACS) operation - commonly known as the ACS butterfly. This shift-register optimization can be found in the well-known text by Dave Forney.
I am not knowledgeable enough to comment on the actual Viterbi things, so I will focus on the things around it.
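For readers who have not met the ACS butterfly, here is a minimal sketch of one butterfly step. It is not taken from the patch. The function name, the state numbering (predecessors 2*i and 2*i+1 feeding successors i and i + num_states/2) and the +metric/-metric pattern are assumptions for illustration. That sign pattern applies when every generator polynomial has both its first and last tap set.

```c
#include <stdint.h>

/* Hypothetical single ACS butterfly, maximizing the path metric.
 * Predecessors 2*i and 2*i+1 share one branch metric up to sign,
 * so four candidate sums need only one metric value. */
static void acs_butterfly(int i, int num_states, int16_t metric,
			  const int16_t *sums, int16_t *new_sums,
			  int16_t *paths)
{
	int s0 = sums[2 * i];
	int s1 = sums[2 * i + 1];

	/* Add: candidate sums for the two successor states */
	int lo0 = s0 + metric, lo1 = s1 - metric;
	int hi0 = s0 - metric, hi1 = s1 + metric;

	/* Compare-select: keep the larger sum, record which predecessor won */
	new_sums[i] = (int16_t) (lo0 >= lo1 ? lo0 : lo1);
	paths[i] = lo0 >= lo1 ? 0 : 1;

	new_sums[i + num_states / 2] = (int16_t) (hi0 >= hi1 ? hi0 : hi1);
	paths[i + num_states / 2] = hi0 >= hi1 ? 0 : 1;
}
```

A full decoder step would call this for i = 0 .. num_states/2 - 1, which is the loop an SSE version can process several butterflies at a time.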
+/* Aligned Memory Allocator
+ * SSE requires 16-byte memory alignment. We store relevant trellis values
+ * (accumulated sums, outputs, and path decisions) as 16 bit signed integers
+ * so the allocated memory is casted as such.
+ */
+#define SSE_ALIGN 16
+
+static int16_t *vdec_malloc(size_t n)
+{
+#ifdef HAVE_SSE3
+	return (int16_t *) memalign(SSE_ALIGN, sizeof(int16_t) * n);
+#else
+	return (int16_t *) malloc(sizeof(int16_t) * n);
+#endif
+}
argh, it would be nice if you could use talloc here but then we would need to play games and align pointers ourselves. Maybe change the API to at least have a 'ctx' similar to other talloc API?
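To make "align pointers ourselves" concrete, here is a rough sketch of what that could look like. The names vdec_malloc_aligned and vdec_free_aligned are made up for illustration, and plain malloc() stands in for whatever context-based allocator would really be used. The idea is to over-allocate, round the address up to the next 16-byte boundary, and remember the raw pointer just in front of the aligned block.

```c
#include <stdint.h>
#include <stdlib.h>

#define SSE_ALIGN 16

/* Hypothetical helper: returns a 16-byte aligned buffer of n int16_t
 * values carved out of a larger unaligned allocation. */
static int16_t *vdec_malloc_aligned(size_t n)
{
	size_t len = sizeof(int16_t) * n;
	/* Room for the payload, worst-case padding, and the stashed pointer */
	void *raw = malloc(len + SSE_ALIGN - 1 + sizeof(void *));
	uintptr_t addr;

	if (!raw)
		return NULL;

	/* Skip past the stash slot, then round up to the alignment */
	addr = (uintptr_t) raw + sizeof(void *);
	addr = (addr + SSE_ALIGN - 1) & ~(uintptr_t) (SSE_ALIGN - 1);

	/* Store the raw pointer directly before the aligned block */
	((void **) addr)[-1] = raw;

	return (int16_t *) addr;
}

/* Hypothetical counterpart: recover the raw pointer and release it */
static void vdec_free_aligned(int16_t *ptr)
{
	if (ptr)
		free(((void **) ptr)[-1]);
}
```

If the raw block were instead allocated from a talloc context, the stash could probably go away, since freeing the parent context releases its children as well.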
+static void free_trellis(struct vtrellis *trellis)
+{
+	if (!trellis)
+		return;
+
+	free(trellis->vals);
+	free(trellis->outputs);
+	free(trellis->sums);
+	free(trellis);
 
Can you use talloc here?
_traceback_rec(dec, state, out, len);
A leading _ is reserved for the system (in C, such identifiers are reserved for the implementation at file scope). We might want to avoid using that.
cheers
holger