On Tue, Apr 29, 2014 at 12:12:09AM -0400, Thomas Tsou wrote:
Hi,
Add a separate, faster convolutional decoding implementation for rates
up to N=4 and constraint lengths of K=5 and K=7, which covers most
GSM code uses. The decoding algorithm exploits the symmetric
structure of the Viterbi add-compare-select (ACS) operation - commonly
known as the ACS butterfly. This shift-register optimization can be
found in the well-known text by Dave Forney.
I am not knowledgeable enough to comment on the actual Viterbi parts,
so I will focus on the things around them.
+/* Aligned Memory Allocator
+ * SSE requires 16-byte memory alignment. We store relevant trellis values
+ * (accumulated sums, outputs, and path decisions) as 16 bit signed integers
+ * so the allocated memory is casted as such.
+ */
+#define SSE_ALIGN 16
+
+static int16_t *vdec_malloc(size_t n)
+{
+#ifdef HAVE_SSE3
+ return (int16_t *) memalign(SSE_ALIGN, sizeof(int16_t) * n);
+#else
+ return (int16_t *) malloc(sizeof(int16_t) * n);
+#endif
+}
Argh, it would be nice if you could use talloc here, but then we would
need to play games and align the pointers ourselves. Maybe change the API
to at least take a 'ctx' argument, similar to other talloc APIs?
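To illustrate what "aligning pointers ourselves" could look like, here is a
minimal sketch using plain malloc(). The names vdec_malloc_aligned() and
vdec_free_aligned() are hypothetical and not part of the patch. With talloc,
the raw allocation would presumably come from a talloc call on 'ctx' instead
of malloc(), and the same over-allocate-and-round-up trick would apply.

```c
#include <stdint.h>
#include <stdlib.h>

#define SSE_ALIGN 16

/* Hypothetical helper (not in the patch): allocate n int16_t values on an
 * SSE_ALIGN boundary. The raw pointer returned by malloc() is stored in the
 * slot just before the aligned block so it can be recovered when freeing. */
static int16_t *vdec_malloc_aligned(size_t n)
{
	size_t bytes = sizeof(int16_t) * n;
	uintptr_t addr;
	void **slot;
	/* Payload + worst-case padding + room for the saved raw pointer */
	void *raw = malloc(bytes + SSE_ALIGN - 1 + sizeof(void *));

	if (!raw)
		return NULL;

	/* Skip the pointer slot, then round up to the next aligned address */
	addr = (uintptr_t) raw + sizeof(void *);
	addr = (addr + SSE_ALIGN - 1) & ~(uintptr_t) (SSE_ALIGN - 1);

	slot = (void **) addr - 1;
	*slot = raw;

	return (int16_t *) addr;
}

/* Hypothetical counterpart: free a block from vdec_malloc_aligned() */
static void vdec_free_aligned(int16_t *ptr)
{
	if (ptr)
		free(((void **) ptr)[-1]);
}
```

The cost is up to SSE_ALIGN - 1 + sizeof(void *) extra bytes per allocation,
and the aligned pointer must never be passed to free() directly.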
+static void free_trellis(struct vtrellis *trellis)
+{
+ if (!trellis)
+ return;
+
+ free(trellis->vals);
+ free(trellis->outputs);
+ free(trellis->sums);
+ free(trellis);
Can you use talloc here?
+ _traceback_rec(dec, state, out, len);
Identifiers with a leading _ are reserved for the implementation (compiler
and libc) at file scope. We might want to avoid using that.
cheers
holger