Hi all,
This patch set adds an optimized Viterbi decoder to libosmocore, with both an architecture-specific (Intel SSE) path and a generic path. The implementation covers codes with constraint lengths K=5 and K=7 and rates 1/4 to 3/4, which make up the majority of GSM use cases. Speedup over the current implementation ranges from 5x to 20x depending on the processor and code type. The API is unchanged.
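For readers who want a feel for what is being optimized, here is a minimal scalar sketch of a hard-decision Viterbi decoder for a K=5, rate 1/2, flushed code. It is not the code from the patch set. The function names (sketch_conv_encode, sketch_conv_decode) are hypothetical, the generator polynomials G0 = 1 + D^3 + D^4 and G1 = 1 + D + D^3 + D^4 are assumed from the GSM channel coding spec, and the Hamming-distance metric is a simplification.

```c
#include <stdint.h>

#define CONV_K      5                    /* constraint length */
#define CONV_STATES (1 << (CONV_K - 1))  /* 16 trellis states */
#define CONV_G0     0x19                 /* assumed: 1 + D^3 + D^4 */
#define CONV_G1     0x1b                 /* assumed: 1 + D + D^3 + D^4 */
#define MAX_STEPS   1024                 /* sketch limit on len + K - 1 */
#define METRIC_INF  (1 << 20)            /* marks states not yet reachable */

/* Parity (XOR of all bits) of a 5-bit register value. */
static unsigned parity5(unsigned v)
{
	v ^= v >> 4;
	v ^= v >> 2;
	v ^= v >> 1;
	return v & 1;
}

/* Encode len bits (one bit per byte), flushing with K-1 zeros.
 * Writes 2 * (len + K - 1) coded bits to out. */
void sketch_conv_encode(const uint8_t *in, int len, uint8_t *out)
{
	unsigned state = 0;

	for (int t = 0; t < len + CONV_K - 1; t++) {
		/* Bit i of reg holds the input delayed by i steps (D^i). */
		unsigned reg = (state << 1) | (t < len ? (in[t] & 1u) : 0u);

		out[2 * t]     = (uint8_t)parity5(reg & CONV_G0);
		out[2 * t + 1] = (uint8_t)parity5(reg & CONV_G1);
		state = reg & (CONV_STATES - 1);
	}
}

/* Hard-decision Viterbi decode of 2 * (len + K - 1) coded bits.
 * Returns the Hamming distance of the winning path, or -1 if the
 * input is too long. Not reentrant (static survivor storage). */
int sketch_conv_decode(const uint8_t *in, int len, uint8_t *out)
{
	static uint8_t dec[MAX_STEPS][CONV_STATES]; /* survivor decisions */
	int metric[CONV_STATES], next[CONV_STATES];
	int steps = len + CONV_K - 1;
	unsigned state = 0;

	if (len < 0 || steps > MAX_STEPS)
		return -1;

	for (int s = 0; s < CONV_STATES; s++)
		metric[s] = s ? METRIC_INF : 0; /* encoder starts in state 0 */

	for (int t = 0; t < steps; t++) {
		/* Add-compare-select over every new state ns. */
		for (unsigned ns = 0; ns < CONV_STATES; ns++) {
			int best = 0, best_h = 0;

			for (unsigned h = 0; h < 2; h++) {
				/* reg >> 1 is the predecessor state, reg & 1 the input bit. */
				unsigned reg = ns | (h << 4);
				int m = metric[reg >> 1]
				      + (parity5(reg & CONV_G0) != (in[2 * t] & 1u))
				      + (parity5(reg & CONV_G1) != (in[2 * t + 1] & 1u));

				if (h == 0 || m < best) {
					best = m;
					best_h = (int)h;
				}
			}
			next[ns] = best;
			dec[t][ns] = (uint8_t)best_h;
		}
		for (int s = 0; s < CONV_STATES; s++)
			metric[s] = next[s];
	}

	/* Trace back from state 0, which a flushed code always ends in. */
	for (int t = steps - 1; t >= 0; t--) {
		if (t < len)
			out[t] = (uint8_t)(state & 1);
		state = (state >> 1) | ((unsigned)dec[t][state] << 3);
	}
	return metric[0];
}
```

The loop over ns is the add-compare-select step, which is the part SIMD implementations typically vectorize. With len = 290 the sketch encoder emits 2 * (290 + 4) = 588 coded bits, which is consistent with the CS2 lengths in the test log.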
Tested on Haswell (i7-4770K) and Atom (D2550). Additional test codes from osmo-bts are included. Further AWGN bit-error-rate tests and benchmarks can be found in the following repository:
https://github.com/ttsou/osmo-conv-test
Here are some examples.
Bit error test for GPRS CS2 with SNR of 5 dB and 100000 bursts.
$ ./conv_test -c 2 -e -r 5 -i 100000
=================================================
[+] Testing: GPRS CS2
[.] Specs: (N=2, K=5, non-recursive, flushed, not punctured)
[.] Input length : ret = 290 exp = 290 -> OK
[.] Output length : ret = 588 exp = 588 -> OK

[.] BER tests:
[..] Testing base:
[..] Input BER.......................... 0.042443
[..] Output BER......................... 0.000006
[..] Output FER......................... 0.001350 (135)
[..] Testing SIMD:
[..] Input BER.......................... 0.042460
[..] Output BER......................... 0.000005
[..] Output FER......................... 0.001240 (124)
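As a reading aid for the figures above, here is a small sketch of how bit and frame error rates are usually accumulated. It assumes the number in parentheses is the raw count of bursts with at least one wrong decoded bit, which fits 135 / 100000 = 0.001350. The struct and function names are hypothetical and not taken from conv_test.

```c
#include <stddef.h>
#include <stdint.h>

/* Running error counters (hypothetical helper, not from conv_test). */
struct err_stats {
	unsigned long bits;       /* bits compared so far */
	unsigned long bit_errs;   /* bits that differed */
	unsigned long frames;     /* bursts compared so far */
	unsigned long frame_errs; /* bursts with at least one bit error */
};

/* Compare one decoded burst against the reference bits (one bit per byte). */
void err_stats_add(struct err_stats *st, const uint8_t *ref,
		   const uint8_t *dec, size_t len)
{
	unsigned long errs = 0;

	for (size_t i = 0; i < len; i++)
		errs += (ref[i] != dec[i]);

	st->bits += len;
	st->bit_errs += errs;
	st->frames += 1;
	st->frame_errs += (errs > 0);
}

/* Bit error rate: wrong bits / total bits. */
double err_stats_ber(const struct err_stats *st)
{
	return st->bits ? (double)st->bit_errs / (double)st->bits : 0.0;
}

/* Frame error rate: bursts in error / total bursts. */
double err_stats_fer(const struct err_stats *st)
{
	return st->frames ? (double)st->frame_errs / (double)st->frames : 0.0;
}
```

Under that assumption, the base and SIMD decoders differ by 11 bursts in error out of 100000 (135 vs. 124).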
Timed AFS benchmark with 8 threads and 100000 bursts per thread.
$ ./conv_test -b -c 10 -j 8 -i 100000
=================================================
[+] Testing: GSM TCH/AFS 6.7
[.] Specs: (N=4, K=5, recursive, flushed, punctured)
[.] Input length : ret = 140 exp = 140 -> OK
[.] Output length : ret = 448 exp = 448 -> OK

[.] Performance benchmark:
[..] Encoding / Decoding 800000 bursts on 8 thread(s):
[..] Testing base:
[..] Elapsed time....................... 4.320001 secs
[..] Rate............................... 25.925920 Mbps
[..] Testing SIMD:
[..] Elapsed time....................... 0.458272 secs
[..] Rate............................... 244.396341 Mbps
[..] Speedup............................ 9.426718
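The reported rates are consistent with counting input bits per burst: 800000 bursts * 140 bits / 4.320001 s is about 25.93 Mbps. The sketch below reproduces the figures under that assumption. The helper names are hypothetical and not taken from conv_test.

```c
/* Throughput in Mbps, assuming the rate counts input bits per burst. */
double throughput_mbps(unsigned long bursts, unsigned bits_per_burst,
		       double secs)
{
	return (double)bursts * (double)bits_per_burst / secs / 1e6;
}

/* Speedup as the ratio of base to optimized elapsed time. */
double speedup(double base_secs, double simd_secs)
{
	return base_secs / simd_secs;
}
```

For example, throughput_mbps(800000, 140, 0.458272) gives roughly 244.4 and speedup(4.320001, 0.458272) roughly 9.43, in line with the log.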
-TT