Hello Holger,
After going through the patch again, I see that you instrument only a single call of the decoding code. I don't think this is a good approach to comparing the decoding code.
We will add more test samples for the profiling test as suggested and share the benchmark.
Regards, Prasad