On 21 Jan 2016, at 17:16, suraev@alumni.ntnu.no wrote:
-void osmo_revbytebits_buf(uint8_t *buf, int len);
+void osmo_revbytebits_buf(uint8_t *buf, unsigned int len);
Yes, that makes sense, but please do not mix a bugfix and an API change in one patch (unless the API change is the bugfix). As it stands, about 2/3 of the change is "noise".
@@ -221,8 +220,7 @@ void osmo_revbytebits_buf(uint8_t *buf, int len)
 	}
 	for (i = unaligned_cnt; i + 3 < len; i += 4) {
-		uint32_t *cur = (uint32_t *) (buf + i);
-		*cur = osmo_revbytebits_32(*cur);
+		osmo_store32be(osmo_revbytebits_32(osmo_load32be(buf + i)), buf + i);
uint32_t cur;
memcpy(&cur, buf + i, sizeof(cur));
cur = osmo_revbytebits_32(cur);
memcpy(buf + i, &cur, sizeof(cur));
would be my approach. So let's compare them. Your code without the loop is at https://goo.gl/vD3kqQ and the memcpy variant is at https://goo.gl/5kzTmx. Now, if there were a cloud microbenchmark, we could settle that once and for all.
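
Failing a cloud microbenchmark, a rough local comparison could look like the sketch below. It is only an illustration under assumptions: rev_bits_per_byte32(), load32be() and store32be() are hypothetical stand-ins for osmo_revbytebits_32(), osmo_load32be() and osmo_store32be(), and rev_buf_loadstore(), rev_buf_memcpy() and time_variant() are names made up for this mail. For simplicity both loops start at offset 0 instead of unaligned_cnt, since neither variant needs an aligned pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for osmo_revbytebits_32(): reverse the bits
 * inside each byte of x, leaving the byte positions untouched. */
static uint32_t rev_bits_per_byte32(uint32_t x)
{
	x = ((x & 0xaaaaaaaau) >> 1) | ((x & 0x55555555u) << 1);
	x = ((x & 0xccccccccu) >> 2) | ((x & 0x33333333u) << 2);
	x = ((x & 0xf0f0f0f0u) >> 4) | ((x & 0x0f0f0f0fu) << 4);
	return x;
}

/* Hypothetical stand-ins for osmo_load32be()/osmo_store32be(); the
 * (value, pointer) argument order of the store follows the patch. */
static uint32_t load32be(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static void store32be(uint32_t v, uint8_t *p)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Variant from the patch: explicit big-endian load and store. */
void rev_buf_loadstore(uint8_t *buf, unsigned int len)
{
	unsigned int i;

	assert(buf != NULL || len == 0);
	for (i = 0; i + 3 < len; i += 4)
		store32be(rev_bits_per_byte32(load32be(buf + i)), buf + i);
	for (; i < len; i++)	/* remaining 0..3 bytes, one at a time */
		buf[i] = (uint8_t)rev_bits_per_byte32(buf[i]);
}

/* Variant from the review: memcpy in and out of a local word. */
void rev_buf_memcpy(uint8_t *buf, unsigned int len)
{
	unsigned int i;

	assert(buf != NULL || len == 0);
	for (i = 0; i + 3 < len; i += 4) {
		uint32_t cur;

		memcpy(&cur, buf + i, sizeof(cur));
		cur = rev_bits_per_byte32(cur);
		memcpy(buf + i, &cur, sizeof(cur));
	}
	for (; i < len; i++)	/* remaining 0..3 bytes, one at a time */
		buf[i] = (uint8_t)rev_bits_per_byte32(buf[i]);
}

/* Crude timing helper: CPU seconds spent running fn over buf rounds times. */
double time_variant(void (*fn)(uint8_t *, unsigned int),
		    uint8_t *buf, unsigned int len, unsigned int rounds)
{
	unsigned int r;
	clock_t start = clock();

	for (r = 0; r < rounds; r++)
		fn(buf, len);
	return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Because the reversal works on each byte independently, the byte order used for the 32-bit load does not affect the result, so both variants should yield identical buffers. Calling time_variant(rev_buf_loadstore, buf, len, rounds) and time_variant(rev_buf_memcpy, buf, len, rounds) with the same optimisation flags as the real build gives a rough comparison; clock() is coarse, so it would need a large number of rounds to say anything.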