Hi all,
I've been doing some profiling on osmo-bts recently (on sysmobts hardware, which has only a relatively slow ARM926 CPU core), and the two things that show up most are:
* msgb_alloc() -> talloc_zero() -> malloc(). This can be alleviated somewhat by using talloc pools, although for some reason the pools don't remove all of the malloc() calls.
* vfprintf() and friends, from logp() statements. The sad part is that calls like gsm_lchan_name() are of course executed before the call into logp(), so by that point the vfprintf/sprintf/... work for the arguments has already been done, and only the final formatting of the log line itself hasn't happened yet.
Here we can do two things. First, calls like gsm_lchan_name() don't need to happen all the time, as the lchan name is static and can be generated once at the time the gsm_lchan is created. I implemented that in osmo-bts (and openbsc, as it's from gsm_data_shared).
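To illustrate the caching idea, here is a minimal sketch. It is not the actual osmo-bts patch: struct demo_lchan, demo_lchan_init(), demo_lchan_name() and the name format are simplified, hypothetical stand-ins for the real gsm_lchan code, and the counter exists only to show how often the formatting runs.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for struct gsm_lchan. */
struct demo_lchan {
	int bts_nr, trx_nr, ts_nr, lchan_nr;
	char name[64];	/* generated once at creation, then only read */
};

/* Counts how often the (comparatively expensive) formatting runs. */
static unsigned int name_format_calls;

/* Called once when the lchan is created: this is the only snprintf(). */
static void demo_lchan_init(struct demo_lchan *lchan, int bts_nr, int trx_nr,
			    int ts_nr, int lchan_nr)
{
	assert(lchan != NULL);
	lchan->bts_nr = bts_nr;
	lchan->trx_nr = trx_nr;
	lchan->ts_nr = ts_nr;
	lchan->lchan_nr = lchan_nr;
	snprintf(lchan->name, sizeof(lchan->name), "(bts=%d,trx=%d,ts=%d,ss=%d)",
		 bts_nr, trx_nr, ts_nr, lchan_nr);
	name_format_calls++;
}

/* Hot path: returns the cached string, no formatting at all. */
static const char *demo_lchan_name(const struct demo_lchan *lchan)
{
	return lchan->name;
}
```

With this layout every log statement that needs the name only pays for a pointer dereference, at the cost of a few bytes per lchan.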
The second idea would be to expand the LOGP() macro a bit so that it checks whether the log line is enabled _before_ the arguments (and thus the associated function calls) are evaluated. Any ideas on that?
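As a starting point for discussion, something along these lines might work. This is only a sketch under assumptions: demo_log_enabled(), demo_logp(), DEMO_LOGP() and the single global threshold are hypothetical stand-ins, not the libosmocore API, and a real check would have to consult the per-target category and level filters.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical global threshold; higher level means more severe. */
static int demo_log_threshold = 5;

/* Counts how often the expensive argument helper actually runs. */
static unsigned int expensive_calls;

/* Cheap check that can run before any argument is evaluated. */
static int demo_log_enabled(int subsys, int level)
{
	(void) subsys;	/* a real check would also look at the category */
	return level >= demo_log_threshold;
}

/* Stand-in for the actual logging function (formats via vfprintf). */
static void demo_logp(int subsys, int level, const char *fmt, ...)
{
	va_list ap;

	(void) subsys;
	(void) level;
	assert(fmt != NULL);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
}

/* The arguments sit inside the if-body, so they are only evaluated
 * when the line is enabled. */
#define DEMO_LOGP(ss, level, ...) \
	do { \
		if (demo_log_enabled(ss, level)) \
			demo_logp(ss, level, __VA_ARGS__); \
	} while (0)

/* Stand-in for a costly helper such as gsm_lchan_name(). */
static const char *demo_expensive_name(void)
{
	static char buf[32];

	expensive_calls++;
	snprintf(buf, sizeof(buf), "lchan-%u", expensive_calls);
	return buf;
}
```

A call such as DEMO_LOGP(0, 1, "%s\n", demo_expensive_name()) then never invokes demo_expensive_name() while level 1 is below the threshold. The catch is that argument expressions with side effects are skipped too, so existing log statements would need a quick audit.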
I also had a brief look at osmo-pcu profiling; the result is in the attached picture. We cannot do much about __copy_to_user_std, do_select and core_sys_select, as those are on the kernel side.
However, there again we see vfprintf and friends, mostly via gprs_rlcmac_tbf::name() - and of course the msgb_alloc() and msgb_free() going through talloc and finally malloc.
So the same strategies as above could (and probably should) be applied to osmo-pcu.
Regards, Harald