static strings -- little fun on the side - OpenBSC

27 Feb 2019


      As a bit of relaxation in-between the serious hacking, I gave a little idea of
mine a spin, which could improve our use of static string buffers.
My idea is to use one common static buffer string, as a ring buffer, for all
our osmo_xxx_name() functions, that returns new chunks of static string every
time. Why?
The point is that we have many osmo_xxx_name() functions that return a string
in one single static buffer. When you call it a second time, it overwrites the
buffer. So this doesn't work:
printf("PLMN changed from %s to %s\n", 
         osmo_plmn_name(old_plmn),
     osmo_plmn_name(new_plmn));
No matter what, this will always print the same string twice, and probably
no-one would notice.
For this, I have previously often invented xxx_name2() functions:
printf("PLMN changed from %s to %s\n",
         osmo_plmn_name(old_plmn),
     osmo_plmn_name2(new_plmn));
With osmo_plmn_name2() being:
"Same as osmo_plmn_name(), but returning in a different static buffer."
That is bad for these reasons:
- each new separate function adds another static buffer to the program size,
   which remains unused all the other time.
 - each new separate function adds more API signatures.
Alternatively we have xxx_buf() functions, but they are more cumbersome to use:
char buf[123];
  gsm0808_cell_id_list_name_buf(buf, sizeof(buf), cil);
  LOGP(DMSC, LOGL_DEBUG, "%s\n", buf);
Particularly for logging, that would always run even though the logging
category might be switched off.
So I came up with these patches, and changing all of current libosmocore's
static buffers to using this does work:
http://git.osmocom.org/libosmocore/log/?h=neels/static_strings
(second last patch introduces the API, last patch changes all libosmocore callers)
Also pushed to gerrit for easier review:
https://gerrit.osmocom.org/13061
the benefits:
- All osmo_xxx_name() functions can be called N times in a single printf()
  statement, without the need for managing buffers.
- No osmo_xxx_name2() functions for "separate static buffer"
- Just fire off LOGP() args and the code gets skipped if the category / log
  level is switched off:
LOGP(DMSC, LOGL_DEBUG, 
      "PLMN changed from %s to %s, but Fred says it was %s\n",
      osmo_plmn_name(old_plmn),
      osmo_plmn_name(new_plmn),
      osmo_plmn_name(freds_plmn));
I think it would be nice to have this mechanism for easier string hacking,
but not sure if the tradeoffs / vague side effects are worth it.
So this might as well remain a fun idea that never made it.
Thoughts welcome.
Implementation considerations...
At first I thought a plain ring buffer returning the next bit of memory every
time would suffice, but I didn't like the need to vaguely trust that we never
use overlapping memory.
The best I could come up with was to guarantee up to N subsequent static
strings as non-overlapping. See the semantics around the 'recent' members.
But that entails an uncomfortable tradeoff:
I kind of expected to reduce the size of static char buffers in libosmocore,
but if I want to disallow overlapping N uses, the buffer has to be able to hold
N subsequent allocations of the largest buffer, which currently is 4100. So I
end up with a larger static char buffer. The advantage however is that every
libosmocore using program with its own xxx_name() functions can now use the
same static string space and does not need to add more static buffers all over
the place.
To make the buffer smaller, I considered taking the largest buffer callers out
of this, i.e. osmo_hexdump() and msgb_hexdump(). But particularly
osmo_hexdump() is one of the functions that I would like to be able to call
more than once in a printf() arg list. Actually that would in practice be of
smaller size than the maximum 4096 currently allowed. But the 'recent' check
above requires the *largest* size to fully fit N times.
Consider: if I choose a total buffer of 4096 bytes and print one hexdump of
3084 bytes, then I can't print another such buffer: the osmo_static_string()
function can't know that the first printf() has already concluded, and still
considers that large chunk as in-use. So to be able to allow any number of
subsequent printf()s of the same huge size, I need the number-of-recent-checks
times that large size to fit in the static buffer.
So there is a tradeoff between how many previous buffers I can guarantee to be
unchanged vs. the maximum buffer size allowed per allocation.
I would actually like to ensure 10 or more recent buffers, so that a caller can
say: I can use up to 10 various osmo_xxx_name() functions in the same printf()
statement, and libosmocore will barf if I use too much.
I would also like to use less static char buffer -- but is ten times 4096 as
static string buffer really a lot? Not for an x86_64 desktop machine, but what
about embedded targets?
Since in practice the name buffer sizes we use are more like <80 chars each, in
practice we would be totally fine with one 4096 char buffer.
I have never seen a hexdump of 4Kb in practice.
I'm considering actually removing the 'recent' check again and just offloading
the responsibilty for name buffer usage to the caller.
Or I'm considering to reduce the max hexdump / msgb_hexdump size to something
saner like 1024 chars or even 256 chars. (Actually, hexdump could also loop in
chunks of 64 bytes or something like that).
Another consideration was about how to effect bounds checking. I don't want to
evaluate return values in printf() args. So I decided to provide generalized
API that returns NULL pointers if the size did not fit, but for convenience use
have one osmo_static_string() function that OSMO_ASSERT()s as soon as the size
doesn't fit, instead.
~N
-- 
- Neels Hofmeyr nhofmeyr@sysmocom.de          http://www.sysmocom.de/
=======================================================================
* sysmocom - systems for mobile communications GmbH
* Alt-Moabit 93
* 10559 Berlin, Germany
* Sitz / Registered office: Berlin, HRB 134158 B
* Geschäftsführer / Managing Directors: Harald Welte