As a bit of relaxation in-between the serious hacking, I gave a little idea of mine a spin, which could improve our use of static string buffers.
My idea is to use one common static buffer string, as a ring buffer, for all our osmo_xxx_name() functions, that returns new chunks of static string every time. Why?
The point is that we have many osmo_xxx_name() functions that return a string in one single static buffer. When you call it a second time, it overwrites the buffer. So this doesn't work:
printf("PLMN changed from %s to %s\n", osmo_plmn_name(old_plmn), osmo_plmn_name(new_plmn));
No matter what, this will always print the same string twice, and probably no-one would notice.
For this, I have previously often invented xxx_name2() functions:
printf("PLMN changed from %s to %s\n", osmo_plmn_name(old_plmn), osmo_plmn_name2(new_plmn));
With osmo_plmn_name2() being:
"Same as osmo_plmn_name(), but returning in a different static buffer."
That is bad for these reasons:
- each new separate function adds another static buffer to the program size, which remains unused all the other time. - each new separate function adds more API signatures.
Alternatively we have xxx_buf() functions, but they are more cumbersome to use:
char buf[123]; gsm0808_cell_id_list_name_buf(buf, sizeof(buf), cil); LOGP(DMSC, LOGL_DEBUG, "%s\n", buf);
Particularly for logging, that would always run even though the logging category might be switched off.
So I came up with these patches, and changing all of current libosmocore's static buffers to using this does work:
http://git.osmocom.org/libosmocore/log/?h=neels/static_strings (second last patch introduces the API, last patch changes all libosmocore callers)
Also pushed to gerrit for easier review: https://gerrit.osmocom.org/13061
the benefits:
- All osmo_xxx_name() functions can be called N times in a single printf() statement, without the need for managing buffers.
- No osmo_xxx_name2() functions for "separate static buffer"
- Just fire off LOGP() args and the code gets skipped if the category / log level is switched off:
LOGP(DMSC, LOGL_DEBUG, "PLMN changed from %s to %s, but Fred says it was %s\n", osmo_plmn_name(old_plmn), osmo_plmn_name(new_plmn), osmo_plmn_name(freds_plmn));
I think it would be nice to have this mechanism for easier string hacking, but not sure if the tradeoffs / vague side effects are worth it. So this might as well remain a fun idea that never made it.
Thoughts welcome.
Implementation considerations...
At first I thought a plain ring buffer returning the next bit of memory every time would suffice, but I didn't like the need to vaguely trust that we never use overlapping memory.
The best I could come up with was to guarantee up to N subsequent static strings as non-overlapping. See the semantics around the 'recent' members. But that entails an uncomfortable tradeoff:
I kind of expected to reduce the size of static char buffers in libosmocore, but if I want to disallow overlapping N uses, the buffer has to be able to hold N subsequent allocations of the largest buffer, which currently is 4100. So I end up with a larger static char buffer. The advantage however is that every libosmocore using program with its own xxx_name() functions can now use the same static string space and does not need to add more static buffers all over the place.
To make the buffer smaller, I considered taking the largest buffer callers out of this, i.e. osmo_hexdump() and msgb_hexdump(). But particularly osmo_hexdump() is one of the functions that I would like to be able to call more than once in a printf() arg list. Actually that would in practice be of smaller size than the maximum 4096 currently allowed. But the 'recent' check above requires the *largest* size to fully fit N times.
Consider: if I choose a total buffer of 4096 bytes and print one hexdump of 3084 bytes, then I can't print another such buffer: the osmo_static_string() function can't know that the first printf() has already concluded, and still considers that large chunk as in-use. So to be able to allow any number of subsequent printf()s of the same huge size, I need the number-of-recent-checks times that large size to fit in the static buffer.
So there is a tradeoff between how many previous buffers I can guarantee to be unchanged vs. the maximum buffer size allowed per allocation.
I would actually like to ensure 10 or more recent buffers, so that a caller can say: I can use up to 10 various osmo_xxx_name() functions in the same printf() statement, and libosmocore will barf if I use too much.
I would also like to use less static char buffer -- but is ten times 4096 as static string buffer really a lot? Not for an x86_64 desktop machine, but what about embedded targets?
Since in practice the name buffer sizes we use are more like <80 chars each, in practice we would be totally fine with one 4096 char buffer. I have never seen a hexdump of 4Kb in practice.
I'm considering actually removing the 'recent' check again and just offloading the responsibilty for name buffer usage to the caller.
Or I'm considering to reduce the max hexdump / msgb_hexdump size to something saner like 1024 chars or even 256 chars. (Actually, hexdump could also loop in chunks of 64 bytes or something like that).
Another consideration was about how to effect bounds checking. I don't want to evaluate return values in printf() args. So I decided to provide generalized API that returns NULL pointers if the size did not fit, but for convenience use have one osmo_static_string() function that OSMO_ASSERT()s as soon as the size doesn't fit, instead.
~N
Hi Neels,
I haven't yet ready our code, just briefly read your e-mail.
One more radical option (much more costly, for sure) would be to introcue somthing like a "packet processing talloc context" and use dynamic allocations:
So basically 1) every time we come out of select() and call a filedescriptor callback, we would create a new talloc context. 2) any code, including those that want to print messages to buffers, etc. could allocate memory from that context without having to care about releasing it 3) once we return from the file descriptor call-back, we talloc_free() that "master" context, taking with it all the child allocations hat may have been allocated underneath.
This is actually a bit inspired from wireshark, where AFAIK you can also alloctate dynamic memory within the context of a given packet.
Sure, dynamic allocations are more expensive. But in the end, aside from osmo-pcu on ARM926EJS sysmoBTS 1002, we've never really seen real performance problems in the use cases so far. And as you indicate, if the logging is turned off, the allocations would never be made. And we can make it OOM free by simply returning a static const char * "<OOM!>" in case the allocation should ever fail.
One might be able to get similar but slightly different semantics by attaching those strings to the 'msgb' that's currently being processed. At some point the msgb will be released. Depending on the use case, the memory would be allocated for more than the current fd-callback, e.g. in case of queueing the msgb somewhere.
Regards, Harald
On Tue, Feb 26, 2019 at 11:19:14PM +0100, Harald Welte wrote:
So basically
- every time we come out of select() and call a filedescriptor callback, we would create a new talloc context.
- any code, including those that want to print messages to buffers, etc. could allocate memory from that context without having to care about releasing it
- once we return from the file descriptor call-back, we talloc_free() that "master" context, taking with it all the child allocations hat may have been allocated underneath.
I like that approach.
If dyn allocations become a problem there could even be a large block that is re-used across select() cycles, and enlarged if necessary but otherwise staying in place. Probably will not be a problem though.
One might be able to get similar but slightly different semantics by attaching those strings to the 'msgb' that's currently being processed.
That would complicate API signatures, e.g. things like osmo_plmn_name()? I think that makes sense only for msgb_hexdump(), where a msgb is already part of the signature. But I'd say just using the same mechanism would be fine.
can make it OOM free by simply returning a static const char * "<OOM!>"
Yes, probably better than a program abort.
~N