[S] Change in osmo-hnbgw[master]: fix umts_cell_id_name(): show CID - gerrit-log

15 Jun 2023

Attention is currently required from: laforge, fixeria, pespin.

neels has posted comments on this change. (
https://gerrit.osmocom.org/c/osmo-hnbgw/+/33169 )

Change subject: fix umts_cell_id_name(): show CID
......................................................................

Patch Set 4:

(1 comment)

Commit Message:

https://gerrit.osmocom.org/c/osmo-hnbgw/+/33169/comment/3121b45d_eebbfcfa 
PS3, Line 12: Use OTC_SELECT
This decision is not trivial. I am not prepared to just give away the multitude of
benefits and elegance that using OTC_SELECT brings, without a very compelling reason.

If we weigh in all factors, I end up ranking OTC_SELECT as the better choice, even for
code that happens 1000 times per second. The discussion is quite interesting, still we
also have to actually discuss it.

But the TL;DR is: deciding this stuff is not trivial, and you guys should not simply claim
that some other way is better when there is no substance to it; it makes no sense to vote
whether 2+2==5.

If there is a compelling argument, I'll be very glad to adjust the code accordingly.
So far I see only concrete arguments for OTC_SELECT, and vague handwavy claims against
it.

So let's not use this rather complex discussion of nuances to block a patch that
may-or-may-not be improved by applying the requested code review.

*If* we want to discuss this, maybe rather on the ML, then these seem to me the most
important points:

Take these two examples:

 char buf[128];
 char buf2[128];
 LOG(LOGL_DEBUG, "%s and %s\n",
     foo_to_str_buf(buf, sizeof(buf), &foo),
     foo_to_str_buf(buf2, sizeof(buf2), &foo2));

versus

 LOG(LOGL_DEBUG, "%s and %s\n",
     foo_to_str_c(OTC_SELECT, &foo),
     foo_to_str_c(OTC_SELECT, &foo2));

Load efficiency 1: In the first version, the two buffers are always reserved on the stack,
even if the LOG() is skipped due to level > LOGL_DEBUG. This happens for all calls of
this code and cannot be tweaked away by adjusting logging categories in the cfg. In the
second version, there is exactly no memory reserved when level > LOGL_DEBUG.

Load efficiency 2: I trust that rapid dynamic memory allocation and deallocation of the
same sub-kilobyte buffer is fast. Change my mind.

Memory efficiency: memory fragmentation can be an issue, leading to proliferation of much
higher memory usage than necessary, as the program runs. That's why the to_str_c()
functions typically allocate a power of two by default. The cause is that long lived
objects may be allocated between a LOG() and the free of OTC_SELECT. If we want to avoid
mem fragmentation from logging, then the much better answer would be this:
Add an OTC_LOG, i.e. a talloc context that gets freed once libosmocore logging.c is done
outputting a log line on all targets. This gets freed without any long lived objects
possibly being allocated in-between, and we are basically completely rid of mem
fragmentation from log lines.
I rank fragmentation very low, because
- we are in practice several orders of magnitude away from being limited by the amount of
RAM that is available. (In practice, until recently, osmo-hnbgw was leaking *all* of its
RUA messages *all of the time* into asn1_ctx, and even so it took a site with >20 HNBs
days or weeks of live operation until it reached a memory limit and needed to restart.)
- our "long lived objects" created after program startup tend to have a limited
lifetime, so fragmentation, if any, will usually go away in an hour or a day at the
latest.

Bug resistance 1: in the first version, if the resulting string gets too large, output is
truncated ungracefully, *without us noticing*. We constantly have to weigh the size of
static buffers that are reserved (want it to be small) against the string length we want
to guarantee to be visible (want it to be long); truncating strings gracefully adds a lot
of nontrivial code. OTC_SELECT has no such problems *at all*.

Bug resistance 2: in the first version, there is a lot to get wrong. If we mix up buf and
buf2, or accidentally use the same buffer twice, we get very confusing buggy log output
that is hard to make sense of. Stressing the point that these bugs are hard to spot even
when reading the code to the letter. OTC_SELECT has no such problems *at all*.

Code reusability: when the LOG() code is copied to some other function, the first version
needs the local buffers to be declared in the other function as well; the OTC_SELECT log
line can simply be copied anywhere else without giving any further thought to buffers and
their sizes or their names in the local scope. There are no buffer related hidden
copy-paste bugs possible with OTC_SELECT.

Magic numbers: for buffers on the stack, every caller decides on the buffer size
separately. For the to_str_c() functions, the initial default buffer size is defined once
and can be adjusted in a single place. (and also the initial buffer size is not a limit on
the result.)

Readability: The first version clutters both the local var scope and the LOG arguments. It
is much longer code.

In summary, it is hard to be worse than static buffers for logging (both globally or on
the stack), because those bring in all sorts of problems. They are the opposite of elegant
and they have been pestering me with obscure bugs every so often. If we need an
alternative to OTC_SELECT in logging, we need one that is better than declaring static
buffers.

To me it seems the only possible flaw in my argument might be that the point 'Load
efficiency 2' is starkly wrong. Can you argue / prove that this is the case? In my
experience with audio synthesis hacking, rapidly allocating and deallocating buffers is
not causing a performance problem.

...
  I'm talking about the architectural/global
implications of doing this in the long term. It may be only here now, but tomorrow will be
there and there 
My point being that using OTC_SELECT everywhere would be desirable.
As long as we are using static/stack buffers, we have cluttered and buggy code that is
hard to maintain, read, and re-use.

...
  and we end up with a big clutter of things 
I have shown above that OTC_SELECT *reduces* clutter.

...
  where people keep adding log lines to OTC_SELECT
"because other code is already doing it". 
This sounds like you are right, without actually any substance to the argument. You are
still saying it is bad "because it is bad", which is not sufficient for a CR
argument.

Let's see some actual reasoning, and if that is sound, I'll be happy to oblige.
So far, there is no *reason* against this code.

I'm pulling my hair here because I do not understand what you guys are getting at,
because you are not providing any insights to carry the opinion. So far there is no actual
substance to your CR anywhere. What am I missing??

-- 
To view, visit https://gerrit.osmocom.org/c/osmo-hnbgw/+/33169
To unsubscribe, or for help writing mail filters, visit
https://gerrit.osmocom.org/settings

Gerrit-Project: osmo-hnbgw
Gerrit-Branch: master
Gerrit-Change-Id: Id903b8579ded8b908e644808e440d373bcca8da4
Gerrit-Change-Number: 33169
Gerrit-PatchSet: 4
Gerrit-Owner: neels &lt;nhofmeyr(a)sysmocom.de&gt;
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: fixeria &lt;vyanitskiy(a)sysmocom.de&gt;
Gerrit-Reviewer: laforge &lt;laforge(a)osmocom.org&gt;
Gerrit-Reviewer: pespin &lt;pespin(a)sysmocom.de&gt;
Gerrit-Attention: laforge &lt;laforge(a)osmocom.org&gt;
Gerrit-Attention: fixeria &lt;vyanitskiy(a)sysmocom.de&gt;
Gerrit-Attention: pespin &lt;pespin(a)sysmocom.de&gt;
Gerrit-Comment-Date: Thu, 15 Jun 2023 02:41:42 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: neels &lt;nhofmeyr(a)sysmocom.de&gt;
Comment-In-Reply-To: laforge &lt;laforge(a)osmocom.org&gt;
Comment-In-Reply-To: pespin &lt;pespin(a)sysmocom.de&gt;
Gerrit-MessageType: comment