Attention is currently required from: pespin, dexter.
Hello osmith, Jenkins Builder, laforge, dexter,
I'd like you to reexamine a change. Please visit
https://gerrit.osmocom.org/c/osmo-mgw/+/33312
to look at the new patch set (#3).
Change subject: mgw: Allow auditing speciall 'null' endpoint
......................................................................
mgw: Allow auditing speciall 'null' endpoint
This is a special endpoint which can always be audited. This is useful
for clients who wish to submit requests to osmo-mgw periodically to find
out whether the MGW is still reachable. This endpoint will be used by
libomso-mgcp-client as default target endpoint to implement such
feature.
This "null" Endpoint is osmo-mgw specific, not described in MGCP specs.
Related: SYS#6481
Change-Id: Ia409b16e9211e6261e2e0f21288544289d6f3733
---
M doc/manuals/chapters/mgcp_endpoints.adoc
M include/osmocom/mgcp/mgcp_endp.h
M src/libosmo-mgcp/mgcp_endp.c
M src/libosmo-mgcp/mgcp_protocol.c
M tests/mgcp/mgcp_test.c
M tests/mgcp/mgcp_test.ok
6 files changed, 131 insertions(+), 8 deletions(-)
git pull ssh://gerrit.osmocom.org:29418/osmo-mgw refs/changes/12/33312/3
--
To view, visit https://gerrit.osmocom.org/c/osmo-mgw/+/33312
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings
Gerrit-Project: osmo-mgw
Gerrit-Branch: master
Gerrit-Change-Id: Ia409b16e9211e6261e2e0f21288544289d6f3733
Gerrit-Change-Number: 33312
Gerrit-PatchSet: 3
Gerrit-Owner: pespin <pespin(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: dexter <pmaier(a)sysmocom.de>
Gerrit-Reviewer: laforge <laforge(a)osmocom.org>
Gerrit-Reviewer: osmith <osmith(a)sysmocom.de>
Gerrit-CC: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Attention: pespin <pespin(a)sysmocom.de>
Gerrit-Attention: dexter <pmaier(a)sysmocom.de>
Gerrit-MessageType: newpatchset
Attention is currently required from: laforge, pespin.
Hello osmith, Jenkins Builder, laforge, dexter,
I'd like you to reexamine a change. Please visit
https://gerrit.osmocom.org/c/osmo-mgw/+/33308
to look at the new patch set (#5).
Change subject: mgcp-client: Add keepalive feature
......................................................................
mgcp-client: Add keepalive feature
The `keepalive` feature in libosmo-mgcp-client allows scheduling periodical
queries on the MGCP layer in order to make sure it is reachable and hence
obtain information on the state of the MGCP link.
This patch only uses it to print the status on the VTY, but it will be used
too in a follow-up commit by the MGW Pool when picking an MGW from the pool:
MGWs whose link is considered to be DOWN are skipped.
The feature consists of:
- A `keepalive request-interval` which will trigger a transmission of an MGCP
AuditEndpoint command targeting endpoint with name `keepalive request-endpoint`.
This interval is updated every time any message is transmitted in the MGCP
link, meaning the MGCP AuditEndpoint message is only triggered if no message
has been transmitted since `keepalive request-interval` seconds ago.
- A `keepalive timeout` considering the MGW to be non-reachable (link DOWN) if
no message is received over that amount of time.
The `keepalive` parameters are to be preferrably configured so that
"keepalive request-interval" * 2 < "keepalive timeout".
Example VTY configuration of `keepalive` feature in libosmo-mgcp-client:
----
mgw 0
...
keepalive request-interval 20 <1>
keepalive request-endpoint null <2>
keepalive timeout 50 <3>
----
<1> Transmit an MGCP AuditEndpoint message to the MGW if no message has been
sent to it over last 10 seconds
<2> The MGCP AuditEndpoint targets the `null` endpoint. This is a special
endpoint available at OsmoMGW for those purposes, but any available
endpoint can be configured and used instead.
<3> Consider the MGCP link to be DOWN if no message is received from the
MGW over the last 50 seconds
NOTE: The `keepalive` feature is disabled by default, and must be explicitly
configured in order to enable it.
Related: SYS#6481
Change-Id: I3dc74c78548d017f272da863d5282dc5e0020ca3
---
M include/osmocom/mgcp_client/mgcp_client.h
M include/osmocom/mgcp_client/mgcp_client_internal.h
M src/libosmo-mgcp-client/mgcp_client.c
M src/libosmo-mgcp-client/mgcp_client_vty.c
M tests/mgcp_client/mgcp_client_test.err
5 files changed, 266 insertions(+), 16 deletions(-)
git pull ssh://gerrit.osmocom.org:29418/osmo-mgw refs/changes/08/33308/5
--
To view, visit https://gerrit.osmocom.org/c/osmo-mgw/+/33308
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings
Gerrit-Project: osmo-mgw
Gerrit-Branch: master
Gerrit-Change-Id: I3dc74c78548d017f272da863d5282dc5e0020ca3
Gerrit-Change-Number: 33308
Gerrit-PatchSet: 5
Gerrit-Owner: pespin <pespin(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: dexter <pmaier(a)sysmocom.de>
Gerrit-Reviewer: laforge <laforge(a)osmocom.org>
Gerrit-Reviewer: osmith <osmith(a)sysmocom.de>
Gerrit-CC: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Attention: laforge <laforge(a)osmocom.org>
Gerrit-Attention: pespin <pespin(a)sysmocom.de>
Gerrit-MessageType: newpatchset
Attention is currently required from: neels, laforge.
pespin has posted comments on this change. ( https://gerrit.osmocom.org/c/osmo-hnbgw/+/33173 )
Change subject: cnpool: add context_map_cnlink_lost() handling
......................................................................
Patch Set 5:
(1 comment)
File include/osmocom/hnbgw/context_map.h:
https://gerrit.osmocom.org/c/osmo-hnbgw/+/33173/comment/da91e8e5_c3f9158a
PS3, Line 70: MAP_SCCP_EV_CN_LINK_LOST,
> So yes, I don't want to fixate on a specific reason the caller has.
Thanks, that's all I was asking. Then since this is the case can you maybe extend/change a bit the comment above it to explain it's not only about RANAP RESET but about any lower level failures in SCTP/RANAP?
--
To view, visit https://gerrit.osmocom.org/c/osmo-hnbgw/+/33173
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings
Gerrit-Project: osmo-hnbgw
Gerrit-Branch: master
Gerrit-Change-Id: Ic0a6fcfb747dc093528ca2bd12a269ad390d465c
Gerrit-Change-Number: 33173
Gerrit-PatchSet: 5
Gerrit-Owner: neels <nhofmeyr(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-CC: laforge <laforge(a)osmocom.org>
Gerrit-CC: pespin <pespin(a)sysmocom.de>
Gerrit-Attention: neels <nhofmeyr(a)sysmocom.de>
Gerrit-Attention: laforge <laforge(a)osmocom.org>
Gerrit-Comment-Date: Thu, 15 Jun 2023 07:49:05 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: neels <nhofmeyr(a)sysmocom.de>
Comment-In-Reply-To: laforge <laforge(a)osmocom.org>
Comment-In-Reply-To: pespin <pespin(a)sysmocom.de>
Gerrit-MessageType: comment
Attention is currently required from: laforge, pespin.
neels has posted comments on this change. ( https://gerrit.osmocom.org/c/osmo-hnbgw/+/33173 )
Change subject: cnpool: add context_map_cnlink_lost() handling
......................................................................
Patch Set 5:
(1 comment)
File include/osmocom/hnbgw/context_map.h:
https://gerrit.osmocom.org/c/osmo-hnbgw/+/33173/comment/ef9087e2_482997fc
PS3, Line 70: MAP_SCCP_EV_CN_LINK_LOST,
> The main problem here is that "LINK_LOST" is too generic and difficult to understand from reader poi […]
There is a reason why this name is vague, on purpose: It is not this FSM's concern why this SCCP conn should go away and discard ungracefully. It might be an SCTP ABORT or a RANAP RESET or whatever. Incidentally it is so far exactly a RANAP RESET on connectionless transfer, but what may be the root reason is out of scope of this FSM. The caller has concrete reasons with accurate names, this FSM here only needs to carry out a specific variant of shutdown. Various other causes may apply in the future.
For the same reason, USER_ABORT is not called VTY_COMMAND or SCCP_RESTART -- those names happen to be accurate now. But maybe it will be triggered by the CTRL interface in the future, too, or maybe some other command than VTY 'apply sccp' will need an SCCP shutdown caused by the user; maybe a new vty command 'no msc 0' (just hypothetical). So it makes sense to keep the name as "vague" as it can be while still describing its effect.
Similarly for LINK_LOST, I am trying to, within the scope and realm of this FSM, to pick a name that is general enough for the future, without making assumptions on the reason the caller might have.
The main difference is that usually SCCP disconnection should still attempt to gracefully release the UE / give time for that to happen initiated by the CN. In contrast, when this LINK_LOST event happens, it means that there is no point in caring about the SCCP connection, because we know, for whichever reason the caller may have, that it will no longer work anyway.
So yes, I don't want to fixate on a specific reason the caller has.
--
To view, visit https://gerrit.osmocom.org/c/osmo-hnbgw/+/33173
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings
Gerrit-Project: osmo-hnbgw
Gerrit-Branch: master
Gerrit-Change-Id: Ic0a6fcfb747dc093528ca2bd12a269ad390d465c
Gerrit-Change-Number: 33173
Gerrit-PatchSet: 5
Gerrit-Owner: neels <nhofmeyr(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-CC: laforge <laforge(a)osmocom.org>
Gerrit-CC: pespin <pespin(a)sysmocom.de>
Gerrit-Attention: laforge <laforge(a)osmocom.org>
Gerrit-Attention: pespin <pespin(a)sysmocom.de>
Gerrit-Comment-Date: Thu, 15 Jun 2023 03:12:03 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: neels <nhofmeyr(a)sysmocom.de>
Comment-In-Reply-To: laforge <laforge(a)osmocom.org>
Comment-In-Reply-To: pespin <pespin(a)sysmocom.de>
Gerrit-MessageType: comment
Attention is currently required from: laforge, fixeria, pespin.
neels has posted comments on this change. ( https://gerrit.osmocom.org/c/osmo-hnbgw/+/33169 )
Change subject: fix umts_cell_id_name(): show CID
......................................................................
Patch Set 4:
(1 comment)
Commit Message:
https://gerrit.osmocom.org/c/osmo-hnbgw/+/33169/comment/faaf6407_eeeac9b9
PS3, Line 12: Use OTC_SELECT
Let me add that I don't know how it happened, but i overlooked some responses when writing the above.
I appreciate the benchmark:
> the static __thread buffer implementation 1294870 times per second on my laptop, while the OTC_SELECT variant runs "only" 808880 times per second.
It would also be interesting to compare whether there is a difference between the two variants when the logging is turned off: is there an impact by unused static buffers?
> I still don't get in general, why we should use a more expensive OTC_SELECT approach over a static buffer approach *without a clear requirement or reason*. It's not that it makes the code any easier to read or write.
But clearly, it does, and clearly there are reasons for it, and requirements that OTC_SELECT meets better than static buffers do.
Using OTC_SELECT helps to completely rule out several entire families of problems in code maintenance, <cue the previous long comment>
--
To view, visit https://gerrit.osmocom.org/c/osmo-hnbgw/+/33169
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings
Gerrit-Project: osmo-hnbgw
Gerrit-Branch: master
Gerrit-Change-Id: Id903b8579ded8b908e644808e440d373bcca8da4
Gerrit-Change-Number: 33169
Gerrit-PatchSet: 4
Gerrit-Owner: neels <nhofmeyr(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Reviewer: laforge <laforge(a)osmocom.org>
Gerrit-Reviewer: pespin <pespin(a)sysmocom.de>
Gerrit-Attention: laforge <laforge(a)osmocom.org>
Gerrit-Attention: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Attention: pespin <pespin(a)sysmocom.de>
Gerrit-Comment-Date: Thu, 15 Jun 2023 02:51:29 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: neels <nhofmeyr(a)sysmocom.de>
Comment-In-Reply-To: laforge <laforge(a)osmocom.org>
Comment-In-Reply-To: pespin <pespin(a)sysmocom.de>
Gerrit-MessageType: comment
Attention is currently required from: laforge, fixeria, pespin.
neels has posted comments on this change. ( https://gerrit.osmocom.org/c/osmo-hnbgw/+/33169 )
Change subject: fix umts_cell_id_name(): show CID
......................................................................
Patch Set 4:
(1 comment)
Commit Message:
https://gerrit.osmocom.org/c/osmo-hnbgw/+/33169/comment/3121b45d_eebbfcfa
PS3, Line 12: Use OTC_SELECT
This decision is not trivial. I am not prepared to just give away the multitude of benefits and elegance that using OTC_SELECT brings, without a very compelling reason.
If we weigh in all factors, I end up ranking OTC_SELECT as the better choice, even for code that happens 1000 times per second. The discussion is quite interesting, still we also have to actually discuss it.
But the TL;DR is: deciding this stuff is not trivial, and you guys should not simply claim that some other way is better when there is no substance to it; it makes no sense to vote whether 2+2==5.
If there is a compelling argument, I'll be very glad to adjust the code accordingly. So far I see only concrete arguments for OTC_SELECT, and vague handwavy claims against it.
So let's not use this rather complex discussion of nuances to block a patch that may-or-may-not be improved by applying the requested code review.
*If* we want to discuss this, maybe rather on the ML, then these seem to me the most important points:
Take these two examples:
char buf[128];
char buf2[128];
LOG(LOGL_DEBUG, "%s and %s\n",
foo_to_str_buf(buf, sizeof(buf), &foo),
foo_to_str_buf(buf2, sizeof(buf2), &foo2));
versus
LOG(LOGL_DEBUG, "%s and %s\n",
foo_to_str_c(OTC_SELECT, &foo),
foo_to_str_c(OTC_SELECT, &foo2));
Load efficiency 1: In the first version, the two buffers are always reserved on the stack, even if the LOG() is skipped due to level > LOGL_DEBUG. This happens for all calls of this code and cannot be tweaked away by adjusting logging categories in the cfg. In the second version, there is exactly no memory reserved when level > LOGL_DEBUG.
Load efficiency 2: I trust that rapid dynamic memory allocation and deallocation of the same sub-kilobyte buffer is fast. Change my mind.
Memory efficiency: memory fragmentation can be an issue, leading to proliferation of much higher memory usage than necessary, as the program runs. That's why the to_str_c() functions typically allocate a power of two by default. The cause is that long lived objects may be allocated between a LOG() and the free of OTC_SELECT. If we want to avoid mem fragmentation from logging, then the much better answer would be this:
Add an OTC_LOG, i.e. a talloc context that gets freed once libosmocore logging.c is done outputting a log line on all targets. This gets freed without any long lived objects possibly being allocated in-between, and we are basically completely rid of mem fragmentation from log lines.
I rank fragmentation very low, because
- we are in practice several orders of magnitude away from being limited by the amount of RAM that is available. (In practice, until recently, osmo-hnbgw was leaking *all* of its RUA messages *all of the time* into asn1_ctx, and even so it took a site with >20 HNBs days or weeks of live operation until it reached a memory limit and needed to restart.)
- our "long lived objects" created after program startup tend to have a limited lifetime, so fragmentation, if any, will usually go away in an hour or a day at the latest.
Bug resistance 1: in the first version, if the resulting string gets too large, output is truncated ungracefully, *without us noticing*. We constantly have to weigh the size of static buffers that are reserved (want it to be small) against the string length we want to guarantee to be visible (want it to be long); truncating strings gracefully adds a lot of nontrivial code. OTC_SELECT has no such problems *at all*.
Bug resistance 2: in the first version, there is a lot to get wrong. If we mix up buf and buf2, or accidentally use the same buffer twice, we get very confusing buggy log output that is hard to make sense of. Stressing the point that these bugs are hard to spot even when reading the code to the letter. OTC_SELECT has no such problems *at all*.
Code reusability: when the LOG() code is copied to some other function, the first version needs the local buffers to be declared in the other function as well; the OTC_SELECT log line can simply be copied anywhere else without giving any further thought to buffers and their sizes or their names in the local scope. There are no buffer related hidden copy-paste bugs possible with OTC_SELECT.
Magic numbers: for buffers on the stack, every caller decides on the buffer size separately. For the to_str_c() functions, the initial default buffer size is defined once and can be adjusted in a single place. (and also the initial buffer size is not a limit on the result.)
Readability: The first version clutters both the local var scope and the LOG arguments. It is much longer code.
In summary, it is hard to be worse than static buffers for logging (both globally or on the stack), because those bring in all sorts of problems. They are the opposite of elegant and they have been pestering me with obscure bugs every so often. If we need an alternative to OTC_SELECT in logging, we need one that is better than declaring static buffers.
To me it seems the only possible flaw in my argument might be that the point 'Load efficiency 2' is starkly wrong. Can you argue / prove that this is the case? In my experience with audio synthesis hacking, rapidly allocating and deallocating buffers is not causing a performance problem.
> I'm talking about the architectural/global implications of doing this in the long term. It may be only here now, but tomorrow will be there and there
My point being that using OTC_SELECT everywhere would be desirable.
As long as we are using static/stack buffers, we have cluttered and buggy code that is hard to maintain, read, and re-use.
> and we end up with a big clutter of things
I have shown above that OTC_SELECT *reduces* clutter.
> where people keep adding log lines to OTC_SELECT "because other code is already doing it".
This sounds like you are right, without actually any substance to the argument. You are still saying it is bad "because it is bad", which is not sufficient for a CR argument.
Let's see some actual reasoning, and if that is sound, I'll be happy to oblige.
So far, there is no *reason* against this code.
I'm pulling my hair here because I do not understand what you guys are getting at, because you are not providing any insights to carry the opinion. So far there is no actual substance to your CR anywhere. What am I missing??
--
To view, visit https://gerrit.osmocom.org/c/osmo-hnbgw/+/33169
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings
Gerrit-Project: osmo-hnbgw
Gerrit-Branch: master
Gerrit-Change-Id: Id903b8579ded8b908e644808e440d373bcca8da4
Gerrit-Change-Number: 33169
Gerrit-PatchSet: 4
Gerrit-Owner: neels <nhofmeyr(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Reviewer: laforge <laforge(a)osmocom.org>
Gerrit-Reviewer: pespin <pespin(a)sysmocom.de>
Gerrit-Attention: laforge <laforge(a)osmocom.org>
Gerrit-Attention: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Attention: pespin <pespin(a)sysmocom.de>
Gerrit-Comment-Date: Thu, 15 Jun 2023 02:41:42 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: neels <nhofmeyr(a)sysmocom.de>
Comment-In-Reply-To: laforge <laforge(a)osmocom.org>
Comment-In-Reply-To: pespin <pespin(a)sysmocom.de>
Gerrit-MessageType: comment
daniel has submitted this change. ( https://gerrit.osmocom.org/c/libosmo-netif/+/33319 )
Change subject: stream: Correctly close osmo_stream_cli when in state WAIT_RECONNECT
......................................................................
stream: Correctly close osmo_stream_cli when in state WAIT_RECONNECT
If cli is in STREAM_CLI_STATE_WAIT_RECONNECT cli->timer is scheduled.
Delete the timer if osmo_steam_cli_close() called so we don't end up
reconnecting. Also fixes the recent test failures in
master-libosmo-sccp Jenkins jobs.
Change-Id: Ic5227432192c4007f861f171ae961a8f4dea6522
---
M src/stream.c
1 file changed, 20 insertions(+), 0 deletions(-)
Approvals:
Jenkins Builder: Verified
pespin: Looks good to me, but someone else must approve
daniel: Looks good to me, approved
diff --git a/src/stream.c b/src/stream.c
index 6fb41f1..23efe24 100644
--- a/src/stream.c
+++ b/src/stream.c
@@ -308,6 +308,12 @@
{
if (cli->state == STREAM_CLI_STATE_CLOSED)
return;
+ if (cli->state == STREAM_CLI_STATE_WAIT_RECONNECT) {
+ osmo_timer_del(&cli->timer);
+ cli->state = STREAM_CLI_STATE_CLOSED;
+ return;
+ }
+
osmo_fd_unregister(&cli->ofd);
close(cli->ofd.fd);
cli->ofd.fd = -1;
--
To view, visit https://gerrit.osmocom.org/c/libosmo-netif/+/33319
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings
Gerrit-Project: libosmo-netif
Gerrit-Branch: master
Gerrit-Change-Id: Ic5227432192c4007f861f171ae961a8f4dea6522
Gerrit-Change-Number: 33319
Gerrit-PatchSet: 1
Gerrit-Owner: daniel <dwillmann(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: daniel <dwillmann(a)sysmocom.de>
Gerrit-Reviewer: pespin <pespin(a)sysmocom.de>
Gerrit-MessageType: merged