laforge has uploaded this change for review. ( https://gerrit.osmocom.org/c/osmo-remsim/+/42300?usp=email )
Change subject: bankd: Avoid osmocom logging mutex deadlock in signal handling
......................................................................
bankd: Avoid osmocom logging mutex deadlock in signal handling
The main thread communicates slotmap add + delete via POSIX signals
to the worker threads. As those signals interrupt the normal
processing of the worker thread, they might get delivered while the
thread is already logging something, causing a deadlock. This has
been observed in the real world in the following stack trace (where it's
actually two nested signals):
As a hot-fix, let's avoid logging from the handle_sig_map{del,add}()
functions at all, making them safe against a deadlock around this mutex.
We should decide how to proceed in general with potentially some
architectural changes later on; any such changes are not suitable as a
hot fix due to their potential of introducing other regressions.
Change-Id: I5ea32886dfaf624b4dc5ad7924941c7b904c1d36
Related: SYS#7930
---
M src/bankd/bankd_main.c
1 file changed, 12 insertions(+), 10 deletions(-)
git pull ssh://gerrit.osmocom.org:29418/osmo-remsim refs/changes/00/42300/1
diff --git a/src/bankd/bankd_main.c b/src/bankd/bankd_main.c
index 1adc34c..5e33b41 100644
--- a/src/bankd/bankd_main.c
+++ b/src/bankd/bankd_main.c
@@ -521,9 +521,10 @@
static int worker_send_rspro(struct bankd_worker *worker, RsproPDU_t *pdu);
-static void worker_set_state(struct bankd_worker *worker, enum bankd_worker_state new_state)
+static void worker_set_state(struct bankd_worker *worker, enum bankd_worker_state new_state, bool quiet)
{
- LOGW(worker, "Changing state to %s\n", get_value_string(worker_state_names, new_state));
+ if (!quiet)
+ LOGW(worker, "Changing state to %s\n", get_value_string(worker_state_names, new_state));
worker->state = new_state;
worker->timeout = 0;
}
@@ -540,25 +541,26 @@
/* signal handler for receiving SIGMAPDEL from main thread */
static void handle_sig_mapdel(int sig)
{
- LOGW(g_worker, "SIGMAPDEL received: Main thread informs us our map is gone\n");
+ /* DO NOT LOG ANYTHING HERE, IT WILL DEADLOCK WITH THE osmo_log_tgt_mutex */
OSMO_ASSERT(sig == SIGMAPDEL);
if (g_worker->state >= BW_ST_CONN_CLIENT_MAPPED) {
g_worker->slot.bank_id = 0xffff;
g_worker->slot.slot_nr = 0xffff;
- worker_set_state(g_worker, BW_ST_CONN_CLIENT_UNMAPPED);
+ worker_set_state(g_worker, BW_ST_CONN_CLIENT_UNMAPPED, true);
}
}
/* signal handler for receiving SIGMAPADD from main thread */
static void handle_sig_mapadd(int sig)
{
- LOGW(g_worker, "SIGMAPADD received\n");
+ /* DO NOT LOG ANYTHING HERE, IT WILL DEADLOCK WITH THE osmo_log_tgt_mutex */
/* do nothing */
}
static void handle_sig_usr1(int sig)
{
OSMO_ASSERT(sig == SIGUSR1);
+ /* FIXME: WE SHOULD NOT LOG ANYTHING HERE, IT WILL DEADLOCK WITH THE osmo_log_tgt_mutex */
if (pthread_equal(g_bankd->main, pthread_self())) {
struct bankd_worker *worker;
@@ -612,7 +614,7 @@
if (rc < 0)
return rc;
- worker_set_state(worker, BW_ST_CONN_CLIENT_MAPPED_CARD);
+ worker_set_state(worker, BW_ST_CONN_CLIENT_MAPPED_CARD, false);
/* FIXME: notify client about this state change */
return 0;
@@ -763,7 +765,7 @@
}
worker->client.clslot.client_id = pdu->msg.choice.connectClientReq.clientSlot->clientId;
worker->client.clslot.slot_nr = pdu->msg.choice.connectClientReq.clientSlot->slotNr;
- worker_set_state(worker, BW_ST_CONN_CLIENT);
+ worker_set_state(worker, BW_ST_CONN_CLIENT, false);
if (worker_try_slotmap(worker) >= 0)
res = ResultCode_ok;
@@ -1041,7 +1043,7 @@
g_worker = (struct bankd_worker *) arg;
- worker_set_state(g_worker, BW_ST_INIT);
+ worker_set_state(g_worker, BW_ST_INIT, false);
/* not permitted in multithreaded environment */
talloc_disable_null_tracking();
@@ -1065,7 +1067,7 @@
g_worker->client.peer_addr_len = sizeof(g_worker->client.peer_addr);
- worker_set_state(g_worker, BW_ST_ACCEPTING);
+ worker_set_state(g_worker, BW_ST_ACCEPTING, false);
/* first wait for an incoming TCP connection */
rc = accept(g_worker->bankd->accept_fd, (struct sockaddr *) &g_worker->client.peer_addr,
&g_worker->client.peer_addr_len);
@@ -1075,7 +1077,7 @@
g_worker->client.fd = rc;
worker_client_addrstr(buf, sizeof(buf), g_worker);
LOGW(g_worker, "Accepted connection from %s\n", buf);
- worker_set_state(g_worker, BW_ST_CONN_WAIT_ID);
+ worker_set_state(g_worker, BW_ST_CONN_WAIT_ID, false);
/* run the main worker transceive loop body until there was some error */
while (1) {
--
To view, visit https://gerrit.osmocom.org/c/osmo-remsim/+/42300?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings?usp=email
Gerrit-MessageType: newchange
Gerrit-Project: osmo-remsim
Gerrit-Branch: master
Gerrit-Change-Id: I5ea32886dfaf624b4dc5ad7924941c7b904c1d36
Gerrit-Change-Number: 42300
Gerrit-PatchSet: 1
Gerrit-Owner: laforge <laforge(a)osmocom.org>
Attention is currently required from: laforge.
fixeria has posted comments on this change by fixeria. ( https://gerrit.osmocom.org/c/erlang/osmo-s1gw/+/42295?usp=email )
Change subject: enb_registry: call enb_metrics_register/1 from a proper place
......................................................................
Patch Set 2:
(1 comment)
Patchset:
PS1:
> Would have been interesting to read why the old one was improper
Explained in the commit message.
--
To view, visit https://gerrit.osmocom.org/c/erlang/osmo-s1gw/+/42295?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings?usp=email
Gerrit-MessageType: comment
Gerrit-Project: erlang/osmo-s1gw
Gerrit-Branch: master
Gerrit-Change-Id: I38237463aa9c968f89bf4f195407a18cba7e73c9
Gerrit-Change-Number: 42295
Gerrit-PatchSet: 2
Gerrit-Owner: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: laforge <laforge(a)osmocom.org>
Gerrit-Attention: laforge <laforge(a)osmocom.org>
Gerrit-Comment-Date: Mon, 09 Mar 2026 10:42:39 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: laforge <laforge(a)osmocom.org>
Attention is currently required from: fixeria, laforge.
Hello Jenkins Builder, laforge,
I'd like you to reexamine a change. Please visit
https://gerrit.osmocom.org/c/erlang/osmo-s1gw/+/42295?usp=email
to look at the new patch set (#2).
The following approvals got outdated and were removed:
Code-Review+1 by laforge
Change subject: enb_registry: call enb_metrics_register/1 from a proper place
......................................................................
enb_registry: call enb_metrics_register/1 from a proper place
enb_metrics_register/1 requires the Global-eNB-ID to register
per-eNB metrics. Calling it for every event, which may or may not
contain the Global-eNB-ID, is suboptimal.
Instead, invoke it from the enb_handle_event/2 clause that handles
the {s1setup, GENBId} event, where the Global-Enb-ID is guaranteed
to be available.
Change-Id: I38237463aa9c968f89bf4f195407a18cba7e73c9
---
M src/enb_registry.erl
1 file changed, 6 insertions(+), 7 deletions(-)
git pull ssh://gerrit.osmocom.org:29418/erlang/osmo-s1gw refs/changes/95/42295/2
--
To view, visit https://gerrit.osmocom.org/c/erlang/osmo-s1gw/+/42295?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings?usp=email
Gerrit-MessageType: newpatchset
Gerrit-Project: erlang/osmo-s1gw
Gerrit-Branch: master
Gerrit-Change-Id: I38237463aa9c968f89bf4f195407a18cba7e73c9
Gerrit-Change-Number: 42295
Gerrit-PatchSet: 2
Gerrit-Owner: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: laforge <laforge(a)osmocom.org>
Gerrit-Attention: laforge <laforge(a)osmocom.org>
Gerrit-Attention: fixeria <vyanitskiy(a)sysmocom.de>
Attention is currently required from: fixeria.
pespin has posted comments on this change by fixeria. ( https://gerrit.osmocom.org/c/erlang/osmo-s1gw/+/42293?usp=email )
Change subject: pfcp_peer: replace watchdog process with a timer
......................................................................
Patch Set 1: Code-Review+1
--
To view, visit https://gerrit.osmocom.org/c/erlang/osmo-s1gw/+/42293?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings?usp=email
Gerrit-MessageType: comment
Gerrit-Project: erlang/osmo-s1gw
Gerrit-Branch: master
Gerrit-Change-Id: I8d71ad8300feefb0aecbf690a825a2b4e9f1102c
Gerrit-Change-Number: 42293
Gerrit-PatchSet: 1
Gerrit-Owner: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: laforge <laforge(a)osmocom.org>
Gerrit-Reviewer: pespin <pespin(a)sysmocom.de>
Gerrit-Attention: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Comment-Date: Mon, 09 Mar 2026 09:41:48 +0000
Gerrit-HasComments: No
Gerrit-Has-Labels: Yes
osmith has submitted this change. ( https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/42285?usp=email )
Change subject: hlr: pyhss: run the pyhss_hss service
......................................................................
hlr: pyhss: run the pyhss_hss service
Database preparations used to be done in PyHSS by all services if they
noticed that this was needed. The time between checking and creating the
tables caused a race condition where two services attempting to create
tables at the same time will result at one of them failing, we have seen
this in our ttcn3-hlr-test-pyhss jobs sometimes:
[Database] [DEBUG] Table apn already exists
[Database] [DEBUG] Table auc already exists
[Database] [DEBUG] Table subscriber already exists
…
[testenv][pyhss] pyhss_gsup: setup script failed
I have fixed this upstream by letting only the main service (pyhss_hss)
prepare the database:
https://github.com/nickvsnetworking/pyhss/commit/8b8a2202c345fbb7262c9d07d0…
This means that we now need to run the pyhss_hss service in the HLR
tests, so pyhss_gsup doesn't fail with:
[Database] [INFO] Waiting for the main service to prepare the database
ERROR: 127.0.0.1:4222 did not become available within 5s!
[testenv][pyhss] pyhss_gsup: setup script failed
Change-Id: I4fe689c8d8617432175ba403b45021c0f646970b
---
M hlr/testenv_pyhss.cfg
1 file changed, 5 insertions(+), 0 deletions(-)
Approvals:
Jenkins Builder: Verified
laforge: Looks good to me, but someone else must approve
fixeria: Looks good to me, approved
diff --git a/hlr/testenv_pyhss.cfg b/hlr/testenv_pyhss.cfg
index 7a70872..2e81987 100644
--- a/hlr/testenv_pyhss.cfg
+++ b/hlr/testenv_pyhss.cfg
@@ -10,6 +10,11 @@
package=no
copy=pyhss/redis.conf
+[pyhss_hss]
+program=cd ../pyhss_gsup && ./run_in_venv.sh pyhss_hss
+make=pyhss
+package=pyhss
+
[pyhss_gsup]
program=./run_in_venv.sh pyhss_gsup
setup=wait_for_port.py -p 4222
--
To view, visit https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/42285?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings?usp=email
Gerrit-MessageType: merged
Gerrit-Project: osmo-ttcn3-hacks
Gerrit-Branch: master
Gerrit-Change-Id: I4fe689c8d8617432175ba403b45021c0f646970b
Gerrit-Change-Number: 42285
Gerrit-PatchSet: 2
Gerrit-Owner: osmith <osmith(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Reviewer: laforge <laforge(a)osmocom.org>
Gerrit-Reviewer: osmith <osmith(a)sysmocom.de>
Gerrit-Reviewer: pespin <pespin(a)sysmocom.de>
osmith has submitted this change. ( https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/42286?usp=email )
Change subject: {hlr,hss}/pyhss/config: remove unused options
......................................................................
{hlr,hss}/pyhss/config: remove unused options
PyHSS had several unused config options. They have been removed
upstream, remove them from the osmo-ttcn3-hacks configs as well.
Related: https://github.com/nickvsnetworking/pyhss/pull/284
Change-Id: I87cf01e00fe0a3b32be9eaa4cf4c1ddf02cddc0b
---
M hlr/pyhss/config.yaml
M hss/pyhss/config.yaml
2 files changed, 0 insertions(+), 52 deletions(-)
Approvals:
fixeria: Looks good to me, but someone else must approve
laforge: Looks good to me, approved
Jenkins Builder: Verified
diff --git a/hlr/pyhss/config.yaml b/hlr/pyhss/config.yaml
index 87e6c53..61b819e 100644
--- a/hlr/pyhss/config.yaml
+++ b/hlr/pyhss/config.yaml
@@ -8,12 +8,6 @@
site_name: "TTCN3"
MCC: "001"
MNC: "01"
- SLh_enabled: False
- #IMSI of Test Subscriber for Unit Checks (Optional)
- test_sub_imsi: '001021234567890'
-
- #The maximum time to wait, in seconds, before disconnecting a client when no data is received.
- client_socket_timeout: 120
#The maximum time to wait, in seconds, before disconnecting a client when no data is received.
client_socket_timeout: 300
@@ -39,13 +33,6 @@
#If enabled sends CLRs to old MME when new MME attaches active sub
CancelLocationRequest_Enabled: False
- #Workaround for some MMEs to force an Insert Subscriber Data request to be sent immediately after ULA
- Insert_Subscriber_Data_Force: False
-
- #Default Initial Filter Criteria for IMS Subscribers
- #Jinja Formatted Template, see the example for variables passed to it.
- Default_iFC: 'default_ifc.xml'
-
#Default Sh User Data
Default_Sh_UserData: 'default_sh_user_data.xml'
@@ -124,11 +111,7 @@
database:
db_type: sqlite
server: 127.0.0.1
- username: dbeaver
- password: password
database: pyhss.db
- readCacheEnabled: True
- readCacheInterval: 60
## External Webhook Notifications
webhooks:
@@ -160,18 +143,6 @@
unixSocketPath: '/var/run/redis/redis-server.sock'
host: localhost
port: 6379
- sentinel:
- masterName: exampleMaster
- hosts:
- - exampleSentinel.mnc001.mcc001.3gppnetwork.org:
- port: 6379
- password: ''
-
-
-prometheus:
- enabled: False
- port: 8081 #If the API is run the API runs on the next port number up from this
- async_subscriber_count: False #If enabled the subscriber count will be updated asynchronously for Prometheus
influxdb:
enabled: False
diff --git a/hss/pyhss/config.yaml b/hss/pyhss/config.yaml
index 3114820..c951beb 100644
--- a/hss/pyhss/config.yaml
+++ b/hss/pyhss/config.yaml
@@ -8,12 +8,6 @@
site_name: "TTCN3"
MCC: "001"
MNC: "01"
- SLh_enabled: False
- #IMSI of Test Subscriber for Unit Checks (Optional)
- test_sub_imsi: '001021234567890'
-
- #The maximum time to wait, in seconds, before disconnecting a client when no data is received.
- client_socket_timeout: 120
#The maximum time to wait, in seconds, before disconnecting a client when no data is received.
client_socket_timeout: 300
@@ -39,13 +33,6 @@
#If enabled sends CLRs to old MME when new MME attaches active sub
CancelLocationRequest_Enabled: False
- #Workaround for some MMEs to force an Insert Subscriber Data request to be sent immediately after ULA
- Insert_Subscriber_Data_Force: False
-
- #Default Initial Filter Criteria for IMS Subscribers
- #Jinja Formatted Template, see the example for variables passed to it.
- Default_iFC: 'default_ifc.xml'
-
#Default Sh User Data
Default_Sh_UserData: 'default_sh_user_data.xml'
@@ -121,11 +108,7 @@
database:
db_type: sqlite
server: 127.0.0.1
- username: dbeaver
- password: password
database: pyhss.db
- readCacheEnabled: True
- readCacheInterval: 60
## External Webhook Notifications
webhooks:
@@ -157,12 +140,6 @@
unixSocketPath: '/var/run/redis/redis-server.sock'
host: localhost
port: 6379
- sentinel:
- masterName: exampleMaster
- hosts:
- - exampleSentinel.mnc001.mcc001.3gppnetwork.org:
- port: 6379
- password: ''
influxdb:
enabled: False
--
To view, visit https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/42286?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings?usp=email
Gerrit-MessageType: merged
Gerrit-Project: osmo-ttcn3-hacks
Gerrit-Branch: master
Gerrit-Change-Id: I87cf01e00fe0a3b32be9eaa4cf4c1ddf02cddc0b
Gerrit-Change-Number: 42286
Gerrit-PatchSet: 2
Gerrit-Owner: osmith <osmith(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Reviewer: laforge <laforge(a)osmocom.org>
Gerrit-Reviewer: osmith <osmith(a)sysmocom.de>