neels has submitted this change. ( https://gerrit.osmocom.org/c/osmo-bsc/+/28061 )
Change subject: fix fallout from: 'stats: new trackers for lchan life duration'
......................................................................
fix fallout from: 'stats: new trackers for lchan life duration'
In lchan_fsm_cleanup(), ensure that the time_cc timer is actually inactive before deallocating. Do so via lchan_reset(), to also make sure the timer is stopped in all other situations where the lchan is deactivated.
This fixes an infinite-loop deadlock as described in OS#5554:
- run BSC_Tests.TC_chan_act_ack_est_ind_noreply
- restart the BTS process after the test is done
- osmo-bsc enters infinite loop in osmo_timer_del()
The reason is that lchan_fsm_cleanup() fails to stop a running active_cc timer upon lchan deallocation. TC_chan_act_ack_est_ind_noreply incidentally terminates OML while the timer is still active.
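The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not the osmo-bsc or libosmocore code: every toy_* name is hypothetical, and the timer registry is simplified to a circular doubly-linked list purely for illustration. It only shows the pattern the fix relies on, namely that the reset path unlinks the timer before the structure is wiped or freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a timer that lives in a global registry. */
struct toy_timer {
	struct toy_timer *prev, *next;
	bool active;
};

/* Circular list head; an empty registry points at itself. */
static struct toy_timer toy_timer_list = { &toy_timer_list, &toy_timer_list, false };

/* Link the timer at the tail of the registry (no-op if already active). */
static void toy_timer_add(struct toy_timer *t)
{
	if (t->active)
		return;
	t->next = &toy_timer_list;
	t->prev = toy_timer_list.prev;
	toy_timer_list.prev->next = t;
	toy_timer_list.prev = t;
	t->active = true;
}

/* Unlink the timer from the registry (no-op if not active). */
static void toy_timer_del(struct toy_timer *t)
{
	if (!t->active)
		return;
	t->prev->next = t->next;
	t->next->prev = t->prev;
	t->prev = t->next = NULL;
	t->active = false;
}

/* Count the timers currently linked into the registry. */
static size_t toy_timer_count(void)
{
	size_t n = 0;
	for (struct toy_timer *t = toy_timer_list.next; t != &toy_timer_list; t = t->next)
		n++;
	return n;
}

/* Hypothetical channel object that embeds a timer, loosely like lchan->active_cc. */
struct toy_lchan {
	struct toy_timer active_cc_timer;
};

/* Reset path: stop the timer first, then wipe all volatile state. */
static void toy_lchan_reset(struct toy_lchan *lchan)
{
	toy_timer_del(&lchan->active_cc_timer);
	*lchan = (struct toy_lchan){0};
}

/* Cleanup path: go through reset so no registry entry outlives the object. */
static void toy_lchan_free(struct toy_lchan *lchan)
{
	toy_lchan_reset(lchan);
	assert(!lchan->active_cc_timer.active);
	free(lchan);
}
```

In this sketch, freeing a channel whose timer was armed with toy_timer_add() leaves the registry empty. Without the toy_timer_del() call in toy_lchan_reset(), the registry would keep a pointer into wiped or freed memory, and any later walk over it could misbehave.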
Related: OS#5554
Change-Id: I901bb86a78d7d021c8efe751fd9d93e5956ac0e0
---
M src/osmo-bsc/lchan_fsm.c
1 file changed, 8 insertions(+), 0 deletions(-)
Approvals:
  neels: Looks good to me, approved
  pespin: Looks good to me, but someone else must approve
  osmith: Looks good to me, but someone else must approve
  Jenkins Builder: Verified
diff --git a/src/osmo-bsc/lchan_fsm.c b/src/osmo-bsc/lchan_fsm.c
index d693189..6854465 100644
--- a/src/osmo-bsc/lchan_fsm.c
+++ b/src/osmo-bsc/lchan_fsm.c
@@ -528,6 +528,14 @@
 		lchan->mgw_endpoint_ci_bts = NULL;
 	}
 
+	/* Ensure that the osmo_timer in lchan->active_cc is stopped. This is particularly important for lchan FSM
+	 * deallocation, so that the timer is no longer active when the lchan FSM instance gets discarded
+	 * (lchan_fsm_cleanup() calls this function), see OS#5554.
+	 *
+	 * Besides that, it is also good to make sure the timer is stopped when the lchan resets, to avoid any false
+	 * counts being accumulated, however obscure an error situation may be. */
+	osmo_time_cc_cleanup(&lchan->active_cc);
+
 	/* NUL all volatile state */
 	*lchan = (struct gsm_lchan){
 		.ts = lchan->ts,