fixeria has uploaded this change for review. ( https://gerrit.osmocom.org/c/osmo-bts/+/42851?usp=email )
Change subject: osmo-bts-trx: shut down on stale clock indication from transceiver
......................................................................
osmo-bts-trx: shut down on stale clock indication from transceiver
We expect the transceiver to be a reliable, monotonic clock source. If it reports a frame number (FN) far behind our local timer (elapsed_fn < 0) while far more wall-clock time has elapsed than its FN advance accounts for, its clock has likely stalled and the indication carries a stale frame number. Acting on it drags the scheduler backwards and re-transmits already-sent TDMA frames, corrupting lchan-internal state.
Detect this condition and shut down the process, following the same rationale as the existing "PC clock skew too high" check in trx_fn_timer_cb().
Co-Authored-By: Claude Opus 4.8 (1M context) noreply@anthropic.com
Change-Id: If787ab7ed70aa2dcb0389ceb58620c2302c3431a
Related: OS#7020, OS#6794
---
M src/osmo-bts-trx/scheduler_trx.c
1 file changed, 16 insertions(+), 0 deletions(-)
git pull ssh://gerrit.osmocom.org:29418/osmo-bts refs/changes/51/42851/1
diff --git a/src/osmo-bts-trx/scheduler_trx.c b/src/osmo-bts-trx/scheduler_trx.c
index d791eaf..8c6f547 100644
--- a/src/osmo-bts-trx/scheduler_trx.c
+++ b/src/osmo-bts-trx/scheduler_trx.c
@@ -582,6 +582,22 @@
 	/* check for max clock skew */
 	if (elapsed_fn > MAX_FN_SKEW || elapsed_fn < -MAX_FN_SKEW) {
+		/* If the transceiver reports an FN far BEHIND our local timer
+		 * (elapsed_fn < 0) while far more wall-clock time elapsed than its FN
+		 * advance accounts for (error_us_since_clk large positive), then its
+		 * clock has stalled and this CLCK.ind carries a stale (outdated) frame
+		 * number. Acting on a stale indication would drag the scheduler backwards
+		 * and corrupt lchan-internal state(s). Treat this as a fatal condition
+		 * and shut down -- same rationale as the "PC clock skew too high"
+		 * check in trx_fn_timer_cb(). */
+		if (elapsed_fn < 0 &&
+		    error_us_since_clk > (int64_t)GSM_TDMA_FN_DURATION_uS * MAX_FN_SKEW) {
+			LOGP(DL1C, LOGL_FATAL, "Stale CLCK.ind: fn=%u is %"PRId64" us behind\n",
+			     fn, error_us_since_clk);
+			osmo_timerfd_disable(&tcs->fn_timer_ofd);
+			bts_shutdown(bts, "TRX clock skew too high");
+			return -1;
+		}
 		LOGP(DL1C, LOGL_NOTICE, "GSM clock skew: old fn=%u, "
 			"new fn=%u\n", tcs->last_fn_timer.fn, fn);
 		return trx_setup_clock(bts, tcs, &tv_now, &interval, fn);
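
For readers who want to reason about the new condition in isolation, here is a minimal standalone sketch of the predicate the hunk adds. It is not part of the patch. The helper names (clck_ind_is_stale, clck_ind_self_check) and the SKETCH_* constants are hypothetical, and the numeric values are assumptions chosen for illustration; the patch itself uses GSM_TDMA_FN_DURATION_uS and MAX_FN_SKEW from the project headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, for illustration only; the real constants are defined
 * in the osmo-bts / libosmocore sources and may differ. */
#define SKETCH_FN_DURATION_US 4615 /* approx. one TDMA frame, in microseconds */
#define SKETCH_MAX_FN_SKEW    50   /* tolerated skew, in frames */

/* Hypothetical helper mirroring the condition in the hunk above:
 * the reported FN is behind the local timer (elapsed_fn < 0) AND the
 * wall-clock error exceeds what the tolerated frame skew would explain. */
bool clck_ind_is_stale(int32_t elapsed_fn, int64_t error_us_since_clk)
{
	const int64_t max_error_us =
		(int64_t)SKETCH_FN_DURATION_US * SKETCH_MAX_FN_SKEW;

	return elapsed_fn < 0 && error_us_since_clk > max_error_us;
}

/* Small self-check exercising both halves of the condition. */
void clck_ind_self_check(void)
{
	/* FN behind and large wall-clock error: treated as stale */
	assert(clck_ind_is_stale(-100, 500000));
	/* FN behind but small wall-clock error: not stale */
	assert(!clck_ind_is_stale(-100, 1000));
	/* FN ahead: never stale, whatever the wall-clock error */
	assert(!clck_ind_is_stale(100, 500000));
}
```

Both halves of the condition must hold. A negative elapsed_fn alone still falls through to the existing "GSM clock skew" path, which re-synchronizes via trx_setup_clock() instead of shutting down.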