fixeria has uploaded this change for review. ( https://gerrit.osmocom.org/c/osmo-bts/+/42855?usp=email )
Change subject: osmo-bts-trx: fix spurious clock skew shutdown after self-compensation
......................................................................
osmo-bts-trx: fix spurious clock skew shutdown after self-compensation
When the BTS runs ahead of the transceiver (elapsed_fn < 0),
trx_sched_clock() reschedules the timerfd to deliberately delay the
next FN. osmo_timerfd_schedule() resets the timerfd and discards any
accumulated expirations, but last_fn_timer.tv was left pointing at
the previous callback. The next trx_fn_timer_cb() then measures
elapsed_us all the way back to that previous callback - spanning the
deliberate delay (or any OS stall that preceded us) - and falsely
trips the "PC clock skew too high" check, shutting the BTS down
for no good reason.
Advance last_fn_timer.tv to the projected firing time of the
rescheduled timer so that the next callback measures roughly
one FN interval, as expected.
Change-Id: Icdb7db8abe70258ae008d9514b6608bd74bb2881
AI-Assisted: yes (Claude)
Related: OS#6794
---
M src/osmo-bts-trx/scheduler_trx.c
1 file changed, 10 insertions(+), 3 deletions(-)
git pull ssh://gerrit.osmocom.org:29418/osmo-bts refs/changes/55/42855/1
diff --git a/src/osmo-bts-trx/scheduler_trx.c b/src/osmo-bts-trx/scheduler_trx.c
index 105ad38..a12ac3f 100644
--- a/src/osmo-bts-trx/scheduler_trx.c
+++ b/src/osmo-bts-trx/scheduler_trx.c
@@ -603,13 +603,20 @@
/* too many frames have been processed already */
if (elapsed_fn < 0) {
struct timespec first = interval;
- /* set clock to the time or last FN should have been
- * transmitted. */
+ /* set clock to the time our next FN has to be transmitted */
first.tv_nsec += (0 - elapsed_fn) * GSM_TDMA_FN_DURATION_nS;
normalize_timespec(&first);
LOGP(DL1C, LOGL_NOTICE, "We were %d FN faster than TRX, compensating\n", -elapsed_fn);
- /* set time to the time our next FN has to be transmitted */
osmo_timerfd_schedule(&tcs->fn_timer_ofd, &first, &interval);
+ /* Advance last_fn_timer.tv to the projected firing time of the rescheduled
+ * timer (shifted back one interval). osmo_timerfd_schedule() resets the
+ * timerfd, discarding any accumulated expirations, so without this update
+ * trx_fn_timer_cb() would measure elapsed_us all the way back to the previous
+ * callback — spanning the deliberate delay or any OS stall that preceded us —
+ * and falsely trip the clock skew threshold. */
+ tcs->last_fn_timer.tv = tv_now;
+ tcs->last_fn_timer.tv.tv_nsec += (0 - elapsed_fn) * GSM_TDMA_FN_DURATION_nS;
+ normalize_timespec(&tcs->last_fn_timer.tv);
return 0;
}
--
To view, visit https://gerrit.osmocom.org/c/osmo-bts/+/42855?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings?usp=email
Gerrit-MessageType: newchange
Gerrit-Project: osmo-bts
Gerrit-Branch: master
Gerrit-Change-Id: Icdb7db8abe70258ae008d9514b6608bd74bb2881
Gerrit-Change-Number: 42855
Gerrit-PatchSet: 1
Gerrit-Owner: fixeria <vyanitskiy(a)sysmocom.de>
fixeria has uploaded this change for review. ( https://gerrit.osmocom.org/c/osmo-bts/+/42854?usp=email )
Change subject: common: track whether gsm_time has been initialized
......................................................................
common: track whether gsm_time has been initialized
l1sap_info_time_ind() used 'bts->gsm_time.fn != 0' as a proxy for
"we have a previous frame number to diff against". This is unreliable:
Fn=0 is a _valid_ frame number, recurring on every hyperframe wrap.
If gsm_time.fn happened to be 0 and the next time indication jumped
forward by more than one frame, the real gap was silently swallowed.
It also gave no clean way to suppress the bogus "Invalid condition
detected: Frame difference is ..." message that appears when the PHY
(re)starts its TDMA frame number (e.g. from 0) on bring-up.
Introduce an explicit 'bts->gsm_time_valid' flag instead:
* l1sap_info_time_ind() treats the first indication of an epoch as
having no gap (frames_expired = 0): no warning, no RACH-slot
accounting;
* the flag is cleared in st_op_disabled_notinstalled_on_enter(), so
each BTS bring-up starts a fresh clock epoch regardless of which
FN the PHY reports first.
Change-Id: I7022b0ad084a0c224f7e8c04aca0648915b1a1c6
AI-Assisted: yes (Claude)
Related: OS#7020
---
M include/osmo-bts/bts.h
M src/common/l1sap.c
M src/common/nm_bts_fsm.c
3 files changed, 17 insertions(+), 6 deletions(-)
git pull ssh://gerrit.osmocom.org:29418/osmo-bts refs/changes/54/42854/1
diff --git a/include/osmo-bts/bts.h b/include/osmo-bts/bts.h
index f2287b0..9691a03 100644
--- a/include/osmo-bts/bts.h
+++ b/include/osmo-bts/bts.h
@@ -337,6 +337,11 @@
uint8_t tc4_ctr;
} si;
struct gsm_time gsm_time;
+ /* false until the first PH-TIME.ind of the current clock epoch initializes
+ * gsm_time; reset in st_op_disabled_notinstalled_on_enter(). Used to tell a
+ * fresh clock start (PHY (re)started its TDMA FN, possibly from 0) apart from
+ * a real frame-number gap in l1sap_info_time_ind(). */
+ bool gsm_time_valid;
/* frame number statistics (FN in PH-RTS.ind vs. PH-DATA.ind */
struct {
int32_t min; /* minimum observed */
diff --git a/src/common/l1sap.c b/src/common/l1sap.c
index e150398..78938f8 100644
--- a/src/common/l1sap.c
+++ b/src/common/l1sap.c
@@ -621,13 +621,16 @@
DEBUGPFN(DL1P, info_time_ind->fn, "Rx MPH_INFO time ind\n");
- /* Calculate and check frame difference */
- frames_expired = GSM_TDMA_FN_SUB(info_time_ind->fn, bts->gsm_time.fn);
+ /* Calculate and check frame difference. On the first indication of a clock
+ * epoch (e.g. the PHY just (re)started its TDMA FN) there is no previous FN to
+ * diff against, so report no gap and account for no expired RACH slots. */
+ frames_expired = bts->gsm_time_valid ?
+ GSM_TDMA_FN_SUB(info_time_ind->fn, bts->gsm_time.fn) : 0;
+ bts->gsm_time_valid = true;
if (frames_expired > 1) {
- if (bts->gsm_time.fn)
- LOGPFN(DL1P, LOGL_ERROR, info_time_ind->fn,
- "Invalid condition detected: Frame difference is %"PRIu32"-%"PRIu32"=%u > 1!\n",
- info_time_ind->fn, bts->gsm_time.fn, frames_expired);
+ LOGPFN(DL1P, LOGL_ERROR, info_time_ind->fn,
+ "Invalid condition detected: Frame difference is %"PRIu32"-%"PRIu32"=%u > 1!\n",
+ info_time_ind->fn, bts->gsm_time.fn, frames_expired);
}
/* Update our data structures with the current GSM time */
diff --git a/src/common/nm_bts_fsm.c b/src/common/nm_bts_fsm.c
index 36aad73..5c4a534 100644
--- a/src/common/nm_bts_fsm.c
+++ b/src/common/nm_bts_fsm.c
@@ -63,6 +63,9 @@
bts->si_valid = 0;
bts->bsic_configured = false;
bts->bsic = 0xff; /* invalid value */
+ /* The PHY will (re)start its TDMA frame number on the next bring-up;
+ * treat the first PH-TIME.ind as the start of a fresh clock epoch. */
+ bts->gsm_time_valid = false;
TALLOC_FREE(bts->mo.nm_attr);
bts_cbch_reset(bts);
bts_asci_notification_reset(bts);
--
To view, visit https://gerrit.osmocom.org/c/osmo-bts/+/42854?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings?usp=email
Gerrit-MessageType: newchange
Gerrit-Project: osmo-bts
Gerrit-Branch: master
Gerrit-Change-Id: I7022b0ad084a0c224f7e8c04aca0648915b1a1c6
Gerrit-Change-Number: 42854
Gerrit-PatchSet: 1
Gerrit-Owner: fixeria <vyanitskiy(a)sysmocom.de>
Attention is currently required from: laforge, pespin.
Hello Jenkins Builder, laforge,
I'd like you to reexamine a change. Please visit
https://gerrit.osmocom.org/c/osmo-bts/+/42851?usp=email
to look at the new patch set (#2).
The following approvals got outdated and were removed:
Code-Review+1 by laforge, Verified+1 by Jenkins Builder
Change subject: osmo-bts-trx: shut down on stale clock indication from transceiver
......................................................................
osmo-bts-trx: shut down on stale clock indication from transceiver
We expect the transceiver to be a reliable, monotonic clock source.
If it reports an FN far behind our local timer (elapsed_fn < 0) while
far more wall-clock time elapsed than its FN advance accounts for,
its clock has likely stalled and the indication carries a stale frame
number. Acting on it drags the scheduler backwards and re-transmits
already-sent TDMA frames, corrupting lchan-internal state(s).
Detect this and shut down the process, same rationale as the existing
"PC clock skew too high" check in trx_fn_timer_cb().
Change-Id: If787ab7ed70aa2dcb0389ceb58620c2302c3431a
AI-Assisted: yes (Claude)
Related: OS#7020, OS#6794
---
M src/osmo-bts-trx/scheduler_trx.c
1 file changed, 16 insertions(+), 0 deletions(-)
git pull ssh://gerrit.osmocom.org:29418/osmo-bts refs/changes/51/42851/2
--
To view, visit https://gerrit.osmocom.org/c/osmo-bts/+/42851?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings?usp=email
Gerrit-MessageType: newpatchset
Gerrit-Project: osmo-bts
Gerrit-Branch: master
Gerrit-Change-Id: If787ab7ed70aa2dcb0389ceb58620c2302c3431a
Gerrit-Change-Number: 42851
Gerrit-PatchSet: 2
Gerrit-Owner: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: laforge <laforge(a)osmocom.org>
Gerrit-CC: pespin <pespin(a)sysmocom.de>
Gerrit-Attention: laforge <laforge(a)osmocom.org>
Gerrit-Attention: pespin <pespin(a)sysmocom.de>
Attention is currently required from: pespin.
fixeria has posted comments on this change by fixeria. ( https://gerrit.osmocom.org/c/osmo-bts/+/42851?usp=email )
Change subject: osmo-bts-trx: shut down on stale clock indication from transceiver
......................................................................
Patch Set 1:
(2 comments)
File src/osmo-bts-trx/scheduler_trx.c:
https://gerrit.osmocom.org/c/osmo-bts/+/42851/comment/8b794935_54048ba7?usp… :
PS1, Line 586: * (elapsed_fn < 0) while far more wall-clock time elapsed than its FN
> AFAIU "elapsed_fn < 0" is not a "FN far BEHIND our local timer", but a " FN BEHIND our local timer". […]
It's actually "far BEHIND", because:
a) the outer condition:
```
elapsed_fn > MAX_FN_SKEW || elapsed_fn < -MAX_FN_SKEW
```
b) the inner condition
```
error_us_since_clk > (int64_t)GSM_TDMA_FN_DURATION_uS * MAX_FN_SKEW
```
If it's slightly behind, we compensate for that by slowing down our own clock.
If it's far behind, we bail out with "TRX clock skew too high".
https://gerrit.osmocom.org/c/osmo-bts/+/42851/comment/1286a776_c735231e?usp… :
PS1, Line 597: osmo_timerfd_disable(&tcs->fn_timer_ofd);
> why do we need to disable this timerfd here?
Well, it's the established pattern here in this file:
```
/* in trx_fn_timer_cb() */
463 shutdown:
464 osmo_timerfd_disable(&tcs->fn_timer_ofd);
465 bts_shutdown(bts, reason);
466 return -1;
467 }
/* in trx_start_noclockind_to_cb */
477 osmo_fd_close(&tcs->fn_timer_ofd); /* Avoid being called again */
478 bts_shutdown(bts, "No clock since TRX was started");
479 return -1;
480 }
```
and this is exactly what we do in the case of a "PC clock skew too high" event.
By doing so, we immediately stop transmitting DL bursts because we're out of sync with the transceiver and there's simply no good reason to keep transmitting. IIRC, in the absence of Uplink bursts the transceiver (at least osmo-trx) fills the buffer with dummy bursts, so we should be fine during ramping down (not an abrupt power off).
--
To view, visit https://gerrit.osmocom.org/c/osmo-bts/+/42851?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings?usp=email
Gerrit-MessageType: comment
Gerrit-Project: osmo-bts
Gerrit-Branch: master
Gerrit-Change-Id: If787ab7ed70aa2dcb0389ceb58620c2302c3431a
Gerrit-Change-Number: 42851
Gerrit-PatchSet: 1
Gerrit-Owner: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: laforge <laforge(a)osmocom.org>
Gerrit-CC: pespin <pespin(a)sysmocom.de>
Gerrit-Attention: pespin <pespin(a)sysmocom.de>
Gerrit-Comment-Date: Sat, 20 Jun 2026 07:03:58 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: pespin <pespin(a)sysmocom.de>
Attention is currently required from: pespin.
fixeria has posted comments on this change by fixeria. ( https://gerrit.osmocom.org/c/osmo-bts/+/42850?usp=email )
Change subject: osmo-bts-trx: reset BTS GSM time on clock (re)start
......................................................................
Patch Set 1:
(1 comment)
File src/osmo-bts-trx/scheduler_trx.c:
https://gerrit.osmocom.org/c/osmo-bts/+/42850/comment/32d0a8b5_0c85840f?usp… :
PS1, Line 499: gsm_fn2gsmtime(&bts->gsm_time, 0);
> > Don't we have the same problem on other BTS models? […]
Actually, we could reset `bts->gsm_time` on receipt of `OPSTART` for `NM_OC_BTS`! And then it does not matter if the clock starts from 0 or not. The first clock indication should cause no "Invalid condition detected ..." error.
--
To view, visit https://gerrit.osmocom.org/c/osmo-bts/+/42850?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings?usp=email
Gerrit-MessageType: comment
Gerrit-Project: osmo-bts
Gerrit-Branch: master
Gerrit-Change-Id: Id8da126e460d3846a3be5bdb271553457fdd0590
Gerrit-Change-Number: 42850
Gerrit-PatchSet: 1
Gerrit-Owner: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: laforge <laforge(a)osmocom.org>
Gerrit-CC: pespin <pespin(a)sysmocom.de>
Gerrit-Attention: pespin <pespin(a)sysmocom.de>
Gerrit-Comment-Date: Fri, 19 Jun 2026 21:19:48 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: fixeria <vyanitskiy(a)sysmocom.de>
Comment-In-Reply-To: pespin <pespin(a)sysmocom.de>
Attention is currently required from: pespin.
fixeria has posted comments on this change by fixeria. ( https://gerrit.osmocom.org/c/osmo-bts/+/42850?usp=email )
Change subject: osmo-bts-trx: reset BTS GSM time on clock (re)start
......................................................................
Patch Set 1:
(1 comment)
File src/osmo-bts-trx/scheduler_trx.c:
https://gerrit.osmocom.org/c/osmo-bts/+/42850/comment/a40f2d0c_682b1c90?usp… :
PS1, Line 499: gsm_fn2gsmtime(&bts->gsm_time, 0);
> Don't we have the same problem on other BTS models?
Probably. But the problem is that only osmo-bts-trx has an explicit epoch hook `trx_sched_clock_started()` called on `POWERON`, while the others don't expose an equivalent "clock (re)started" signal. And honestly, I don't even know if other models reset their clock/time (TDMA Fn) after power-cycling the DSP.
> Shouldn't this be put in some more generic code path?
Generic path would likely be `l1sap_info_time_ind()` in `l1sap.c`. But at this level there exists no signal/primitive about the clock state. We don't even know if the L1 is powered on or not...
--
To view, visit https://gerrit.osmocom.org/c/osmo-bts/+/42850?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings?usp=email
Gerrit-MessageType: comment
Gerrit-Project: osmo-bts
Gerrit-Branch: master
Gerrit-Change-Id: Id8da126e460d3846a3be5bdb271553457fdd0590
Gerrit-Change-Number: 42850
Gerrit-PatchSet: 1
Gerrit-Owner: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: laforge <laforge(a)osmocom.org>
Gerrit-CC: pespin <pespin(a)sysmocom.de>
Gerrit-Attention: pespin <pespin(a)sysmocom.de>
Gerrit-Comment-Date: Fri, 19 Jun 2026 21:10:57 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: pespin <pespin(a)sysmocom.de>
Attention is currently required from: fixeria.
pespin has posted comments on this change by fixeria. ( https://gerrit.osmocom.org/c/osmo-pcap/+/42838?usp=email )
Change subject: server: fix misleading data length validation log message
......................................................................
Patch Set 1:
(1 comment)
Commit Message:
https://gerrit.osmocom.org/c/osmo-pcap/+/42838/comment/28af5fd6_6757c58b?us… :
PS1, Line 14: Co-Authored-By: Claude Opus 4.8 (1M context) <noreply(a)anthropic.com>
> It's not about advertising specific tools I used while writing these patches. […]
Unless specific information is provided to showcase how exactly to reproduce finding the bug with such a tool in a deterministic way, I see no point in stating that kind of partial information which really only shows up as a free advertisement of a tool.
To me it's like adding on each commit: "I use Arch btw (TM)" because I found the bug using archlinux.
In any case, I'm not aware of any such requirements in this repo, the same way we don't require nor want a Signed-off-by in our commits. So I don't really see why should now adding a new line on each commit which really provides no direct use.
--
To view, visit https://gerrit.osmocom.org/c/osmo-pcap/+/42838?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings?usp=email
Gerrit-MessageType: comment
Gerrit-Project: osmo-pcap
Gerrit-Branch: master
Gerrit-Change-Id: Ice95c1f8ad1aa8de259364bd70eba0db8918b19e
Gerrit-Change-Number: 42838
Gerrit-PatchSet: 1
Gerrit-Owner: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: laforge <laforge(a)osmocom.org>
Gerrit-CC: pespin <pespin(a)sysmocom.de>
Gerrit-Attention: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Comment-Date: Fri, 19 Jun 2026 20:52:19 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: fixeria <vyanitskiy(a)sysmocom.de>
Comment-In-Reply-To: pespin <pespin(a)sysmocom.de>