Attention is currently required from: neels, pespin. iedemam has posted comments on this change. ( https://gerrit.osmocom.org/c/osmo-bsc/+/28165 )
Change subject: stats: new trackers for lchan life duration (v2) ......................................................................
Patch Set 3:
(3 comments)
File src/osmo-bsc/bts.c:
https://gerrit.osmocom.org/c/osmo-bsc/+/28165/comment/0946423e_ad016c65 PS1, Line 772: timespecsub(&now, &lchan->active_stored, &elapsed);
I believe more checks are needed. […]
Done
File src/osmo-bsc/lchan_fsm.c:
https://gerrit.osmocom.org/c/osmo-bsc/+/28165/comment/fa6adfdd_679bed7a PS3, Line 205: lchan->activate.concluded = true;
concluded is set to true here at the same time that active_stored is set below, so I don't see how t […]
It is indeed being set along another code path in _lchan_on_activation_failure(). So instead of me testing that flag, I'd rather test the state of the timestamp itself. It avoids reading uninitialized timestamps and, as you say, encountering a situation where they will never ever be set.
https://gerrit.osmocom.org/c/osmo-bsc/+/28165/comment/201175e0_57b88dc2 PS3, Line 519: .active_start = { .tv_sec = 0, .tv_nsec = 0 },
That's not needed, it will be erased by default.
The purpose of this was to initialize the timestamps to zero values so I can tell when they've been set to actual timestamp values. Because lchan_reset() is called from lchan_fsm_alloc() I thought it was an appropriate place to set such initial state. Maybe it's not?
This is where the wild ms duration values stemmed from. An initialized timestamp full of garbage was interpreted actual .sec and .nsec values resulting in the gigantic duration leaps.
I could add another flag like active.concluded which is only triggered on a single code path...but figured proper initialization would be cleaner.