Change in osmo-bsc[master]: stats: add BTS uptime counter
gerrit-no-reply at lists.osmocom.org
Fri Apr 30 17:41:33 UTC 2021
pespin has posted comments on this change. ( https://gerrit.osmocom.org/c/osmo-bsc/+/23234 )
Change subject: stats: add BTS uptime counter
Patch Set 5:
PS4, Line 586: int downtime_seconds = BTS_DOWNTIME_SAMPLE_INTERVAL - uptime_seconds;
> Let's back up a bit maybe. […]
"Downtime is added to the stat_item. When the statsd system exports these values every X seconds, we have between 0 and X seconds of downtime in that period. Sum up all these periods and you can see total downtime for each BTS during any given timeframe."
OK, so that's the intention of the new version of the patch. But from what I understand reading the code, the first time this function runs it will indeed set the stat to something between 0 and BTS_DOWNTIME_SAMPLE_INTERVAL, but the next time it is called the item will be set to 0 again. So from Grafana or whatever you use, you'll only be able to see the first BTS_DOWNTIME_SAMPLE_INTERVAL of downtime, AFAIU.
Now, a proposal from my side would be:
What about having two stats, "become_up" and "become_down" (feel free to rename them, I named them like this just to get the idea across), which count the down->up and up->down transitions?
That means at any point in time you can know whether the BTS is up or down (if become_up > become_down it's up, otherwise it's down). You can also track uptime/downtime periods by checking the timestamp at which each stat changed value. You can then easily see when events happened, in a plot like Grafana's or using Python scripts.
1- BSC starts: become_up=0, become_down=0
... a few seconds pass ...
2- BTS connects: become_up=1, become_down=0
... a few hours pass ...
3- BTS disconnects: become_up=1, become_down=1
... instantaneously or even a few hours later ...
4- BTS connects: become_up=2, become_down=1
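To make the idea concrete, here is a minimal sketch of the two counters in plain C. It deliberately does not use the osmo-bsc stats or rate counter API; the struct and function names (bts_link_counters, bts_link_up, bts_link_down, bts_is_up, replay_timeline) are hypothetical and only follow the placeholder names above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-BTS counters, named after the placeholders in the proposal. */
struct bts_link_counters {
	unsigned int become_up;   /* number of down->up transitions */
	unsigned int become_down; /* number of up->down transitions */
};

/* Call when the BTS connects. */
void bts_link_up(struct bts_link_counters *c)
{
	c->become_up++;
}

/* Call when the BTS disconnects. */
void bts_link_down(struct bts_link_counters *c)
{
	c->become_down++;
}

/* The BTS is up iff it came up more often than it went down. */
bool bts_is_up(const struct bts_link_counters *c)
{
	return c->become_up > c->become_down;
}

/* Replays steps 1-4 of the timeline above and checks the derived state. */
void replay_timeline(struct bts_link_counters *c)
{
	c->become_up = 0;
	c->become_down = 0;      /* 1- BSC starts */
	assert(!bts_is_up(c));

	bts_link_up(c);          /* 2- BTS connects */
	assert(bts_is_up(c));

	bts_link_down(c);        /* 3- BTS disconnects */
	assert(!bts_is_up(c));

	bts_link_up(c);          /* 4- BTS connects */
	assert(bts_is_up(c));
}
```

Both values only ever increase, so an exporter that samples them every few seconds cannot miss a transition, even if the BTS goes down and comes back within one reporting interval.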
What do you think? Does this proposal fulfill your needs? AFAIU it does fulfill "currently there is no way determine a BTS uptime other than by polling it via the VTY".
Do I understand correctly that your goal is to put those stats in some persistent/temporary database so you can plot them and find out what's going on over time? Then my proposal would work, AFAIU.
The only inexact value would be the time at which each event happened: the error is bounded by the interval at which you configured osmo-bsc to push stats updates, which is usually pretty low, on the order of seconds. Even then, no events are lost; only the timing is a few seconds off.
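As a rough illustration of the consumer side, the sketch below estimates total downtime from periodically exported samples of the two counters. Everything here is an assumption for illustration: the stat_sample layout, the estimate_downtime name, and the choice to date each transition to the sample in which it was first observed. It is not how osmo-bsc or any particular statsd backend stores data.

```c
#include <stddef.h>

/* Hypothetical exported sample: counter values as seen at time t. */
struct stat_sample {
	long t;                   /* sample time in seconds */
	unsigned int become_up;   /* down->up transitions so far */
	unsigned int become_down; /* up->down transitions so far */
};

/*
 * Estimate downtime in seconds over n samples ordered by time.
 * The state seen at sample i-1 is assumed to hold until sample i, so
 * each transition is dated to the sample where it first shows up. The
 * error per transition is at most one reporting interval.
 * Note: before the first connect (0/0) the BTS counts as down.
 */
long estimate_downtime(const struct stat_sample *s, size_t n)
{
	long down = 0;

	for (size_t i = 1; i < n; i++) {
		int was_up = s[i - 1].become_up > s[i - 1].become_down;

		if (!was_up)
			down += s[i].t - s[i - 1].t;
	}
	return down;
}
```

If a BTS drops and reconnects within a single reporting interval, both counters advance by one, so the event is still visible even though this estimate attributes no downtime to it.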
To view, visit https://gerrit.osmocom.org/c/osmo-bsc/+/23234
To unsubscribe, or for help writing mail filters, visit https://gerrit.osmocom.org/settings
Gerrit-Owner: iedemam <michael at kapsulate.com>
Gerrit-Assignee: daniel <dwillmann at sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: daniel <dwillmann at sysmocom.de>
Gerrit-Reviewer: laforge <laforge at osmocom.org>
Gerrit-Reviewer: pespin <pespin at sysmocom.de>
Gerrit-CC: dexter <pmaier at sysmocom.de>
Gerrit-Comment-Date: Fri, 30 Apr 2021 17:41:33 +0000
Comment-In-Reply-To: iedemam <michael at kapsulate.com>
Comment-In-Reply-To: pespin <pespin at sysmocom.de>