Hi!
I've been working on a patch for OsmoBSC which generates CTRL traps from
OML Failure Event Reports as received from BTSs. This is quite obvious,
of course we want failures to generate traps, right?
The problem starts with the fact that Failure Event Reports can contain
arbitrary strings ("Additional Text IE"), and that such text of course
can contain spaces.
My current patch (https://gerrit.osmocom.org/#/c/osmo-bsc/+/14177)
would produce TRAPs like this:
b'TRAP 0 bts.0.oml_failure_report "processing failure","failure ceased","TC_pcu_oml_alert"'
[where the b'' part is from python as I used osmo_ctrl.py to print the above]
As we never formally specified anything about the CTRL protocol apart from
using ',' as a separator between fields, this is a somewhat grey area.
What do you guys think?
Do you know of existing CTRL interface usage where spaces are present
in the 'value' part of the message?
Do you know of existing code that would break should we introduce support
for quoted strings with spaces inside the 'value' part?
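To make the quoting question more concrete, here is a minimal sketch of how a receiver could cope with such a TRAP. It is not existing Osmocom code: the function names ctrl_trap_split() and ctrl_split_quoted() are hypothetical, and the rules (split the line at the first three spaces only, treat commas inside double quotes as data, no escaping of embedded quotes) are assumptions, since none of this is formally specified.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split "TRAP <id> <variable> <value>" at the first
 * three spaces only, so that spaces inside <value> survive.
 * Modifies line in place. Returns 0 on success, -1 on malformed input. */
int ctrl_trap_split(char *line, char **id, char **var, char **value)
{
	char *sp1, *sp2, *sp3;

	if (strncmp(line, "TRAP ", 5) != 0)
		return -1;
	sp1 = line + 4;			/* space after "TRAP" */
	sp2 = strchr(sp1 + 1, ' ');	/* space after <id> */
	if (!sp2)
		return -1;
	sp3 = strchr(sp2 + 1, ' ');	/* space after <variable> */
	if (!sp3)
		return -1;
	*sp2 = '\0';
	*sp3 = '\0';
	*id = sp1 + 1;
	*var = sp2 + 1;
	*value = sp3 + 1;
	return 0;
}

/* Hypothetical helper: split a value on ',' while treating commas inside
 * double quotes as data. Quotes are stripped, the string is modified in
 * place. Returns the number of fields, or -1 on an unterminated quote,
 * garbage after a closing quote, or too many fields. */
int ctrl_split_quoted(char *value, char **out, size_t max_fields)
{
	size_t n = 0;
	char *p = value;

	while (1) {
		char *end;

		if (n >= max_fields)
			return -1;
		if (*p == '"') {
			end = strchr(p + 1, '"');
			if (!end)
				return -1;
			*end = '\0';
			out[n++] = p + 1;
			p = end + 1;
			if (*p == '\0')
				break;
			if (*p != ',')
				return -1;
			p++;
		} else {
			end = strchr(p, ',');
			out[n++] = p;
			if (!end)
				break;
			*end = '\0';
			p = end + 1;
		}
	}
	return (int)n;
}
```

Any existing receiver that tokenizes the whole line on spaces would instead cut the value at "processing", which is exactly the kind of breakage asked about above.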
Regards,
Harald
--
- Harald Welte <laforge(a)gnumonks.org> http://laforge.gnumonks.org/
============================================================================
"Privacy in residential applications is a desirable marketing option."
(ETSI EN 300 175-7 Ch. A6)
Dear fellow Osmocom developers,
I just would like to use this as a gentle reminder that we shouldn't be
merging new features without having automatic tests for them in place.
What comes to my mind immediately right now (but I'm sure there are other
examples) is the "subscriber create on demand" feature. I couldn't find
any tests in HLR_Tests.ttcn about this feature.
What this means is, that it will *eventually* break unnoticed, particularly
until somebody is using that feature and updating frequently to master,
which is unlikely to happen. Either you run a production system and then
you don't follow master, or you're just playing around and then that
feature is probably not something you'd be using in many cases.
What actually bugs me most about it, is that the tests should have been
written first. For most of our development [1], the existing infrastructure
in terms of TTCN3 Modules is very strong, and adding related tests is
rather quick. This means it's *very* feasible to write the tests first
and then the actual code. In fact, my own experience shows that development
is much faster this way, as manual testing of the entire stack with phones
is sloooow.
This is not to complain to Oliver or any single person, but just a general
reminder to all of us. That includes first and foremost myself as I'm
merging most of the development, but is addressed to all of the developers
and reviewers: IF something doesn't have a test, but could reasonably be
tested without spending a multiple of the development time on tests, we
should mandate that tests are available at the time of merge.
We of course cannot mandate or enforce that developers write the tests first
and follow test-driven development methodology. But I would at least strongly
encourage everyone to try that. If you first spend tons of time with manual
testing and then write a test case, it's much less efficient as you spend time
for both. OTOH, if you first invest the time into writing the test, then
the development can make very quick progress and you save a lot of time
that would otherwise be wasted with manual testing.
Regards,
Harald
[1] I'm referring to rather self-contained, small features. For sure,
e.g. testing inter-MSC-HO requires lots of new testing infrastructure to be
developed, as does testing of the PCU. So yes, there are exceptions.
--
- Harald Welte <hwelte(a)sysmocom.de> http://www.sysmocom.de/
=======================================================================
* sysmocom - systems for mobile communications GmbH
* Alt-Moabit 93
* 10559 Berlin, Germany
* Sitz / Registered office: Berlin, HRB 134158 B
* Geschaeftsfuehrer / Managing Director: Harald Welte
Hi Neels,
in the following commit:
commit 89991fdb7c01fa42e323577b4026985e580763cf
Author: Neels Hofmeyr <neels(a)hofmeyr.de>
Date: Mon Jan 28 19:06:53 2019 +0100
you introduce language about restricting the timeout to a signed 32bit value,
as time_t is not well-defined on 32bit systems.
What I'm somehow missing is where we are using time_t in this context? Neither
osmo_fsm code nor the underlying osmo_timer_list seems to be using time_t.
So why would we bother about time_t here?
Thanks for sharing your thoughts.
Regards,
Harald
Hi all!
As some of you know, I'm currently using libosmocore, and specifically
osmo_fsm inside some cortex-m microcontroller projects. One of the
features I need there: FSM timeouts below 1s.
osmo_fsm uses osmo_timer_list as underlying timer, and that timer can
express any timeval (seconds + microseconds) as timeout. Only the osmo_fsm
API doesn't expose that part.
What I could now do: Simply add osmo_fsm_inst_state_chg2 which takes one
more argument for microseconds. However, I find that "second, microsecond"
style with two arguments everywhere quite clumsy.
So what I'm instead suggesting is to add a new API that uses one single timeout
value (like the current API), but specifies the timeout in milliseconds. The
old API then becomes a wrapper around the new API, simply multiplying timeouts
by a factor of 1000.
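As a rough sketch of that wrapper idea (not the actual libosmocore code): the struct and the demo_* function names below are hypothetical stand-ins, and the overflow check is an assumption about how one might guard the multiplication. The last helper shows how a millisecond value maps onto the seconds + microseconds timeval that the underlying timer takes.

```c
#include <limits.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical stand-in for an FSM instance; not the libosmocore struct. */
struct demo_fsm_inst {
	uint32_t state;
	unsigned long timeout_ms;	/* 0 = no timeout armed */
	int T;
};

/* Hypothetical new API: one single timeout argument, in milliseconds. */
int demo_fsm_state_chg_ms(struct demo_fsm_inst *fi, uint32_t new_state,
			  unsigned long timeout_ms, int T)
{
	fi->state = new_state;
	fi->timeout_ms = timeout_ms;
	fi->T = T;
	return 0;
}

/* The seconds-based API becomes a thin wrapper around the new one. */
int demo_fsm_state_chg(struct demo_fsm_inst *fi, uint32_t new_state,
		       unsigned long timeout_secs, int T)
{
	/* Assumption of this sketch: refuse values that would overflow
	 * when scaled to milliseconds. */
	if (timeout_secs > ULONG_MAX / 1000)
		return -1;
	return demo_fsm_state_chg_ms(fi, new_state, timeout_secs * 1000, T);
}

/* Convert milliseconds to the timeval a timer would be armed with. */
struct timeval demo_ms_to_timeval(unsigned long ms)
{
	struct timeval tv;

	tv.tv_sec = ms / 1000;
	tv.tv_usec = (ms % 1000) * 1000;
	return tv;
}
```

Existing callers keep passing seconds unchanged, while new callers can ask for e.g. 250 ms through the millisecond variant.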
Does anyone think this is too restrictive? I currently cannot think of use
cases where timeouts below 1ms or with granularity below 1ms matter *and*
where one would want to use osmo-fsm. But given how speeds of systems (both
processors and communications systems) are increasing, it might be that we'd
eventually need that? I'd currently assume osmo_fsm with all of its internal
logging, etc. is too heavy-weight for such super-time-constrained use cases.
And in terms of value range: Assuming a 32bit architecture, the scale of
a 2^32 millisecond value is sufficient to express timeouts up to 1193
hours, which in turn is about 49 days.
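For reference, the 1193 hours / 49 days figures correspond to a 32bit unsigned count of milliseconds; the same counter in microseconds would wrap after roughly 71 minutes. A small sketch of that arithmetic (the function name is made up for illustration):

```c
#include <stdint.h>

/* Largest timeout, in whole seconds, that a 32bit unsigned counter can
 * express when it counts ticks_per_sec ticks per second. */
uint32_t max_timeout_secs(uint32_t ticks_per_sec)
{
	return UINT32_MAX / ticks_per_sec;
}
```

With ticks_per_sec = 1000 (milliseconds) this yields 4294967 seconds, i.e. 1193 whole hours or 49 whole days; with ticks_per_sec = 1000000 (microseconds) it yields 4294 seconds, i.e. about 71 minutes.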
Any comments, ideas, thoughts?
Regards,
Harald
Hi all,
Could anyone advise me which solution I should look at for feeding the
LimeSDR mini with an external, GPSDO-locked clock?
Btw, which do you think is the better option for a home-brew BTS: the
LimeSDR mini with external clock plus an Intel mini-PC, or the LimeNet mini?
Cheers,
Rafael Diniz
Looking at https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bsc-test-sccplite/…
the results look quite horrible since build 357.
However, with my own manual tests, all of those pass.
It looks like most of those failures are sporadic, and I cannot reproduce them.
I am not sure how to find out what is going on there, I can just say that osmo-bsc
looks quite stable AFAICT and that it seems to be non-determinism / timing in
the ttcn3 and/or system load causing the failures.
~N
I think here is a bug:
char *osmo_quote_str_c(const void *ctx, const char *str, int in_len)
{
	char *buf = talloc_size(ctx, OSMO_MAX(in_len+2, 32));
	if (!buf)
		return NULL;
	return osmo_quote_str_buf2(buf, 32, str, in_len);
}
We may allocate more than 32 bytes (see OSMO_MAX()) but still allow writing
only 32 bytes?
Looks like the allocated len should be stored in a local variable to pass to
osmo_quote_str_buf2().
And if I'm right, what is the 32 for? At least 32??
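A rough sketch of the "store the allocated length in a local variable" idea, written so that it compiles on its own: malloc() stands in for talloc_size(), demo_quote_str_buf() is a simplified stand-in for osmo_quote_str_buf2() (no escaping), and both demo_* names are hypothetical. Treating a negative in_len as "use strlen()" and reserving len + 3 bytes (two quotes plus the terminating NUL) are assumptions of this sketch, not a statement about what the real fix should look like.

```c
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Simplified stand-in for osmo_quote_str_buf2(): writes "str" in double
 * quotes into buf, truncating so that at most bufsize bytes (including
 * the terminating NUL) are written. */
char *demo_quote_str_buf(char *buf, size_t bufsize, const char *str, int in_len)
{
	size_t len = in_len < 0 ? strlen(str) : (size_t)in_len;
	size_t pos = 0;
	size_t i;

	if (bufsize == 0)
		return buf;
	if (pos + 1 < bufsize)
		buf[pos++] = '"';
	for (i = 0; i < len && pos + 1 < bufsize; i++)
		buf[pos++] = str[i];
	if (pos + 1 < bufsize)
		buf[pos++] = '"';
	buf[pos] = '\0';
	return buf;
}

/* The allocating variant keeps the allocated size in a local variable and
 * passes that same size on, instead of a hard-coded 32. */
char *demo_quote_str_alloc(const char *str, int in_len)
{
	size_t len = in_len < 0 ? strlen(str) : (size_t)in_len;
	size_t bufsize = DEMO_MAX(len + 3, 32);
	char *buf = malloc(bufsize);

	if (!buf)
		return NULL;
	return demo_quote_str_buf(buf, bufsize, str, in_len);
}
```

With the size passed through like this, a 40 character input comes back fully quoted instead of being cut off at 32 bytes.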
~N
Dear all,
as you can see at
https://jenkins.osmocom.org/jenkins/job/ttcn3-bsc-test-sccplite/test_result…
there has been a large increase in test failures of the SCCPlite related tests in
osmo-bsc over the last two builds/tests.
Does anyone know what kind of changes were made which could have impacted the
related behavior?
We don't see any such failures on AoIP, leading me to suspect that some changes
were tested only for AoIP but not for SCCPlite?
Regards,
Harald
Dear Osmocom community,
A question about this part of code - function sgsn_ggsn_ctx_drop_pdp:
http://git.osmocom.org/osmo-sgsn/tree/src/gprs/gprs_sgsn.c#n720
The second branch of the condition (hard-dropping) is called even when the phone
is registered, and hence no Deactivate PDP Context Request is sent to the phone.
Due to that, the phone doesn't know that the PDP Context was deleted on the
network side and keeps acting as if it is still active -> the PS isn't working
when this happens.
Any suggestions/thoughts on how this can be fixed?
Thanks
Kind regards,
Mykola