Approach to system testing for Osmocom stack

historical

Hi Holger,

On Wed, Jan 10, 2018 at 12:24:47AM +0000, Holger Freyther wrote:
> the lua binding code was added to be able to automate OpenBSC tests.
> In theory we should be able to do this for SMS and UpdateLocation
> (call handling with MNCC exposing is left as a todo) but in practice
> we miss a piece of software to coordinate this and run the test. We
> miss it because it is an interesting problem but also I lost time on
> switching countries, learning new tricks at a project...

Sure, I understand.  However, it is definitely a part that we're very
much looking forward to having :)

> The basic testing structure looks easy as well. We want to define the
> number of concurrent subscribers (0, 10, 100, 1000, n) and to make it
> simple a single test (UL, send SMS, t) and execute the same test for
> each subscriber and call it a success if y% of tests succeed within
> time T. The way to measure this is easy as well. The lua script would
> print some data (e.g. the name of the ms) when it starts and
> completes.

One might also think of a more structured format to return the data, but
that could always be added later.  One could e.g. print an XML or JSON
snippet that's easier to parse/consume by whoever processes it.
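
To illustrate the structured-output idea, here is a minimal sketch of
the consuming side in python.  It assumes each script prints one JSON
object per line; the field names "ms", "event" and "time" are made up
for this example and are not an existing format.  It then applies the
"y% of tests succeed within time T" criterion from above.

```python
import json

def summarize(lines, deadline_s, required_ratio):
    """Parse one JSON event per line and decide whether the run passed.

    The event layout (keys "ms", "event", "time") is hypothetical.
    """
    started, completed = {}, {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        ev = json.loads(line)
        if ev["event"] == "start":
            started[ev["ms"]] = ev["time"]
        elif ev["event"] == "complete":
            completed[ev["ms"]] = ev["time"]
    # An MS counts as successful only if it completed within the deadline.
    in_time = sum(
        1 for ms, t0 in started.items()
        if ms in completed and completed[ms] - t0 <= deadline_s
    )
    ratio = in_time / len(started) if started else 0.0
    return {
        "started": len(started),
        "in_time": in_time,
        "ratio": ratio,
        "passed": ratio >= required_ratio,
    }

# Hypothetical sample output of three MS: one fast, one slow, one stuck.
sample = [
    '{"ms": "ms0", "event": "start", "time": 0.0}',
    '{"ms": "ms1", "event": "start", "time": 0.1}',
    '{"ms": "ms2", "event": "start", "time": 0.2}',
    '{"ms": "ms0", "event": "complete", "time": 2.5}',
    '{"ms": "ms1", "event": "complete", "time": 9.0}',
]
result = summarize(sample, deadline_s=5.0, required_ratio=0.3)
```

One object per line keeps the producer trivial (a single print) and
lets the consumer process the files line by line after the run.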

What I also believe is very important is some kind of rate limiting /
staggering when starting up.  We know a single-BTS setup will for sure
fail lots of LU if you start 1k MS at the same time.  So there should be
some kind of provision to say something like "start 1000 MS at a rate of
10 per second".  I wouldn't go for more elaborate schemes, but simply a
single linear rate/slope.
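
A single linear rate could be sketched like this in python.  The
function names and the launch() callback are assumptions for the sake
of the example; launch(i) stands for whatever actually starts MS
number i.

```python
import time

def start_offsets(total, rate_per_s):
    """Return the offset (in seconds) at which each MS should start."""
    if rate_per_s <= 0:
        raise ValueError("rate_per_s must be positive")
    return [i / rate_per_s for i in range(total)]

def staggered_start(total, rate_per_s, launch,
                    clock=time.monotonic, sleep=time.sleep):
    """Call launch(i) for every MS, following the linear schedule.

    Sleeping until an absolute target time (instead of a fixed interval
    after each launch) keeps the schedule from drifting when launch()
    itself takes time.
    """
    t0 = clock()
    for i, offset in enumerate(start_offsets(total, rate_per_s)):
        delay = t0 + offset - clock()
        if delay > 0:
            sleep(delay)
        launch(i)

# "start 1000 MS at a rate of 10 per second" spreads over ~100 seconds.
offsets = start_offsets(1000, 10)
```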

> I am not sure if I should spawn, configure, add subscribers, a flavor
> of Osmocom cellular? I look into having some set of templates for the
> config, the stack to launch and in concept it looks awfully similar to
> something the GSM tester is doing. Shall we leave virtbts/cellular to
> the Osmocom tester and just focus on coordinating mobile? My feeling
> is to leave this to the Osmo GSM tester.

Yes, I think it's ok to focus on the "tester" side and not on the IUT
(implementation under test) side.  So we assume that the user will
somehow bring up the [virtual] cellular network before executing the
load test.  One preferred way of doing this is - I agree - by reusing
those parts from osmo-gsm-tester.

> If we have n subscribers I would launch m copies of "mobile" (but run
> multiple MS in a single binary). 

I would argue the number of MS per 'mobile' should be configurable from
1-N.
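
As a small sketch of what "configurable" could mean on the tester side:
split the MS indices into groups, one group per process.  The function
name is hypothetical.

```python
def partition_ms(total_ms, ms_per_process):
    """Split MS indices 0..total_ms-1 into one range per process."""
    if ms_per_process < 1:
        raise ValueError("ms_per_process must be at least 1")
    return [
        range(first, min(first + ms_per_process, total_ms))
        for first in range(0, total_ms, ms_per_process)
    ]

groups = partition_ms(10000, 4)
```

The last group is simply shorter when the total is not a multiple of
the group size.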

> So with 4 MS per mobile process and 10k subs we would end with 2.5k
> processes + many log messages coming from each. 

The question is how many of those log messages do we need/want.  In
order to avoid the risk of 'mobile' blocking on writing to
stdout/stderr, I think it would be best not to pipe that into other
processes but write to files (could even be tmpfs!) and process the
files after the run?
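
A sketch of the "write to files, process afterwards" approach with the
python 'subprocess' module.  The helper name and the log file naming
are assumptions; the commands passed in stand for the real 'mobile'
invocations.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def run_with_log_files(commands, log_dir):
    """Start every command with stdout/stderr going to its own file.

    No pipes are involved, so a child can never block because the
    parent is too slow to read its output.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    procs = []
    for idx, cmd in enumerate(commands):
        log_path = log_dir / f"proc-{idx}.log"
        with open(log_path, "wb") as log:
            # The child inherits the file descriptor; the parent's copy
            # is closed again as soon as the 'with' block ends.
            proc = subprocess.Popen(cmd, stdin=subprocess.DEVNULL,
                                    stdout=log, stderr=subprocess.STDOUT)
        procs.append((proc, log_path))
    results = []
    for proc, log_path in procs:
        rc = proc.wait()
        # Post-processing happens only after the process has finished.
        results.append((rc, log_path.read_text()))
    return results

# Stand-in children: tiny python one-liners instead of 'mobile'.
commands = [[sys.executable, "-c", f"print('hello from {i}')"]
            for i in range(3)]
with tempfile.TemporaryDirectory() as tmp:
    results = run_with_log_files(commands, tmp)
```

Pointing log_dir at a tmpfs mount would keep the log writes off the
disk entirely.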

> Would that scale with python? Should we look into doing this one in
> Go? 

> Or can some of GSM tester be used (the template part)?  

I'm not sufficiently familiar with osmo-gsm-tester to say if we can use
it.  On an abstract level, I would think the "defining resources and
generating configuration files" part should be reusable, but then it
also just uses (jinja2?) templates that anyone can use in python.  And
whether it's sufficiently scalable to generate thousands of config
files, I don't know either.
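
To give a feeling for the templating part, here is a sketch using
string.Template from the python standard library as a stand-in for
jinja2, so that it runs without extra packages.  The config layout and
the IMSI numbering are invented placeholders, not the real 'mobile'
configuration syntax.

```python
from string import Template

# Placeholder layout only; NOT the actual 'mobile' config format.
CONFIG_TEMPLATE = Template(
    "ms $ms_index\n"
    " imsi $imsi\n"
)

def render_configs(count, base_imsi=1010000000000):
    """Render one config snippet per MS with a consecutive 15-digit IMSI."""
    return [
        CONFIG_TEMPLATE.substitute(ms_index=i, imsi=f"{base_imsi + i:015d}")
        for i in range(count)
    ]

configs = render_configs(2500)
```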

> I would probably design this concurrently with Go(besides being the
> first).

I would suggest we do not further increase the number of programming
languages one needs to understand.  But then, it's "just" a tool for
load testing, so probably not that critical after all.

My naive assumption would be that starting 2.5k processes (and
processing the SIGCHLD) from python should be possible without causing
a performance/scalability problem? As indicated, log file processing
could be handled later, or one could configure stdio logging to be
absolutely minimal (with verbose logs going to files)?

My attached test program (not using python 'subprocess' as I couldn't find
a way to make it do non-blocking wait for the child to terminate) runs
perfectly fine here.  Even without any rate limiting, I get the
following on my laptop:

$ time ./subproc.py
2018-01-10 12:44:14,811 INFO     Beginning starting of processes
2018-01-10 12:44:15,603 INFO     Started 2500 processes
2018-01-10 12:44:18,607 INFO     Waited for all processes
./subproc.py  2.74s user 1.46s system 108% cpu 3.881 total

So 2500 processes could be forked in less than one second, and the
starting/reaping in python needed only very few seconds of system time -
compared with the amount of resources required to run the 'mobile'
programs including the GSMTAP socket traffic etc. for sure negligible?

Now of course '/bin/sleep' is a much simpler program to start, but the
overhead of the python "orchestration" doesn't change with the resource
footprint of the program started.
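
For readers without the attachment, the following is a rough sketch of
such an experiment.  It is not the attached subproc.py: it uses the
'subprocess' module and Popen.poll(), which returns immediately
instead of blocking, as one possible way to reap children.  The
function name and the returned dictionary are assumptions.

```python
import subprocess
import sys
import time

def spawn_and_reap(cmd, count, poll_interval_s=0.05):
    """Start `count` copies of `cmd`, then reap them without blocking."""
    t0 = time.monotonic()
    running = [
        subprocess.Popen(cmd, stdin=subprocess.DEVNULL,
                         stdout=subprocess.DEVNULL,
                         stderr=subprocess.DEVNULL)
        for _ in range(count)
    ]
    t_started = time.monotonic()
    exit_codes = []
    while running:
        still_running = []
        for proc in running:
            rc = proc.poll()  # None while the child is still alive
            if rc is None:
                still_running.append(proc)
            else:
                exit_codes.append(rc)
        running = still_running
        if running:
            time.sleep(poll_interval_s)
    return {
        "spawn_s": t_started - t0,
        "total_s": time.monotonic() - t0,
        "exit_codes": exit_codes,
    }

# Small stand-in run; the child is a python one-liner that exits at once.
stats = spawn_and_reap([sys.executable, "-c", "pass"], 20)
```

Calling spawn_and_reap(["/bin/sleep", "3"], 2500) would be closer to
the run shown above, provided the local process limits allow that many
children.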

Just my thoughts, as usual.  The decision is yours...

-- 
- Harald Welte <laforge at gnumonks.org>           http://laforge.gnumonks.org/
============================================================================
"Privacy in residential applications is a desirable marketing option."
                                                  (ETSI EN 300 175-7 Ch. A6)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: subproc.py
Type: text/x-python
Size: 979 bytes
Desc: not available
URL: <http://lists.osmocom.org/pipermail/baseband-devel/attachments/20180110/97c68b55/attachment.py>