Hi all,
master-libosmocore was running for 20 hours on jenkins, hanging here:
> make[7]: Entering directory '/home/osmocom-build/jenkins/workspace/master-libosmocore/a2/default/a3/default/a4/default/arch/amd64/label/osmocom-master-debian9/builddir/tests'
> osmo_verify_transcript_vty.py -v \
> -p 42042 \
> -r "../tests/tdef/tdef_vty_test_config_root" \
> /home/osmocom-build/jenkins/workspace/master-libosmocore/a2/default/a3/default/a4/default/arch/amd64/label/osmocom-master-debian9/tests/tdef/tdef_vty_test_config_root.vty
> <0000> /home/osmocom-build/jenkins/workspace/master-libosmocore/a2/default/a3/default/a4/default/arch/amd64/label/osmocom-master-debian9/src/socket.c:367 unable to bind socket:127.0.0.1:42042: Address already in use
> <0000> /home/osmocom-build/jenkins/workspace/master-libosmocore/a2/default/a3/default/a4/default/arch/amd64/label/osmocom-master-debian9/src/socket.c:378 no suitable addr found for: 127.0.0.1:42042
> <0000> /home/osmocom-build/jenkins/workspace/master-libosmocore/a2/default/a3/default/a4/default/arch/amd64/label/osmocom-master-debian9/src/vty/telnet_interface.c:100 Cannot bind telnet at 127.0.0.1 42042
https://jenkins.osmocom.org/jenkins/job/master-libosmocore/a2=default,a3=de…
I've stopped the job. Right after that, a new job spawned, and it also
failed to bind telnet at 42042. It did not hang this time, but stopped
there instead:
https://jenkins.osmocom.org/jenkins/job/master-libosmocore/a2=default,a3=de…
After triggering the job once more manually, it went through.
From a quick analysis, I cannot see why this has happened in the first
place. The master-libosmocore job is set to non-concurrent:
https://git.osmocom.org/osmo-ci/tree/jobs/master-builds.yml
And the gerrit verification jobs are running on another machine. Other
than that, none but libosmocore.git of the (almost all) Osmocom
repositories that I have checked out mention port 42042, so nothing else
should bind that in theory.
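
One way to narrow this down on the build slave would be to probe the port
right before the test starts. The following is only a minimal sketch using
plain POSIX sockets; the helper name loopback_port_is_free() is made up for
illustration and is not part of libosmocore. It tries to bind a TCP socket to
127.0.0.1 on the given port and reports whether that worked:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not libosmocore API: returns 1 if a TCP socket can
 * currently be bound to 127.0.0.1:port, 0 if bind() fails (for example with
 * EADDRINUSE), and -1 if no socket could be created at all. */
int loopback_port_is_free(uint16_t port)
{
	struct sockaddr_in sin;
	int fd, rc;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

	/* SO_REUSEADDR is deliberately not set, so any socket still bound to
	 * this address and port makes the bind fail. */
	rc = bind(fd, (struct sockaddr *)&sin, sizeof(sin));
	close(fd);
	return rc == 0 ? 1 : 0;
}
```

Calling loopback_port_is_free(42042) from a small wrapper before the VTY
transcript test would at least tell us whether something was still holding
the port, though it would not say which process.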
Regards,
Oliver
--
- Oliver Smith <osmith(a)sysmocom.de> https://www.sysmocom.de/
=======================================================================
* sysmocom - systems for mobile communications GmbH
* Alt-Moabit 93
* 10559 Berlin, Germany
* Sitz / Registered office: Berlin, HRB 134158 B
* Geschaeftsfuehrer / Managing Director: Harald Welte
Avoiding msgb leaks is easiest if the caller retains ownership of the msgb.
Take this hypothetical chain where leaks are obviously avoided:
void send()
{
    msg = msgb_alloc();
    dispatch(msg);
    msgb_free(msg);
}
void dispatch(msg)
{
    osmo_fsm_inst_dispatch(fi, msg);
}
void fi_on_event(fi, data)
{
    if (socket_is_ok)
        socket_write((struct msgb*)data);
}
void socket_write(msg)
{
    if (!ok1)
        return;
    if (ok2) {
        if (!ok3)
            return;
        write(sock, msg->data);
    }
}
However, if the caller passes ownership down to the msgb consumer, things
become nightmarishly complex:
void send()
{
    msg = msgb_alloc();
    rc = dispatch(msg);
    /* dispatching event failed? */
    if (rc)
        msgb_free(msg);
}
int dispatch(msg)
{
    if (osmo_fsm_inst_dispatch(fi, msg))
        return -1;
    if (something_else())
        return -1; // <-- double free!
    return 0;
}
void fi_on_event(fi, data)
{
    if (socket_is_ok)
        socket_write((struct msgb*)data);
    else
        /* socket didn't consume? */
        msgb_free(data);
}
int socket_write(msg)
{
    if (!ok1)
        return -1; // <-- leak!
    if (ok2) {
        if (!ok3)
            goto out;
        write(sock, msg->data);
    }
out:
    msgb_free(msg);
    return -2;
}
If any link in this call chain neglects to return a failure RC, or to free the
msgb when the chain is broken, we have a hidden msgb leak.
This is the case with osmo_sccp_user_sap_down(). In new osmo-msc, passing data
through various FSM instances, there is high potential for leak/double-free
bugs. A very large brain is required to track down every msgb path.
Isn't it possible to provide osmo_sccp_user_sap_down() in the caller-owns
paradigm? Thinking about an osmo_sccp_user_sap_down2() that simply doesn't
msgb_free().
Passing ownership to the consumer is imperative if a msg queue is involved that
might send out asynchronously. (A workaround could be to copy the message
before passing into the wqueue.) However, looks to me like no osmo_wqueue is
involved in osmo_sccp_user_sap_down()? It already frees the msgb right upon
returning, so this should be perfectly fine -- right?
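
To make the copy workaround concrete, here is a minimal sketch with a toy
message type. None of these names exist in libosmocore or libosmo-sccp:
toy_msg stands in for a msgb, toy_sap_down() for a consumer that always frees
what it is given, and toy_sap_down2() for a caller-owns wrapper that hands the
consumer a private copy. The 'fail' argument only simulates an error path.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a msgb; NOT the libosmocore struct. */
struct toy_msg {
	size_t len;
	unsigned char data[64];
};

int toy_live_msgs; /* number of currently allocated toy messages */

struct toy_msg *toy_msg_alloc(const void *data, size_t len)
{
	struct toy_msg *m;

	if (len > sizeof(m->data))
		return NULL;
	m = calloc(1, sizeof(*m));
	if (!m)
		return NULL;
	memcpy(m->data, data, len);
	m->len = len;
	toy_live_msgs++;
	return m;
}

void toy_msg_free(struct toy_msg *m)
{
	if (!m)
		return;
	toy_live_msgs--;
	free(m);
}

/* Hypothetical consumer that takes ownership: it frees the message on every
 * path, whether or not sending succeeded. */
int toy_sap_down(struct toy_msg *m, int fail)
{
	int rc = fail ? -1 : 0;

	toy_msg_free(m);
	return rc;
}

/* Hypothetical caller-owns variant: the consumer gets a copy, the caller
 * keeps (and later frees) the original regardless of the return code. */
int toy_sap_down2(const struct toy_msg *m, int fail)
{
	struct toy_msg *copy;

	assert(m != NULL);
	copy = toy_msg_alloc(m->data, m->len);
	if (!copy)
		return -1;
	return toy_sap_down(copy, fail);
}
```

With this shape the caller can always msgb_free() after the call, at the cost
of one extra allocation and copy per message. A variant that simply skips the
free inside the consumer would avoid the copy, but only as long as nothing
downstream keeps the pointer.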
I think I'll just try it, but if anyone knows a definite reason why this will
not work, please let me know.
(Remotely related, I also still have this potential xua msg leak fix lying
around, never got around to verify it:
https://gerrit.osmocom.org/#/c/libosmo-sccp/+/9957/ )
~N
--
- Neels Hofmeyr <nhofmeyr(a)sysmocom.de> http://www.sysmocom.de/
=======================================================================
* sysmocom - systems for mobile communications GmbH
* Alt-Moabit 93
* 10559 Berlin, Germany
* Sitz / Registered office: Berlin, HRB 134158 B
* Geschäftsführer / Managing Directors: Harald Welte
Hi Sylvain,
completely unrelated to the osmo-msc patch, your other patch that was merged to
osmo-bsc breaks voice channel assignment, and I had to revert it. Mentioned so
on IRC, but since you didn't reply I thought it best to also post this here:
tnt, I think you broke voice Assignment in osmo-bsc. Can't get a voice call in current master. It works again if I revert 4d3a21269b25e7164a94fa8ce3ad67ff80904aee
tnt, I'm choosing to revert this from current master now; let's fix it, test and then re-apply the patch when it is ready
https://gerrit.osmocom.org/c/osmo-bsc/+/13256
I hope I'm not annoying you by annulling all of your patches :P
It's definitely not on purpose!
~N
Hi everyone, and in this particular case Sylvain,
I know I'm hogging osmo-msc.git and I don't intend to hinder everyone else's
work, but I doubt that it was really necessary to merge the silent call channel
types patch to master right now. I'm really overloaded with doing the inter-MSC
handover work, I'm again postponing my well earned leave that I would have
liked to have started two months ago, and I don't want to be burdened with also
resolving everyone else's merge conflicts. You all know that pretty much
everything changes in osmo-msc, even if only slightly. As a rule of thumb, if
you see a 'ran_conn', 'vlr_subscr_get()', a Paging callback function or a
'gsm_trans' in a bit of code, likely there will be merge conflicts with my
current branch. Most should be trivial, but they will be stones put in my way.
So please, before you merge onto master, consider doing the same work on the
tip of my branch 'neels/ho' in osmo-msc, rather than on current master. That
would be the ideal situation for me, because then you also test my patches.
There still is ongoing work at the tip of neels/ho, if you want a more stable
point to apply modifications to, look at the commit 'add LOG_TRANS, proper
context for all transactions' and rebase your changes onto that.
If you push those changes onto a private branch, I will even see them in tig
and can simply incorporate them into my branch, carrying them along until we're
ready to merge.
If you prefer working on master, still do me a favor: just try to rebase the
patch onto neels/ho. All the conflicts that you see in such a rebase will end
up on my table and stop me from going forward until I have resolved them.
Consider that before you hit that "Submit" button on gerrit in the osmo-msc.git
repository.
Plus, if I resolve conflicts with *your* code, likely I won't grok some minor
detail and introduce bugs.
So let's work together on this. Thanks!
Working on osmo-msc.git neels/ho branch also needs various patches in other
repositories; all these branches are kept up-to-date every day and all the
time:
libosmocore neels/misc
libosmo-sccp neels/conn_id
osmo-mgw neels/endpoint_fsm
osmo-msc neels/ho
Thanks again!
~N
I'm trying to test inter-BSC Handover in ttcn3.
At first I had problems grasping the concepts, but in the end it worked pretty
nicely to start two distinct BSC handlers like this:
testcase TC_ho_inter_bsc() runs on MTC_CT {
var BSC_ConnHdlr vc_conn0;
var BSC_ConnHdlr vc_conn1;
f_init(2);
vc_conn0 := f_start_handler(refers(f_tc_ho_inter_bsc0), 53, 0);
vc_conn1 := f_start_handler(refers(f_tc_ho_inter_bsc1), 53, 1);
vc_conn0.done;
vc_conn1.done;
}
It's walking all the way through inter-BSC Handover now (!) up until the point
where I want to discard the call.
Now I'm facing the simple problem that I want to call f_call_hangup() in the
second f_tc_ho_inter_bsc1() -- but I have no cpars (CallParameters) with a
valid MNCC callref nor the CC transaction ID, those are in the first function.
How can I share cpars between those functions?
The transaction_id and callref determined by the MNCC and CC messages that
happened in f_tc_ho_inter_bsc0 need to move over to f_tc_ho_inter_bsc1, much
like the MS has moved to the other BSC.
So it would make sense to have some global struct representing the MS which
both BSC_ConnHdlr instances can access, if that is at all possible ... ?
As a bit of a weak workaround, I could inter-BSC handover right back to the
first BSC and then f_call_hangup() there :P
~N
--
- Neels Hofmeyr <nhofmeyr(a)sysmocom.de> http://www.sysmocom.de/
Dear Osmocom community,
starting from 16th of January (until 7th of February) we can apply to
Google Summer of Code as an open source mentor organization. Many
well-known projects, such as Debian, FFmpeg, Apache, Git, GCC,
GNURadio and others, participated last year.
Personally, I think it's a great opportunity to move some of our
projects forward. I would like to know your opinions about this
idea, should we participate? I would be happy to be a mentor.
With best regards,
Vadim Yanitskiy.
Hi Neels, from your IRC question today:
> when opening a new conn on SCCP, what's the proper way to get a
> conn_id? I want to feed OSMO_SCU_PRIM_N_CONNECT into
> osmo_sccp_user_sap_down(), but it seems the caller needs to pick a
> conn_id??
Whatever way you can think of to cough up a unique identifier for that connection.
> I assumed that libosmo-sccp implicitly picks an unused local conn
> reference, but that's not the case.
Note that in this sentence you're now talking about the "SCCP local
reference", which is something communicated on the wire between two SCCP
providers (implementations). Hence, it is managed inside the
SCCP provider[s] and can be seen in the source local reference /
destination local reference field of the SCCP messages.
That's *not strictly* the SCCP connection identifier which has
significance only across the SCCP User SAP (i.e. between SCCP User and
SCCP Provider on the same system), and which never is visible on any
SCCP message on the wire. It's just an implementation shortcut of the
Osmocom implementation that uses the same identifiers on both sides,
rather than allocating separate ones.
But getting back to your question: If the SCCP provider was to receive
a N-CONNECT.req without some kind of identifier, and simply allocate
one, how would that identifier be communicated back to the user? Those
primitives work asynchronously. You'd have to come up with
yet-another-identifier, like a "primitive tag" where that tag then would
be echoed back in the N-CONNECT.resp - and you end up again having to
allocate some unique identifier :P
> sccp_scoc.c has conn_create() which seems to pick an unused id, but
> that part is static in a .c file
> hnbgw just uses 1:1 the same conn_id from RUA to RANAP and thus doesn't invent new ones
Now I'm confused. RUA isn't running over SCCP, right?
> osmo-bsc goes through its list of &bsc_gsmnet->subscr_conns to pick an unused id
> I can do that in osmo-msc, but it seems to me libosmo-sccp should have common API for that
The SCCP User SAP is modelled strictly after the ITU specs. Always
imagine yourself in a situation where the SCCP user and SCCP provider
are running in different processes and they don't have access to each
other's state - and all they can exchange are the SCCP User SAP
primitives in some serialized form. While libosmo-sccp doesn't work
like this (so far), we should always keep that in mind and keep the SAP
boundary clean.
As there's no primitive in ITU-T Q.7xx for "allocate me a local
reference", we don't have one :/
I'm not sure what we should do here. If we introduce that kind of SCU
primitive, then the question is how are they allocated/released? Who
is in charge of that? What kind of object would the SCCP provider use
to keep track of allocated IDs for which there is no connection yet, as
the N-CONNECT.req was not yet received?
The current situation is not great. After all, theoretically there
could be an incoming new SCCP connection for which the provider chooses
the same ID that the user at the same time chooses for a new outbound
connection -> boom. One could use something like the highest-order bit
to distinguish between user-allocated and provider-allocated
identifiers.
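
A minimal sketch of that highest-order-bit idea, assuming a 32-bit connection
ID. All names here are hypothetical and nothing like this exists in
libosmo-sccp today. Each side keeps its own counter and the top bit says who
allocated the ID, so the two ranges can never collide. If the same value also
had to serve as the 3-octet SCCP local reference, the flag would have to move
down to bit 23.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the top bit marks provider-allocated connection IDs. */
#define CONN_ID_PROVIDER_FLAG 0x80000000u

struct conn_id_alloc {
	uint32_t next; /* next candidate within the lower 31 bits */
};

/* Return the next 31-bit counter value and advance, wrapping around.
 * A real allocator would also have to skip IDs that are still in use by
 * a live connection; this sketch does not track that. */
static uint32_t conn_id_step(struct conn_id_alloc *a)
{
	uint32_t id;

	assert(a != NULL);
	id = a->next & ~CONN_ID_PROVIDER_FLAG;
	a->next = (id + 1) & ~CONN_ID_PROVIDER_FLAG;
	return id;
}

/* ID for a connection opened by the SCCP user (top bit clear). */
uint32_t conn_id_next_user(struct conn_id_alloc *a)
{
	return conn_id_step(a);
}

/* ID for a connection created by the SCCP provider (top bit set). */
uint32_t conn_id_next_provider(struct conn_id_alloc *a)
{
	return conn_id_step(a) | CONN_ID_PROVIDER_FLAG;
}

int conn_id_is_provider(uint32_t id)
{
	return (id & CONN_ID_PROVIDER_FLAG) != 0;
}
```

The user would keep one struct conn_id_alloc and the provider another. Neither
needs to see the other's state, which keeps the SAP boundary clean.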
Regards,
Harald
--
- Harald Welte <laforge(a)gnumonks.org> http://laforge.gnumonks.org/
============================================================================
"Privacy in residential applications is a desirable marketing option."
(ETSI EN 300 175-7 Ch. A6)
Hello Keith,
> 2) 3G data instabilities
> ( Just a quick observation note, no pcap! )
> I happened to notice that pinging a 3G handset from the network side
> (default ping: 1 sec interval, 64 ICMP bytes) keeps the connection
> "alive" and using 3G data is then a pleasant experience, IM, email,
> browsing, SIP call all working nicely. peak max download speed of 16Mbps
> was reported at one time by m.speedof.me
Can I ask you to expand a bit more on the topic? Sorry, it's been a while since
you sent the original email.
From where did you run the ping? From the machine where the GGSN is running?
Does that also mean that, to run a 3G PS network right now, one needs some kind
of service that pings all registered handsets? (Actually, that is what I am
going to do for now, until the SGSN is fixed by somebody or by us.)
Thank You!
Kind regards,
Mykola
Hi all,
Neels has recently proposed an osmo_ip_port API, see
https://gerrit.osmocom.org/#/c/libosmocore/+/13123
I'm somewhat reluctant to get this merged into libosmocore, as from my point
of view, it's reinventing what sockaddr_storage is doing in libc, but storing
the address in host byte order and string format. So I would argue we should
rather create helper/utility functions around sockaddr_storage and do any
string/binary and endianness conversions hidden by/within that API.
Irrespective of the above, I would want to hear what other developers think. Do
you think that it's worthwhile to
1) have some utility functions / infrastructure (irrespective of the data type)
1a) in libosmocore, or
1b) keep it to osmo-mgw
2) prefer to
2a) have strings for IP addresses and host-byte-order port numbers like the proposed patchset, or
2b) go with native sockaddr_storage?
If others think it should be merged, I won't try to veto it. I just want to
hear some more voices rather than just my own point-of-view.
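
To illustrate what option 2b could look like, here is a rough sketch of two
helpers around sockaddr_storage that hide the string/binary and byte-order
conversions. The names ss_from_str() and ss_to_str() are made up for this
mail and are not existing libosmocore API. Only numeric IPv4/IPv6 strings
are handled.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill 'ss' from a numeric IP string and a port in
 * host byte order. Returns 0 on success, -1 if 'ip' is neither a valid
 * IPv4 nor a valid IPv6 address. */
int ss_from_str(struct sockaddr_storage *ss, const char *ip, uint16_t port)
{
	struct sockaddr_in *sin = (struct sockaddr_in *)ss;
	struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)ss;

	memset(ss, 0, sizeof(*ss));
	if (inet_pton(AF_INET, ip, &sin->sin_addr) == 1) {
		sin->sin_family = AF_INET;
		sin->sin_port = htons(port);
		return 0;
	}
	memset(ss, 0, sizeof(*ss));
	if (inet_pton(AF_INET6, ip, &sin6->sin6_addr) == 1) {
		sin6->sin6_family = AF_INET6;
		sin6->sin6_port = htons(port);
		return 0;
	}
	return -1;
}

/* Hypothetical helper: the reverse direction. Writes the numeric address
 * into 'ip' and the host-byte-order port into '*port'. Returns 0 on
 * success, -1 on an unsupported family or too small a buffer. */
int ss_to_str(const struct sockaddr_storage *ss, char *ip, socklen_t ip_len,
	      uint16_t *port)
{
	if (ss->ss_family == AF_INET) {
		const struct sockaddr_in *sin = (const struct sockaddr_in *)ss;

		if (!inet_ntop(AF_INET, &sin->sin_addr, ip, ip_len))
			return -1;
		*port = ntohs(sin->sin_port);
		return 0;
	}
	if (ss->ss_family == AF_INET6) {
		const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)ss;

		if (!inet_ntop(AF_INET6, &sin6->sin6_addr, ip, ip_len))
			return -1;
		*port = ntohs(sin6->sin6_port);
		return 0;
	}
	return -1;
}
```

With something like this, callers would only ever deal with strings and
host-byte-order ports at the API edge, while the stored value stays a native
sockaddr_storage that can be passed to bind() or connect() directly.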
Regards,
Harald
--
- Harald Welte <laforge(a)gnumonks.org> http://laforge.gnumonks.org/
The new product from Lime is marked to start shipping tomorrow. As far
as I understand it is basically the LimeSDR mini with a GPS disciplined
clock and a Raspberry Pi module, so it /should/ run osmo-bts +
osmo-trx-lms fine, and without clocking issues.
It also has POE.
I was wondering if anybody else has looked at it, has comments, or has
or is thinking of ordering one?
Thanks!