This is merely a historical archive of years 2008-2021, before the migration to mailman3.
A maintained and still updated list archive can be found at https://lists.osmocom.org/hyperkitty/list/nextepc@lists.osmocom.org/.
Harald Welte laforge at gnumonks.org

Hi all,

During the Chaos Communication Camp 2019 (an international hacker camp with about 5500 participants) last week, we continued the tradition of operating Osmocom-based 2G and, more recently, also 3G networks.

This time I operated a nextepc-based 4G/LTE network next to the camp 2G/3G networks. In order to share one subscriber database, I have implemented osmo_dia2gsup, which translates S6a/S6d Diameter into the Osmocom GSUP protocol, so nextepc can be used without nextepc-hssd but with osmo-hlr instead.

The network was operating six Ericsson RBS6402 in Band 7 (2600 MHz).

Some more details can be found at

Regarding the nextepc side:

* 2439 unique IMSIs were seen
** 147 unique IMSIs of CCC SIM cards (26242)
** 2292 non-CCC IMSIs
** 75 unique MCC-MNC tuples
** 34 unique MCCs
** The usual suspects (Europe + North America), but also...
*** Malaysia, Indonesia, Australia, New Zealand, South Africa
* 560 Attach accept (CCC SIM cards)
* 46590 Attach reject (commercial operator SIM cards)
* 629 PDN context (APN) activations
* 235 handovers between cells (X2)
* 64 crashes + restarts of nextepc-mme
* 9 crashes + restarts of nextepc-pgw
* 0 crashes + restarts of nextepc-sgw
* 10 crashes + restarts of nextepc-pcrf

In general, it worked quite nicely, and I have to congratulate Sukchan on his work on nextepc.

I investigated some of the crashes, reported them to the issue tracker and attempted to fix some of them on-site. The actual codebase that was running can be found at
https://github.com/laf0rge/nextepc/commits/laforge/cccamp19

From my experience with operating such a "large" nextepc network for the first time, I have the following overall feedback, which basically boils down to three major areas:

== the use of assert() ==

ASSERT should never be triggered by anything that is received from another network entity.
So if an eNB sends an unknown S1AP-ID, or if an SGW sends an unknown TEID, or if the NAS MAC validation fails, or an EMM message cannot be decoded: all of those must be handled gracefully without terminating the program.

This 'fail fast' way of programming can be done when writing code in C++ (exceptions that are caught) or in Erlang (one process per message, crashing that one doesn't bring the entire MME down).

I've tried my best to review all ogs_assert() in the MME and came up with the following patch:
https://github.com/laf0rge/nextepc/commit/3b528af8fd51c85769123338eb57a4635c9d699e
which requires
https://github.com/laf0rge/ogslib/commit/dc36ccbb080038306666931bdc97f6204fd5c011
which introduces ogs_expect() and ogs_expect_or_return() macros that can be used in many places instead of ogs_assert().

It would also be possible to use this kind of 'fail fast' approach in C programs, but then one would have to use longjmp() from the 'assert', and one would have to use some kind of hierarchical memory allocator so that the 'exception handler' can release any dynamic allocations that were made before.

== the lack of introspection ==

When you operate a network, it is vital to have some visibility. For the MME you want to inspect how many subscribers are currently attached, where they are attached (TAC), whether they currently have a UE context (and at which eNB), which TMSI/GUTI was allocated, etc.

Likewise, for both SGW and PGW you want to see which PDN contexts exist, from which peer IP addresses, which APN was used, what IP addresses have been allocated, etc.

In the Osmocom world, we implement this introspection in two ways:
* by means of the VTY interface (for the human user)
* by means of the CTRL interface (for other programs)

If I hadn't been busy with debugging various other issues, I would have actually attempted to add a basic VTY interface to nextepc-mmed.
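
To illustrate the kind of per-subscriber state dump meant above, here is a minimal sketch in C. Everything in it is an assumption made for illustration: struct ue_state, ue_state_format() and ue_state_dump() are hypothetical names that exist neither in nextepc nor in Osmocom, and the real MME UE context looks different. The only point is that one formatting function can serve both a human-facing and a machine-facing interface.

```c
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-subscriber state; NOT the real nextepc UE context. */
struct ue_state {
    const char *imsi;      /* subscriber identity */
    uint16_t tac;          /* tracking area where the UE is attached */
    uint32_t m_tmsi;       /* M-TMSI part of the allocated GUTI */
    bool has_ue_context;   /* is there currently a UE context at an eNB? */
    uint32_t enb_id;       /* only meaningful if has_ue_context is true */
};

/* Format one subscriber as a single "key=value" line, so that the output
 * stays grep-able. Returns the number of characters written (excluding
 * the terminating NUL), or -1 if the buffer is too small. */
int ue_state_format(const struct ue_state *ue, char *buf, size_t len)
{
    int rc;

    if (ue->has_ue_context)
        rc = snprintf(buf, len, "imsi=%s tac=%u m_tmsi=0x%08x enb_id=%u",
                      ue->imsi, (unsigned)ue->tac,
                      (unsigned)ue->m_tmsi, (unsigned)ue->enb_id);
    else
        rc = snprintf(buf, len, "imsi=%s tac=%u m_tmsi=0x%08x enb_id=none",
                      ue->imsi, (unsigned)ue->tac, (unsigned)ue->m_tmsi);

    /* snprintf() reports truncation by returning a value >= len */
    if (rc < 0 || (size_t)rc >= len)
        return -1;
    return rc;
}

/* Dump all subscribers, one per line, to the given stream.
 * Returns the number of subscribers actually printed. */
int ue_state_dump(FILE *out, const struct ue_state *ues, size_t n)
{
    char line[128];
    int printed = 0;

    for (size_t i = 0; i < n; i++) {
        if (ue_state_format(&ues[i], line, sizeof(line)) < 0)
            continue; /* skip entries that do not fit into one line */
        fprintf(out, "%s\n", line);
        printed++;
    }
    return printed;
}
```

A command handler for human users could call ue_state_dump() with the stream of its session, while an interface for external programs could call ue_state_format() per subscriber and parse the key=value pairs.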
For sure there may be better ways to expose this state (ideally with the same piece of code providing access to both human users as well as external programs), but I'm not aware of any nice C language implementation in FOSS that one could use right away.

== logging without context ==

When looking at log file output, it is very important that this output always carries sufficient context. If there are many subscribers acting in parallel, you need to know which subscriber / PDN context / ... a given log message relates to, otherwise the log message is rather useless.

For example, if you get

  [mme] DEBUG: [MME] Authentication-Information-Answer (mme-fd-path.c:211)

then even at DEBUG level you have no indication whatsoever for which particular subscriber this AIA was received. I would normally expect that the UE is resolved from the DIAMETER session-id, and then the UE's identity (IMSI) can be printed.

I also find it suboptimal that log messages often span multiple lines, which means you cannot simply 'grep' for something, as you always need to check some lines before and/or after. But I guess contrary to the lack of context, this is a matter of taste and one can have different opinions about it.

I'll try to contribute as much as I can regarding bug fixes and enhancements. Thanks again for all the great work so far!

Regards,
	Harald

-- 
- Harald Welte <laforge at gnumonks.org>           http://laforge.gnumonks.org/
============================================================================
"Privacy in residential applications is a desirable marketing option."
                                                  (ETSI EN 300 175-7 Ch. A6)