This is merely a historical archive of years 2008-2021, before the migration to mailman3.
A maintained and still updated list archive can be found at https://lists.osmocom.org/hyperkitty/list/OpenBSC@lists.osmocom.org/.
Neels Hofmeyr nhofmeyr at sysmocom.deThis is an idea about logging context, the logging filter and FSM instances. The idea is to make logging context a first-class citizen of osmo_fsm_inst and of LOGP(), in order to get cleaner FSM ids, better configurable logging and a more universal and reliable approach to logging filters. At events like 35c3 I want to, and hence any production operator wants to have useful filters for logging. The attempt with this RFC is to get this in an elegant way that also improves current logging context and saves FSM IDs. == Status quo == - Log filters: Since a long time ago, we have a logging filter mechanism which requires setting a global "context pointer" to some object, for every code path. It is not trivial to use and we mostly neglected using and testing it, so it doesn't work well in practice. See enum log_filter_index, enum log_ctx_index, log_set_context() in libosmocore and the filter_fn() in msc_main.c as an example. - Logging context: For a long time, we added logging context to each and every LOGP() line, like: LOGP(DMSC, LOGL_DEBUG, "(%s %s %d %d) Actual message", subscr->ctx1, osmo_foo_name(subscr->ctx2), trans->ctx3, another->ctx); Since the osmo_fsm_inst showed up, I have more and more used the FSM instance's id as logging context: #define LOG_MY_OBJ(my_obj, loglevel, fmt, args...) \ LOGPFSML(my_obj->fi, loglevel, fmt, ##args) my_obj_update_id(my_obj) { osmo_fsm_inst_update_id_f(my_obj->fi, "%s-%s-%d-%d", my_obj->ctx1, osmo_foo_name(my_obj->ctx2), my_obj->ctx3, my_obj->ctx); } stuff() { LOG_MY_OBJ(my_obj, LOGL_DEBUG, "Actual message"); } The advantages are obvious: - less time spent composing log message context when developing code. - context string is composed only once and re-used, less cpu cycles used to compose. - it is now made hard to forget adding useful log context. But this also has disadvantages: - Long lines Since I want to be able to trace logging output by various items, my logging context has often grown to a size that makes the log uncomfortable to read unless the terminal is really wide. For example, in osmo-msc, the context now often looks like this: DMSC DEBUG msc_a(IMSI-901700000014705:MSISDN-1001:TMSI-0x8D91D71B:GERAN-A-0:LU)[0x61200000b620]{MSC_A_ST_AUTH_CIPH}: Received Event MSC_A_EV_AUTHENTICATED (vlr_lu_fsm.c:741) It includes the IMSI, MSISDN, TMSI, the RAN conn info and the complete-l3 type. An extreme example is DCC DEBUG rtp_stream(IMSI-901700000014705:MSISDN-1001:TMSI-0x8D91D71B:GERAN-A-1:CM_SERVICE_REQ:trans-8:call-2147483649:RTP_TO_RAN:CI-BBBD5102:remote-port-not-sent:codec-not-sent:local-192-168-2-4-42000:remote-192-168-2-3-23004)[0x612000015e20]{ESTABLISHING}: Committing: Tx MDCX to update the MGW: updating remote-RTP-IP-port codec (rtp_stream.c:330) WTF. But of course, for an RTP stream, it is interesting to see what subscriber it is for, but also what callref and CC transaction, also the MGW's endpoint CI, whether it is to RAN or CN, and which RTP address and ports are involved. Now, a log line like this is pretty much guaranteed to be truncated in the display of gsmtap_log in wireshark. Very irritating. We would have to remove select items of logging context from the FSM instance ID to strike a good balance of length and usefulness. Each time we tweak it, all the logging output in test programs changes ... ugh. - Special characters Since an FSM instance can only have a limited set of characters for id, we can't add IP addresses as 1.2.3.4, but need to use 1-2-3-4. And so on and so on. - ID obfuscation If I want to address a specific osmo_fsm_inst on the CTRL interface, I need to pass the entire long id. I don't know of anyone using this in practice, but abusing the id as logging context makes this really hard. I think this is the worst side effect of me adding elaborate logging context to FSM instance ids. - log reading Various lengths of logging context make the log line's actual message show on various "indentation" levels. Instead I would like to be able to configure the log context part of a log message to show in the *end* of a log line instead of at the start, so that the short messages like "New conn" or "Rx Location Updating Request" show lined up one below each other; or alternatively introduce a common chunk of indentation spacing to line things up. That's currently not possible if both the log context and log message are just one arbitrary string fed to LOGP(). == IDEA == Reflecting on this, I formed the idea of moving logging context to a separate array of strings for each object. The most common usage would be to tie this array of context information to an osmo_fsm_inst. I'm not sure how far we could take this before we start producing too much load. It would of course be the aim to find a good balance of effort and usefulness, or think of nifty simplifications or optimizations. It's just a vague idea, but I guess it would look something like this in pseudo code: my_obj_update_id(my_obj) { set_log_context(my_obj->logctx, "IMSI", my_obj->imsi); set_log_context(my_obj->logctx, "MSISDN", my_obj->vlr->msisdn); set_log_context(my_obj->logctx, "RTP-local", osmo_sockaddr_str_name(my_obj->rtp_stream->rtp_local)); ... } set_log_context(ctx, name, val) { slot = log_context_find_or_create_slot(ctx, name); osmo_talloc_replace_str(ctx, &ctx[slot], val); /* or osmo_strlcpy() to a fixed-size char[]? */ } #define LOG_MY_OBJ(my_obj, level, fmt, args...) \ LOGP_CTX(DMSC, my_obj->ctx, level, fmt, ##args) use() { LOG_MY_OBJ(my_obj, LOGL_DEBUG, "Actual message"); /* i.e. most current log macros would be invoked in the same way */ } And then allow logging configuration like % hide all logging context in log output, except MSISDN and RTP-local: > logging print context none > logging print context MSISDN > logging print context RTP-local % show only logging lines where this exact IMSI is part of the context > logging filter-rule IMSI == 901700000014705 In gsmtap_log packets, we could add a listing of key-value items for log context, hopefully allowing to show / hide each individual context by name in a wireshark UI. Instead of repeating context like the IMSI in various FSM instance IDs, the logging context could be made chainable, i.e. internally, we could reference one logging context from another. For example, the rtp_stream FSM instance would only add the RTP address to its logging context, and then "upwards" reference the call_leg FSM's context. That would add the callref, and "upwards" reference the subscriber's context, which adds IMSI, MSISDN and RAN conn info to the rtp_stream logging context. (We might be able to do this implicitly from osmo_fsm_inst parent/child relations that we already have.) It might even be thinkable to allow traversing all across related context info, so that filtering for a given RTP port could cause even high level objects (like the vlr_subscr) to search through all childrens' log ctx information to see whether anyone defines this RTP-local context, thus show all logging for a conn that currently uses this RTP port. Premature optimization... To run each log line through a set of filters would potentially take up a lot of CPU cycles. For that I also have an idea to resolve the decision to log or not to log only once per target object and cache that result. For example, when I change the logging filter to pick a specific MSISDN, an internal token is incremented. If I hit an object's logging, I decide whether it passes the logging filter, and store the outcome as true/false with the current filter's token value in the object. Next time I hit that object and the token is still the same, the filter must also have the same result, and I can just use the stored true/false. If the user changes the filter, the token increments and on next logging, each object notices that the filter needs to be re-run in full. If the object's context info changes, reset the object's stored token to also rerun the filter on next logging. This is quite vague, especially not clear whether each object with logging context would have to store tokens for each and every log target separately... Premature optimization: require the log context names to be compile-time string literals, so that we can just compare name_a == name_b without invoking strcmp(). A general advantage is that we would always see the relevant context for logging, and we would immediately notice bugs in the context, thus we would always ensure reliable logging filters "for free". An easy way out of implementing log filters in osmocom would be to simply not have them, just provide well-defined log context in gsmtap_log, in a way that wireshark can parse. Users can then use wireshark rules to filter the log. The disadvantage here is that we would have to first compose all logging, instead of nicely skipping entire log lines before they are even composed. So, this is so far just an idea I would like to run by you. Even if the solutions might still be unworthy, I think it is worthwhile to be aware of the implications I highlighted above. ~N -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: <http://lists.osmocom.org/pipermail/openbsc/attachments/20190411/4be6a3c0/attachment.bin>