This is merely a historical archive of years 2008-2021, before the migration to mailman3.
A maintained and still updated list archive can be found at https://lists.osmocom.org/hyperkitty/list/OpenBSC@lists.osmocom.org/.
Harald Welte laforge at gnumonks.orgHi Neels, On Sun, Mar 10, 2019 at 04:31:05AM +0100, Neels Hofmeyr wrote: > Avoiding msgb leaks is easiest if the caller retains ownership of the msgb. ACK. > However, if the caller passes ownership down to the msgb consumer, things > become nightmarishly complex: ACK. > If any link in this call chain fails to be aware of the importance to return a > failed RC or to free a msgb if the chain is broken, we have a hidden msgb leak. Indeed. This is why I have been arguing in favor of something like a "packet dispatch talloc context" for some time. Unfortuantely I didn't have any time to work on this so far. The proposal is along those lines: * we create a talloc context every time we drop out of the select() statement * any msgb's allocated from routines that read from sockets / etc. are allocated from that context * just before we return into osmo_select_main() [or in the start of that function], we free the entire talloc context, which will free any msgb's in it * if anyone wants to take ownership of a msgb for a longer time (e.g. enqueue it somewhere for delayed processing, ...) they need to re-parent the object away from the volatile "packet dispathc talloc context" and move it into another context. Talloc provides talloc_steal() or better talloc_move() for it. I would think this is an elegant solution. That particular volatile talloc context would not be restricted to msgb's but could be used for any kind of 'temporary' allocations. We could lazily allocate whatever we wanted (like temporary string buffers, etc.) and know that it will be safely released once the current select dispatch ceases to exist. Sure, whenever we can we should still use the stack, as stack allocations are cheaper than heap allocations. > This is the case with osmo_sccp_user_sap_down(). In new osmo-msc, passing data > through various FSM instances, there is high potential for leak/double-free > bugs. A very large brain is required to track down every msgb path. Let me know what you think of the above-mentioned proposal. We could have something like msgb_alloc2() which takes an additional talloc context as argument. Pluse creating that [gloal variable] talloc context in the beginning of osmo_fd_disp_fds() and talloc_free()ing it at the end of osmo_fd_disp_fds() should do the trick. One might evene be a able to do some trickery with talloc_set_destructor() to print a BIG FAT WARNING/ERROR message in case anyone ever tries to free memory allocated from this context manually. That destructor would simpl have to be removed or disabled when the select loop handling takes care of it. > Isn't it possible to provide osmo_sccp_user_sap_down() in the caller-owns > paradigm? Thinking about an osmo_sccp_user_sap_down2() that simply doesn't > msgb_free(). It's possible. But I think there are use cases for both behaviors. > Passing ownership to the consumer is imperative if a msg queue is involved that > might send out asynchronously. (A workaround could be to copy the message > before passing into the wqueue.) The proposed approach would allow for "re-parenting" without having to copy the msgb or any of the metadata. Even the address of the msgb stays the same if you move/steal it. > I think I'll just try it, but if anyone knows a definite reason why this will > not work, please let me know. Not off my head, but I guess nobody is able to make guarantees on this without a thorough code audit and/or extensive testing. THat's why I like the idea of this volatile context. However, I realize only now that you're referring not to the case of received message buffers, but about locally-originated msgbs. But for them, too, the rule could be simply something like this: * allocate memory from the volatile context * pass it into any library function * if the library function transmits it right away, no need to free/steal * if the library function enqueues it somewhere and doesn't immediately send it over a socket or the like, it needs to move/steal it. Thinking of the above, I think the 'being queued' case is more or less the normal one, as we can never make the assumption that the underlying transport layer is able to accept transmission of any data in a non-blocking way. So there will always be some kind of write queue wherever we transmit. The only situation where we can "normally" assume that the msgb of any outbound message can be free'd immediately is: * datagram sockets like for RTP * local primitives that are simply sent to another layer in the local stack, and which are released immediately after that stack has received them Regards, Harald -- - Harald Welte <laforge at gnumonks.org> http://laforge.gnumonks.org/ ============================================================================ "Privacy in residential applications is a desirable marketing option." (ETSI EN 300 175-7 Ch. A6)