Hi,
I have pushed zecke/work/msgq-reading2 with a work in progress improvement for handling a load from a stress-test. The stress test is to allocate all lchan's at the same time (then let them fail through the radio link timeout) and re-allocate them.
This forces the BTS to constantly need to open/close the logical channel resources and also send some (idle) frames (LAPDm is not established in this case). The BTS was sending frames too late and after looking at it with perf it looked like a latency issue (no one is really using the CPU).
I have implemented various changes, not all of them are ready:
* Change the Bad-Frame-Indicator level to have radio link timeouts with no phone being present. In the above test-case channels remained open as there was no bad-frame-indication on these channels.
Pro: There is a constant load for all open channels Bad: We could do this as part of the MphTimeInd but then need to add a bit to the counter to remember if the counter was modified and we need to handle bad frame in two places.
* Use writev for the write_queue. This code should/could move into the libosmocore. So for our primitives queue all primitives are the same length and we can use writev to write several frames as the same time.
1.) We go through the queued messages and add them to the iovec 2.) We use writev 3.) We do written / prim_size to see how many buffers were written 4.) We go throuh the queued messages with llist_for_each_entry_safe and remove the count of messages sent. 5.) We check if there are messages left and re-set the notify on writable bit.
* Use readv for reading from the queue. This means that we need to allocate N msgb's, call readv, dispatch the messages, free the extra msgb's.
Pro: In idle mode there is at least one message to read, with channels open it is likely we will read several messages at once. Bad: Allocating memory is not free. We could start using talloc pools, we could add a msgb_alloc_headroom that does not memset (talloc_zero) the memory.
* The above made noticable differences but I also had to get the realtime hammer. I have decided to use SCHED_RR (instead of SCHED_FIFO). This is done with osmo-pcu in mind that needs to read from queues as well and both bts/pcu should be able to inerrupt each other.
* For the ts_meas_check_compute we can stop processing the entire TS once we reach a lchan of type lchan == GSM_LCHAN_NONE. (We still need to add a memset of the lchan array into the oml.c code in case there is a re-config).
TCH/F: Only lchan[0] is used. TCH/H: Only lchan[0,1] are used. SDCCH8: All of them are used CCCH: lchan[4] is used but we will have no power measurements anyway CCCH/SDCCH4: lchan[0-3] are used.
What is not done:
* Change the l1if primitive to either use or 'steal' the msgb. All the msgb_free could be removed, only if we forward the MSGB to the PCU socket we would 'steal' it. This would allow us to avoid calling msgb_alloc and msgb_free most of the time.
* Add SCHED_RR to the pcu code as well.
* Create a talloc pool for the primitives (or re-use the msgb across multiple reads).
please review and comment holger
On Sat, May 04, 2013 at 05:16:54PM +0200, Holger Hans Peter Freyther wrote:
What is not done:
- Create a talloc pool for the primitives (or re-use the msgb across multiple
reads).
* The stress test is either showing memory fragmentation or a silent leak. The virtual size of the address space is increasing over time, the talloc report does not show an increase in memory usage and I switched off the bts before I could look at vmRSS on the 'status' file.
So it is either:
a.) Something not attached to the talloc root context b.) Memory fragmentation
holger
Hi Holger,
On Sat, May 04, 2013 at 09:23:31PM +0200, Holger Hans Peter Freyther wrote:
On Sat, May 04, 2013 at 05:16:54PM +0200, Holger Hans Peter Freyther wrote:
What is not done:
- Create a talloc pool for the primitives (or re-use the msgb across multiple
reads).
- The stress test is either showing memory fragmentation or a silent leak.
The virtual size of the address space is increasing over time, the talloc report does not show an increase in memory usage and I switched off the bts before I could look at vmRSS on the 'status' file.
does this only happen with your new patches/branch, or does it already happen in current master?
Looking at your patches, I don't see anything wrong with how msgb's are allocated and free'd
On Mon, May 06, 2013 at 08:40:04AM +0200, Harald Welte wrote:
Hi Holger,
does this only happen with your new patches/branch, or does it already happen in current master?
it appears to happen even before the readv patch. I will continue to have a look on Wednesday.
with the BFIlevel set high enough (or another fix) on the BTS side, the stress test branch of OpenBSC and this shell script the issue should be re-producable.
while true; do (sleep 1s && echo -e "enable\nallocate-all-channels\n") | telnet 127.0.0.1 4242; done
On Mon, May 06, 2013 at 10:45:08AM +0200, Holger Hans Peter Freyther wrote:
Hi,
with the BFIlevel set high enough (or another fix) on the BTS side, the stress test branch of OpenBSC and this shell script the issue should be re-producable.
it is a "genuine" leak and there are several issues.
1.) lapd_core be like the logging/msgb code and have a setter for the talloc context. See the attached diff. (I used gdb to break in malloc and typed bt and saw a ctx=0x8 passed to talloc).
2.) Consider attaching the log_info to the lapdm_channel. I say consider as currently we are embedding the lapdm_channel in the lchan struct. So this is something we can not do right now.
3.) common/rsl.c calls lapdm_channel_reset which will not talloc_free the tx_hist. lapdm_channel_exit will do that. The call to lapdm_channel_reset was introduced in 2011.
4.) lchan_activate will call lapdm_channel_init which will end up call into lapd_dl_init which re-allocates the tx_hist.
5.) We should find a way to group all VTY allocations into a single block (e.g. see if we can have a vector/cmd_desc without a named context).
I will clean this up in the coming days
holger
On Mon, May 06, 2013 at 09:39:52PM +0200, Holger Hans Peter Freyther wrote:
talloc_report_full(tall_bts_ctx, stderr);
talloc_report_full(NULL, stderr);
this is useful for debugging. so maybe we separate the talloc reporting for SIGUSR1 and SIGUSR2.
tall_msgb_ctx = talloc_named_const(tall_bts_ctx, 1, "msgb"); msgb_set_talloc_ctx(tall_msgb_ctx);
- lapd_global_init(tall_bts_ctx);
this API is inconsistent but I would prefer the library to name the context the way they want to name it.
hi andreas,
this patch is ok, you can merge it to master.
2013/5/5 jolly andreas@eversberg.eu
Holger Hans Peter Freyther wrote:
- Add SCHED_RR to the pcu code as well.
hi holger,
i ported your patch to osmo-pcu (see jolly/meas branch). if ivan agrees, i would merge it to master.
regards,
andreas