Hi,
It's me again :) We deployed the first installation with token mode, and we're running into a rather nasty issue: osmo-nitb does its job for a while, but then it goes to 100% CPU and has to be killed. This doesn't happen in the lab, obviously, but it is happening in a village of about 10,000 people.
Any clue? Past experiences?
Thanks!
Ciaby
On Thu, Nov 20, 2014 at 11:56:59AM -0600, Ciaby wrote:
> Any clue? past experiences?
The code has been written for the HAR2009 (which I didn't attend) so I have no experience with this code. Does it busy loop in one function or is the process just very busy?
E.g. attach with gdb to the running process and do the bt full thing again?
holger
On 11/20/2014 04:57 PM, Holger Hans Peter Freyther wrote:
> On Thu, Nov 20, 2014 at 11:56:59AM -0600, Ciaby wrote:
>> Any clue? past experiences?
> The code has been written for the HAR2009 (which I didn't attend) so I have no experience with this code. Does it busy loop in one function or is the process just very busy?
> E.g. attach with gdb to the running process and do the bt full thing again?
(gdb) bt
#0 0x00007f0b2de51663 in select () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f0b2e794a9a in osmo_select_main (polling=0) at select.c:128
#2 0x000000000040691c in main (argc=<optimized out>, argv=0x7fff87fea568) at bsc_hack.c:360
(gdb) bt full
#0 0x00007f0b2de51663 in select () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1 0x00007f0b2e794a9a in osmo_select_main (polling=0) at select.c:128
        ufd = 0x7f0b2e9a52b0
        tmp = <optimized out>
        readset = {__fds_bits = {258552, 0 <repeats 15 times>}}
        writeset = {__fds_bits = {96, 0 <repeats 15 times>}}
        exceptset = {__fds_bits = {0 <repeats 16 times>}}
        work = 0
        rc = <optimized out>
        no_time = {tv_sec = 0, tv_usec = 0}
#2 0x000000000040691c in main (argc=<optimized out>, argv=0x7fff87fea568) at bsc_hack.c:360
        rc = <optimized out>
On Thu, Nov 20, 2014 at 06:25:34PM -0600, Ciaby wrote:
>> E.g. attach with gdb to the running process and do the bt full thing again?
> (gdb) bt
> #0 0x00007f0b2de51663 in select () from /lib/x86_64-linux-gnu/libc.so.6
> #1 0x00007f0b2e794a9a in osmo_select_main (polling=0) at select.c:128
Well, that is to be expected. But if the app takes 99% of the CPU time, where does it spend that time when you manually sample with continue/CTRL+C/bt full a few times? Alternatively you can use "perf top -p `pidof osmo-nitb`" to get a better overview.
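The manual continue/CTRL+C/bt sampling could also be scripted, roughly like this. This is only a sketch: the function name sample_bt, the GDB override variable and the default sample count and interval are made up for illustration, and it assumes gdb is installed and you are allowed to attach to the process. Each attach stops the process briefly, so keep the sample count small on a live system.

```shell
#!/bin/sh
# Poor man's sampling profiler: attach gdb a few times and count which
# function shows up in the innermost frame (#0) of each backtrace.
# GDB can be overridden; it defaults to plain "gdb" (assumption).
GDB="${GDB:-gdb}"

# sample_bt PID [SAMPLES] [INTERVAL_SECONDS]   (hypothetical helper)
sample_bt() {
    pid="$1"
    samples="${2:-20}"
    interval="${3:-1}"
    i=0
    while [ "$i" -lt "$samples" ]; do
        # -batch makes gdb exit (and detach) once the -ex command has run.
        # The sed expression keeps only the function name of frame #0.
        "$GDB" -p "$pid" -batch -ex "bt" 2>/dev/null |
            sed -n 's/^#0 *\(0x[0-9a-f]* in \)\{0,1\}\([^ ]*\).*/\2/p'
        sleep "$interval"
        i=$((i + 1))
    done | sort | uniq -c | sort -rn
}
```

Usage would be something like: sample_bt "$(pidof osmo-nitb)" 20 1. If nearly every sample lands in select(), the process is spinning through the main loop rather than sitting in one function, and perf top is probably the better tool to see what happens in between.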