I'm trying to run the layer1 firmware and the layer23 program with a C123. However, as soon as the firmware reports a layer 1 reset and an L1CTL_NEW_CCCH_REQ message is sent to the phone, it appears to crash and sends a new PROMPT1, causing an endless cycle of firmware uploads followed by crashes.
Bisection failed due to compiler errors. Using trial and error, I found that the last instructions successfully executed are in l1a_l23_rx_cb(), case L1CTL_NEW_CCCH_REQ, just before the tpu_end_scenario() call. tpu_end_scenario() appears to be what triggers the crash; more specifically, it's the tpu_enable() call in tpu_end_scenario().
Any hints how to debug this further?
For reference, this is the output of osmocon:
$ ./osmocon -p /dev/ttyUSB0 -m c123xor ../../target/firmware/board/compal_e88/layer1.bin
Received PROMPT1 from phone, responding with CMD
read_file(../../target/firmware/board/compal_e88/layer1.bin): file_size=35848, hdr_len=4, dnload_len=35855
...
handle_write(): finished
Received DOWNLOAD ACK from phone, your code is running now!
Hello World from apps/layer1/main.c program code
======================================================================
Device ID code: 0xb4fb
Device Version code: 0x0000
ARM ID code: 0xfff3
cDSP ID code: 0x0128
Dropping sample '~' ===
THIS FIRMWARE WAS COMPILED WITHOUT TX SUPPORT!!!
Assert DSP into Reset
Releasing DSP from Reset
Setting some dsp_api.ndb values
Setting API NDB parameters
DSP Download Status: 0x0001
DSP API Version: 0x0000 0x0000
Finishing download phase
DSP Download Status: 0x0002
hdlc_send_to_phone(dlci=5): 01 00 00 00 00 00 14 00
Received PROMPT1 from phone, responding with CMD
read_file(../../target/firmware/board/compal_e88/layer1.bin): file_size=35848, hdr_len=4, dnload_len=35855
...
Received PROMPT2 from phone, starting download
and so on.
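The trial-and-error search described in this report (finding the last statement that still runs before the phone resets) can be sketched with numbered checkpoints. This is only an illustration that builds on a host machine: checkpoint(), stub_tpu_enable() and traced_end_scenario() are hypothetical names, and the stub merely stands in for the real tpu_enable() call inside tpu_end_scenario().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of the firmware: numbered checkpoints.
 * The last "CHECKPOINT n" line seen on the console before the next
 * PROMPT1 marks the last statement that still executed. */
static int last_checkpoint = -1;

static void checkpoint(int id)
{
    assert(id >= 0);
    last_checkpoint = id;
    printf("CHECKPOINT %d\n", id);
    fflush(stdout); /* on the target, make sure the serial output has drained */
}

/* Stand-in (assumption) for the real tpu_enable(); it only records the call. */
static int stub_tpu_enabled;
static void stub_tpu_enable(void) { stub_tpu_enabled = 1; }

/* Stand-in for the tail of tpu_end_scenario() with checkpoints added. */
static void traced_end_scenario(void)
{
    checkpoint(1);      /* reached: about to enable the TPU */
    stub_tpu_enable();
    checkpoint(2);      /* never printed if the call above resets the phone */
}
```

On real hardware, seeing checkpoint 1 but not checkpoint 2 before the next PROMPT1 would point at the enable call itself, provided the console output is not lost in the reset.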
Hi Patrick,
good to have you on board!
On Thu, Mar 11, 2010 at 07:19:04PM +0100, Patrick McHardy wrote:
> I'm trying to run the layer1 firmware and the layer23 program with a C123. However, as soon as the firmware reports a layer 1 reset and an L1CTL_NEW_CCCH_REQ message is sent to the phone, it appears to crash and sends a new PROMPT1, causing an endless cycle of firmware uploads followed by crashes.
that's something I haven't seen so far. Are you using the C123 that I shipped you (i.e. from the same batch as those I and others use), or was it obtained somewhere else? In the latter case, it might be some different hardware version or the like. So far I haven't seen any differences, but well, what do we know about what the manufacturer did?
btw: What about l1test.bin? This one does a scan over the GSM900 band and then automatically selects the strongest ARFCN (and sends DATA INDICATIONS to the layer23).
> Any hints how to debug this further?
At this point I would say try to eliminate differences to other developers, i.e. what particular toolchain are you using? Can you try a layer1.bin compiled by somebody else, just to determine if it's something in the build or something related to your hardware that we're not doing right yet?
Just my 2 cents from holidays, Harald
Harald Welte wrote:
> Hi Patrick,
> good to have you on board!
> On Thu, Mar 11, 2010 at 07:19:04PM +0100, Patrick McHardy wrote:
> > I'm trying to run the layer1 firmware and the layer23 program with a C123. However, as soon as the firmware reports a layer 1 reset and an L1CTL_NEW_CCCH_REQ message is sent to the phone, it appears to crash and sends a new PROMPT1, causing an endless cycle of firmware uploads followed by crashes.
> that's something I haven't seen so far. Are you using the C123 that I shipped you (i.e. from the same batch as those I and others use), or was it obtained somewhere else? In the latter case, it might be some different hardware version or the like. So far I haven't seen any differences, but well, what do we know about what the manufacturer did?
Yeah, it's the one you sent me. I'll try a different one tomorrow.
> btw: What about l1test.bin? This one does a scan over the GSM900 band and then automatically selects the strongest ARFCN (and sends DATA INDICATIONS to the layer23).
That shows the same behaviour; after finding the first ARFCN, it restarts with PROMPT1:
...
Assert DSP into Reset
Releasing DSP from Reset
Setting some dsp_api.ndb values
Setting API NDB parameters
DSP Download Status: 0x0001
DSP API Version: 0x0000 0x0000
Finishing download phase
DSP Download Status: 0x0002
DSP API Version: 0x3606 0x0000
Performing power measurement over GSM900
LOST!ARFCN Top 10 Rx Level
ARFCN 22: -41 dBm
Received PROMPT1 from phone, responding with CMD
read_file(/tmp/harald/l1test.bin): file_size=37000, hdr_len=4, dnload_len=37007
> > Any hints how to debug this further?
> At this point I would say try to eliminate differences to other developers, i.e. what particular toolchain are you using? Can you try a layer1.bin compiled by somebody else, just to determine if it's something in the build or something related to your hardware that we're not doing right yet?
I'm using the gnuarm.com gcc 4.0 toolchain. I've also tried the builds you sent me; no difference. This is what the DSP dumper shows:
======================================================================
Device ID code: 0xb4fb
Device Version code: 0x0000
ARM ID code: 0xfff3
cDSP ID code: 0x0128
Die ID code: 0c933a10d0039bcf
======================================================================
Assert DSP into Reset
Releasing DSP from Reset
DSP bootloader version 0x0100
DSP dump: Registers [00000-0005f]
00000 : 3000 0008 0008 0008 0e0c 0e0c 181f 2900 0000 0000 0000 0060 0000 0000 0000 3fa6
00010 : 4340 005f 0813 0014 0003 0014 4099 43c0 1100 00ff 0000 8869 8869 ffa8 0000 0000
00020 : 0000 0000 0800 0000 f501 ffff 0000 0000 7fff f802 0000 0000 0000 0000 0000 0000
00030 : 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
On Friday 12 March 2010 12:06:07 Patrick McHardy wrote:
> Received PROMPT1 from phone, responding with CMD
> read_file(/tmp/harald/l1test.bin): file_size=37000, hdr_len=4, dnload_len=37007
Hi, interesting. My conclusion would be that you are jumping directly into the reset vector without having any data/instruction abort? This would mean we directly jump to 0x0 or wherever the reset vector is located?
Debug proposals/ideas... maybe you have tried all these:
- Is Hello World surviving?
- If hello world is working, maybe your GSM neighborhood makes our app just crash? Can you remove the antenna or go to a basement?
- The calypso has a special debug FIFO remembering the latest instruction and data fetches, prom wrote some support for it... I'm not sure it will survive the soft reset you are seeing... IIRC we are relocating the vector tables, so we should be able to dump it after a reset (in case the content survives that)?
sorry for the wild guesses, I have no idea if that is of any use.
z.
Hi,
> I'm using the gnuarm.com gcc 4.0 toolchain. I've also tried using the builds you sent me, no differences. This is what the DSP dumper shows:
Just FYI, the dsp_dumper doesn't work for me either; it panics (keyboard backlight fades in and out). Probably because it outputs too much data too fast and panics in the console layer ...
Sylvain
On Fri, Mar 12, 2010 at 12:06:07PM +0100, Patrick McHardy wrote:
> Yeah, it's the one you sent me. I'll try a different one tomorrow.
any progress / result with that?
Holger and I have discussed your problems yesterday, but we really don't have an idea at this point.
Some (relatively stupid) ideas to debug it:
1) does hello_world.bin work for you at all?
2) what happens if you change the timing, i.e. add a couple of delays to see if the reboot happens after a certain fixed wallclock time or really depends on the code that is executed?
3) what about using layer1.bin and the layer23 program on an ARFCN that is produced by OpenBSC + nanoBTS? This is the most frequently used setup by Dieter, Holger and myself so far. This way we can exclude the possibility that something received from the real operator GSM network in your location causes the problem.
Cheers, Harald
Harald Welte wrote:
> On Fri, Mar 12, 2010 at 12:06:07PM +0100, Patrick McHardy wrote:
> > Yeah, it's the one you sent me. I'll try a different one tomorrow.
> any progress / result with that?
Yes, I've tried using a different phone and it works fine.
> Holger and I have discussed your problems yesterday, but we really don't have an idea at this point.
> Some (relatively stupid) ideas to debug it:
> - does hello_world.bin work for you at all?
Yes.
> - what happens if you change the timing, i.e. add a couple of delays to see if the reboot happens after a certain fixed wallclock time or really depends on the code that is executed
I've tried disabling the tpu_end_scenario() call, which avoids the crash. My understanding is that the new settings are not activated without this, so I guess a good spot would be to add some delays directly after the tpu_end_scenario() call to see if the crash happens within the interrupt handlers?
> - what about using layer1.bin and the layer23 program on an ARFCN that is produced by OpenBSC + nanoBTS? This is the most frequently used setup by Dieter, Holger and myself so far. This way we can exclude the possibility that something received from the real operator GSM network in your location causes the problem
I'll give that a try.
Thanks!
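The delay experiment proposed in this message (waiting right after tpu_end_scenario() and watching how long the phone stays alive) might look roughly like the sketch below. It is written to build on a host machine: stub_delay_ms() and stub_tpu_end_scenario() are hypothetical stand-ins for the firmware's delay routine and for tpu_end_scenario(), and end_scenario_then_wait() is an illustrative name, not an existing function.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-ins (assumptions) so the sketch builds without the firmware tree. */
static unsigned int elapsed_ms;
static void stub_delay_ms(unsigned int ms) { elapsed_ms += ms; }

static int scenario_ended;
static void stub_tpu_end_scenario(void) { scenario_ended = 1; }

/* End the TPU scenario, then wait in small steps and print a marker after
 * each step. Returns the number of markers printed. */
static unsigned int end_scenario_then_wait(unsigned int step_ms,
                                           unsigned int steps)
{
    unsigned int i;

    assert(step_ms > 0);
    stub_tpu_end_scenario();
    for (i = 0; i < steps; i++) {
        stub_delay_ms(step_ms);
        printf("alive %u ms after tpu_end_scenario()\n", (i + 1) * step_ms);
        fflush(stdout); /* markers must reach the console before a reset */
    }
    return i;
}
```

If the last marker before the next PROMPT1 always shows the same elapsed time, that would hint at something asynchronous (for example an interrupt handler) rather than the code that follows the call. This only holds if interrupts stay enabled during the wait.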
Hi Patrick,
On Mon, Mar 15, 2010 at 02:11:27PM +0100, Patrick McHardy wrote:
> Harald Welte wrote:
> > On Fri, Mar 12, 2010 at 12:06:07PM +0100, Patrick McHardy wrote:
> > > Yeah, it's the one you sent me. I'll try a different one tomorrow.
> > any progress / result with that?
> Yes, I've tried using a different phone and it works fine.
great news!
Then I'll simply assume the phone I sent is broken. Sorry for that. I can probably have it replaced by the vendor if I get it back. Have you ever tried to use it on a regular network (using some random SIM)?
> > - what happens if you change the timing, i.e. add a couple of delays to see if the reboot happens after a certain fixed wallclock time or really depends on the code that is executed
> I've tried disabling the tpu_end_scenario() call, which avoids the crash. My understanding is that the new settings are not activated without this,
yes.
> so I guess a good spot would be to add some delays directly after the tpu_end_scenario() call to see if the crash happens within the interrupt handlers?
sounds like a possible approach, but if your other phone is working I doubt there's much point in further debugging.
Harald Welte wrote:
> On Mon, Mar 15, 2010 at 02:11:27PM +0100, Patrick McHardy wrote:
> > Harald Welte wrote:
> > > On Fri, Mar 12, 2010 at 12:06:07PM +0100, Patrick McHardy wrote:
> > > > Yeah, it's the one you sent me. I'll try a different one tomorrow.
> > > any progress / result with that?
> > Yes, I've tried using a different phone and it works fine.
> great news!
> Then I'll simply assume the phone I sent is broken. Sorry for that. I can probably have it replaced by the vendor if I get it back. Have you ever tried to use it on a regular network (using some random SIM)?
Yes, that works properly. Replacing it is not necessary, one working phone should be enough for now :)
> > > - what happens if you change the timing, i.e. add a couple of delays to see if the reboot happens after a certain fixed wallclock time or really depends on the code that is executed
> > I've tried disabling the tpu_end_scenario() call, which avoids the crash. My understanding is that the new settings are not activated without this,
> yes.
> > so I guess a good spot would be to add some delays directly after the tpu_end_scenario() call to see if the crash happens within the interrupt handlers?
> sounds like a possible approach, but if your other phone is working I doubt there's much point in further debugging.
Yeah, I'll leave it for now unless someone else runs into the same problem.
baseband-devel@lists.osmocom.org