Debugging a crashed program in SwiftX

Sometimes in the course of interactive debugging and testing, your target may crash and become unresponsive at the SwiftX command line. For targets that implement interactive debugging via JTAG (most ARM parts) or BDM (ColdFire parts), you can query the target’s registers and memory to get some idea of what’s happening.

This post presents a simple example and shows how to use these basic tools in SwiftX to figure out what went wrong and get you back on the path to a working project.

Sample Program

To illustrate, here’s a little program running on an ARM Cortex-M4 core that assigns a memory walk to a background task. The walk starts at the address of the variable X0 and reads one cell at a time, working its way down through memory. Eventually, it reaches an address that causes an exception.

|U| |S| |R| BACKGROUND TESTER

VARIABLE X0

: WALK ( addr1 -- addr2 )
   DUP @ DROP CELL - ;

: /TESTER ( -- )
   TESTER BUILD  TESTER ACTIVATE
   X0  BEGIN  WALK  AGAIN ;

After calling /TESTER from the command line, we use .S to display the stack, but it just hangs and doesn’t return. At this point, clicking the red stop sign in the toolbar (or selecting “Break” in the Tools menu) will break out of the cross-target link and return to the command line.

Break Out

We enter the phrase DISCONNECT TARGET HEX, which leaves us interacting only with words on the host system, with the base set to HEX for entering addresses. Then we use .R (“dot-R”) to display the processor core’s register context.

/TESTER  ok
.S Break
DISCONNECT TARGET HEX  ok
.R CPU RUNNING
    R0 = 20000A20      R1 = 000000FE      R2 = 0000000A      R3 = 00000001
    R4 = 00000000    U/R5 = 20000A24    T/R6 = 1FFEFFFC    S/R7 = 2000091C
    R8 = 00000000      R9 = 00000000     R10 = 00000000     R11 = 00000000
   R12 = 00000000  SP/R13 = 20000C28  LR/R14 = FFFFFFFD  PC/R15 = 000018A8
   PSR = 21000003     PSP = 20000A00
                      MSP = 20000C28   ok

The Forth word .' (“dot-tick”) reports which symbol an address is associated with. In this case, we’ll start with the PC:

18A8 .' <SPIN>  ok

The default exception handler in the SwiftX kernel is <SPIN> (a little spin loop), so any unhandled exception or interrupt will end up there.

LABEL <SPIN>   BEGIN B   END-CODE
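As a side check, the PSR value in the register display also says which exception is active: on ARMv7-M the low nine bits of the PSR hold the exception number, and number 3 is HardFault. Here is a minimal sketch in standard Forth, typed with the base in HEX. EXC# is a name made up for this example, not a SwiftX word.

```forth
HEX
\ Hypothetical helper: keep only the exception number (PSR bits 8:0).
: EXC# ( psr -- n )   1FF AND ;

21000003 EXC# .   \ prints 3
```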

Examine the Stack

Let’s examine the return stack to find out how we got there.

The ARMv7-M processor implements two stacks, each with its own hardware stack pointer.

  • Main stack (MSP)
  • Process stack (PSP)

SwiftX runs in thread mode and uses the PSP as its SP. When an interrupt or exception occurs, the hardware pushes the exception stack frame onto the process stack and then switches to the main stack on entry to the exception handler.
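The LR value in the register display (FFFFFFFD) agrees with this. Inside a handler, LR holds an EXC_RETURN code rather than a code address: bit 2 set means the frame was pushed on the process stack, and bit 3 set means the exception was taken from thread mode. A small sketch in standard Forth follows; the two word names are invented for illustration and are not part of SwiftX.

```forth
HEX
\ Hypothetical helpers that test bits of an EXC_RETURN value found in LR.
: PSP-FRAME?   ( exc-return -- flag )   4 AND 0<> ;  \ bit 2: frame is on the process stack
: FROM-THREAD? ( exc-return -- flag )   8 AND 0<> ;  \ bit 3: taken from thread mode

FFFFFFFD PSP-FRAME? .     \ prints -1 (true)
FFFFFFFD FROM-THREAD? .   \ prints -1 (true)
```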

When pushing context to the stack, the hardware saves eight 32-bit words in this order:

  • xPSR
  • Return address
  • LR (R14)
  • R12
  • R3
  • R2
  • R1
  • R0

R0 is at the top of the stack (lowest memory address in the stack frame).
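Because R0 sits at the lowest address, each saved word has a fixed byte offset from the PSP. The sketch below spells out those offsets and computes where three of the saved words live, using the PSP value from the register display (20000A00). The accessor names are assumptions made for this example, not SwiftX words.

```forth
HEX
\ Offset  Saved word
\   00    R0
\   04    R1
\   08    R2
\   0C    R3
\   10    R12
\   14    LR (R14)
\   18    Return address
\   1C    xPSR

\ Hypothetical accessors: frame is the lowest address of the exception frame.
: >LR   ( frame -- addr )   14 + ;
: >RET  ( frame -- addr )   18 + ;
: >XPSR ( frame -- addr )   1C + ;

20000A00 >LR U.     \ prints 20000A14
20000A00 >RET U.    \ prints 20000A18
20000A00 >XPSR U.   \ prints 20000A1C
```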

We can use DUMPX to dump the target’s memory over the JTAG link (or BDM for ColdFire targets) and see what the processor core was doing. We’ll dump from the PSP up to the address in U, which is always the base of a task’s return stack. Remember that stacks grow downward in memory for this processor.

20000A00 20000A24 OVER - DUMPX
20000A00  20 0A 00 20 FE 00 00 00 0A 00 00 00 01 00 00 00
20000A10  00 00 00 00 35 25 00 00 10 25 00 00 00 02 00 21
20000A20  2B 25 00 00  
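The dump lists bytes in address order, and the words here are stored little-endian: the first four bytes, 20 0A 00 20, are R0 = 20000A20 from the register display. If reversing bytes by eye gets tedious, a throwaway word can do it. This is a sketch in standard Forth, and LE>CELL is a made-up name.

```forth
HEX
\ Hypothetical helper: combine four bytes, given in dump order
\ (lowest address first), into one 32-bit value.
: LE>CELL ( b0 b1 b2 b3 -- x )
   8 LSHIFT OR  8 LSHIFT OR  8 LSHIFT OR ;

35 25 00 00 LE>CELL U.   \ prints 2535
10 25 00 00 LE>CELL U.   \ prints 2510
00 02 00 21 LE>CELL U.   \ prints 21000200
```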

We can see the exception context on the stack, laid out as listed above. Of interest are the link register (2535, stored at 20000A14) and the return address (2510, stored at 20000A18). Bit 0 of LR is the “T” bit in this processor, so the actual address is 2534. Let’s see where those are in our name space:

2534 .' /TESTER +1C  ok
2510 .' WALK +04  ok
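Clearing bit 0 by hand is easy to get wrong when you are tired, so here is a one-line sketch in standard Forth that does it. CODE-ADDR is a name invented for this example.

```forth
HEX
\ Hypothetical helper: drop the T bit (bit 0) from a saved LR value.
: CODE-ADDR ( lr -- addr )   1 INVERT AND ;

2535 CODE-ADDR U.   \ prints 2534
```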

We can decompile both of those words to verify the addresses.

SEE /TESTER
2518   LR PUSH                          B500
251A   TESTER BL                        F7FF FFEF
251E   BUILD BL                         F7FF F80F
2522   TESTER BL                        F7FF FFEB
2526   ACTIVATE BL                      F7FE FFE1
252A   4 R7 SUBS                        3F04
252C   0 R7 R6 STR                      603E
252E   20000B24 R6 LDRI                 4E02
2530   WALK BL                          F7FF FFEC
2534   2530 B                           E7FC  ok

SEE WALK
250C   4 R7 SUBS                        3F04
250E   0 R7 R6 STR                      603E
2510   0 R6 R6 LDR                      6836
2512   R6 R7 LDM                        CF40
2514   4 R6 SUBS                        3E04
2516   LR BX                            4770  ok

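The LR address 2534 is just after the call to WALK, and the exception itself happened at 2510, the LDR instruction that implements the @ operation in WALK. That LDR fetches from the address held in R6 (T in the register display), which contained 1FFEFFFC when the exception occurred.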
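The numbers can be cross-checked with a little arithmetic at the command line. The BL to WALK at 2530 occupies four bytes, so it returns to 2534, and the T bit turns that into the 2535 we found on the stack. The offsets that .' reported also fall out of the start addresses shown by SEE. This sketch uses only standard Forth words.

```forth
HEX
2530 4 + 1 OR U.   \ prints 2535, the LR value saved in the frame
2534 2518 - U.     \ prints 1C, the offset into /TESTER
2510 250C - U.     \ prints 4, the offset into WALK
```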

These should be enough clues to figure out what caused the crash, fix it, and move forward.
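One possible direction for the fix, offered only as a sketch: give the walk a lower limit so it stops before it runs off the bottom of valid memory. WALK-DOWN is a hypothetical word, and the right lower limit depends on your device’s memory map, which this post does not specify. The limit must also be at least one cell above zero, or the address will wrap around.

```forth
\ Hypothetical bounded variant of the memory walk.
\ Reads cells from addr-hi down to addr-lo inclusive, then returns.
: WALK-DOWN ( addr-lo addr-hi -- )
   BEGIN  2DUP SWAP U< 0=  WHILE   \ continue while current address >= addr-lo
      DUP @ DROP  CELL -           \ same body as WALK
   REPEAT  2DROP ;

\ Usage on a small scratch buffer: reads four cells and returns.
CREATE BUF  4 CELLS ALLOT
BUF  BUF 3 CELLS +  WALK-DOWN
```

The loop tests the address before each fetch, so no read happens below the limit.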