The open source SWO/ITM/tracing scene has just exploded.
Every few years I hit the same frustration: I need to automate or repeat the tracing of data from a target MCU, with command-line, open source tools, and without using printf. I do a sweep of the internet and find nothing that ticks both boxes. But now, all of a sudden, it feels like there is an abundance of tools in this niche!
The example code in this post again uses LibOpenCM3 and the STM32F411-Discovery board, plus my new favourite: orbuculum.
Links
- Excellent blog posting by François Baldassari. Posted only a week ago!
- The Orbuculum tool. (Orbuculum being Latin for crystal ball, a very apropos name)
- itm tools
- Another series of blog posts on SWO and using Orbuculum.
ITM printf
The ‘hello world’ of ITM.
Target source code
First, in the target source code, redirect file output to ITM by implementing the _write callback:
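#include <libopencm3/cm3/itm.h>

extern "C" {
// Called by the C library for every write(); sends each byte out through ITM stimulus port 0.
int _write(int file, char *buf, int len)
{
    (void)file;  // every file descriptor (stdout, stderr, ...) goes to the same port
    int i;
    for (i = 0; i < len; i++)
    {
        // Wait until the stimulus port can accept another write
        while (ITM_STIM32(0) == 0);
        ITM_STIM8(0) = buf[i];
    }
    return i;
}
}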
OpenOCD conf
The ITM module can be enabled directly in the target software, or via the debugger probe configuration:
source [find interface/stlink-v2.cfg]
transport select hla_swd
adapter_khz 4000
source [find target/stm32f4x.cfg]
tpiu config internal itm.data uart off 84000000
itm port 0 on
init
reset run
Here, itm port 0 on enables stimulus port 0, the one that the ITM_* macros in the source above write to.
The TPIU clock is the ARM (STM?) “TRACECLK”, which in this device is the same as HCLK, i.e. the AHB1 clock, i.e. the main CPU clock. Getting it wrong will corrupt the decoding of the ITM data stream. Getting it almost correct seems to be good enough: using 82MHz instead of 84MHz does not seem to corrupt the data. (Perhaps this is just because the printf data stream is slow in comparison?)
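One possible explanation for the 82MHz observation is plain UART tolerance rather than the data rate. The SWO bit rate is TRACECLK divided by a prescaler, and the debugger picks that prescaler from the clock value it is told. The sketch below works through the arithmetic. It assumes the probe samples SWO at 2MHz and that the divider is rounded to the nearest integer; both are assumptions for illustration, not something taken from the configuration above, and the function names are made up.

```cpp
#include <cassert>
#include <cstdint>

// Divider the debugger would pick for the clock it was told about,
// rounded to the nearest integer (assumption). The TPIU prescaler
// register holds this value minus one.
std::uint32_t swo_divider(std::uint32_t configured_traceclk_hz,
                          std::uint32_t swo_baud_hz)
{
    assert(swo_baud_hz > 0);
    return (configured_traceclk_hz + swo_baud_hz / 2) / swo_baud_hz;
}

// Relative error, in percent, between the bit rate the target really
// produces and the bit rate the probe expects.
double swo_baud_error_percent(std::uint32_t real_traceclk_hz,
                              std::uint32_t configured_traceclk_hz,
                              std::uint32_t swo_baud_hz)
{
    const std::uint32_t divider =
        swo_divider(configured_traceclk_hz, swo_baud_hz);
    assert(divider > 0);
    const double actual_baud =
        static_cast<double>(real_traceclk_hz) / divider;
    return 100.0 * (actual_baud - swo_baud_hz) / swo_baud_hz;
}
```

Under these assumptions, configuring 82MHz on an 84MHz part selects a divider of 41 instead of 42, so the bits arrive roughly 2.4% too fast. An asynchronous receiver typically tolerates an error of a few percent, which may be why the stream still decodes.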
Assuming the target device has already been flashed with the ITM-printf-enabled source code, run openocd -f <the above openocd conf>. This will dump the ITM data stream to the file itm.data.
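The itm.data file is a raw byte stream of ITM packets. To give a rough idea of what a decoder has to untangle, here is a minimal host-side sketch that picks out the stimulus port writes. It assumes the TPIU formatter is off (as in the config above) and that timestamps are disabled; bytes that belong to sync, overflow or timestamp packets are simply skipped one at a time, and hardware source packets are stepped over. ItmPacket and decode_itm are names made up for this example and are not part of any tool.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One decoded write to an ITM stimulus port.
struct ItmPacket {
    std::uint8_t port;    // stimulus port number, 0..31
    std::uint32_t value;  // payload, assembled little-endian
    std::uint8_t size;    // payload size in bytes: 1, 2 or 4
};

std::vector<ItmPacket> decode_itm(const std::vector<std::uint8_t>& raw)
{
    std::vector<ItmPacket> out;
    std::size_t i = 0;
    while (i < raw.size()) {
        const std::uint8_t header = raw[i++];
        const std::uint8_t size_code = header & 0x03;
        if (size_code == 0) {
            continue;  // sync, overflow, timestamp: not handled in this sketch
        }
        // Size codes 1, 2, 3 mean 1, 2, 4 payload bytes.
        const std::size_t size = (size_code == 3) ? 4 : size_code;
        assert(size == 1 || size == 2 || size == 4);
        if (i + size > raw.size()) {
            break;  // truncated packet at the end of the capture
        }
        if ((header & 0x04) == 0) {  // bit 2 clear: software (stimulus) packet
            std::uint32_t value = 0;
            for (std::size_t b = 0; b < size; ++b) {
                value |= static_cast<std::uint32_t>(raw[i + b]) << (8 * b);
            }
            out.push_back({static_cast<std::uint8_t>(header >> 3), value,
                           static_cast<std::uint8_t>(size)});
        }
        i += size;  // hardware source payloads are skipped, not decoded
    }
    return out;
}
```

With this framing, a single character written with ITM_STIM8(0) shows up in the file as the header byte 0x01 followed by the character itself.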
Orbuculum
After starting openocd, run:
$ orbuculum -b outputs/ -c 0,out,"%c" -f path/to/itm.data
- -b outputs/ gives the directory where the results are produced
- -c 0,out,"%c" configures ITM port 0 (same number as above) to be directed to outputs/out, and printed with the printf format “%c”, i.e. as text
- -f path/to/itm.data points to the raw ITM data stream generated by OpenOCD
Now outputs/out is a FIFO with the printfs from the target. See it with
$ cat outputs/out
Datadumping
printf() is nice, but slow. Since ITM is just a data pipe, data can be dumped directly without formatting. A problem here is that since it is raw data, the beginning and end of e.g. an array don’t show up. To mark those boundaries, use a second stream for synchronization:
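for (auto a: buffer) {
    // Wait for stimulus port 1 to be ready, then send one 16-bit sample
    while (ITM_STIM32(1) == 0);
    ITM_STIM16(1) = a;
}
// Mark the end of the buffer with a newline on stimulus port 2
while (ITM_STIM32(2) == 0);
ITM_STIM8(2) = '\n';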
Remember to enable the new streams, e.g. in the openocd.cfg:
...
itm port 0 on
itm port 1 on
itm port 2 on
...
Orbuculum runs the channel processing in separate threads, so mixing two channels into one output does not work. Instead of printing with orbuculum, use orbcat.
Start orbuculum; it automatically sends the ITM data to a TCP connection if a client is connected:
$ orbuculum -f path/to/itm.data
Then start orbcat (which automatically connects to that TCP socket in orbuculum):
$ ./ofiles/orbcat -c1,"%5d " -c2,"%c"
0 235 143 126 92 85 189 69
0 129 29 82 136 207 96 58
...
Here orbcat prints channel 1 as a 5-character-wide number (and a space), and channel 2 as characters, all mixed into one output. So the contents of buffer in the above example are printed on one line.
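If the records are consumed by a program rather than read by eye, the same two-channel idea can be used to split the sample stream back into buffers. The following is a small host-side sketch of that, assuming the stimulus writes have already been decoded into (port, value) pairs in arrival order. StimulusWrite and split_records are hypothetical names, and the port numbers default to the ones used in the target code above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One decoded stimulus port write, in the order it arrived.
struct StimulusWrite {
    std::uint8_t port;
    std::uint32_t value;
};

// Group the samples written to data_port into records; any write to
// sync_port closes the current record.
std::vector<std::vector<std::uint16_t>> split_records(
    const std::vector<StimulusWrite>& writes,
    std::uint8_t data_port = 1,
    std::uint8_t sync_port = 2)
{
    assert(data_port != sync_port);
    std::vector<std::vector<std::uint16_t>> records;
    std::vector<std::uint16_t> current;
    for (const StimulusWrite& w : writes) {
        if (w.port == data_port) {
            current.push_back(static_cast<std::uint16_t>(w.value));
        } else if (w.port == sync_port) {
            records.push_back(current);
            current.clear();
        }
        // writes to other ports (e.g. printf on port 0) are ignored
    }
    // Samples after the last marker belong to an unfinished record
    // and are dropped.
    return records;
}
```

Dropping the unfinished tail is a design choice for this sketch: a capture that starts or stops in the middle of a buffer then never yields a short record that looks complete.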
Memory tracing
Once the above is configured correctly, memory can be traced with the DWT (Data Watchpoint and Trace unit).
Figure out the memory address of the variable to trace, and add it to the openocd.cfg. Reminder: don’t take the address of a variable on the stack and trace that. There will be noise in the output :) Unless, of course, the variable is on the stack of main().
...
mww 0xE0001020 0x2001ffdc
mww 0xE0001028 0x0000000d
init
reset run
This writes DWT_COMP0 with the memory address to watch, and DWT_FUNCTION0 with instructions to create a hardware event whenever that address is written to.
(More info on where the DWT registers are in memory is in the Cortex-M4 Technical Reference Manual. More info on the contents of the DWT registers is in the ARMv7-M Architecture Reference Manual.)
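The two addresses in the mww lines follow a regular layout: the DWT block starts at 0xE0001000, the first comparator's registers start at offset 0x20, and each further comparator sits 0x10 bytes later. A small sketch of that address arithmetic, handy when tracing more than one variable, could look like this. DwtComparatorRegs and dwt_comparator_regs are made-up names, and the limit of four comparators is an assumption about the Cortex-M4 implementation.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kDwtBase = 0xE0001000u;

// Addresses of the three registers that belong to one DWT comparator.
struct DwtComparatorRegs {
    std::uint32_t comp;      // DWT_COMPn: address to watch
    std::uint32_t mask;      // DWT_MASKn: number of low address bits to ignore
    std::uint32_t function;  // DWT_FUNCTIONn: what to do on a match
};

DwtComparatorRegs dwt_comparator_regs(unsigned n)
{
    assert(n < 4);  // assumption: at most four comparators are implemented
    const std::uint32_t base = kDwtBase + 0x20u + 0x10u * n;
    return {base, base + 0x4u, base + 0x8u};
}
```

For comparator 0 this gives 0xE0001020 and 0xE0001028, the addresses used above. A second traced variable would use comparator 1, i.e. 0xE0001030 for the address and 0xE0001038 for the function value.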
Orbuculum will dump this info into a FIFO named hwevent, if the -b option (i.e. the directory to create the FIFOs in) is given.
Notes
Much more can be done with orbuculum: the links above include examples of loading an execution trace into kcachegrind, or plotting with graphviz. Or orbtop for a top-like monitor of what is executing on the target.
Also, orbuculum seems to be under heavy development. I guess there will be nice new features available next time I read this :)
All the registers can be written via OpenOCD, on the target, or via gdb. Check out the gdb macros that come with the orbuculum sources!