Multi-Core -- A New Challenge for Debugging

by Jens Braunes, pls Development Tools, TechOnline India - October 14, 2008

The development of complex systems, with powerful hardware on one side and ambitious applications on the other, benefits from system-spanning on-chip support for debugging.

A customer story
In a huge software project for an embedded application, a function behaved strangely. A variable that must not change while the function executes was changed. The function itself did not write the variable; the illegal write came from another process, an interrupt service routine, or the DMA controller. To identify the culprit, the source code of the function was instrumented using a costly monitoring approach. Monitor code was inserted every few lines and near branches to find the point in time at which the variable was overwritten. The actual challenge, however, was to check all system activities at or right before the illegal write access to find the writing process. This is not a trivial task in a complex system with several cores, busses, and bus masters. It took one week to identify the problem: an incorrectly configured DMA controller. Other projects at the same company, based on a different hardware platform, use an on-chip multi-core debug system that provides visibility into cores and bus transactions. "If we had a hardware debug system available for our project, the problem would have been isolated within minutes," the customer told us.

What have we learned from this story? The development of complex systems, with powerful hardware on one side and ambitious applications on the other, benefits from system-spanning on-chip support for debugging. In complex SoCs, observing and controlling a single core is insufficient. Rather, the interactions of the multiple cores, busses, and peripherals comprising the SoC must be visible when you want to detect, trace, and eliminate software problems or profile system behavior for performance optimization.
Before determining an on-chip debug system suitable for multi-core systems, it is important to understand the user requirements.

Must each core and bus be observable? Being able to see or reconstruct the program flow of each core independently, as well as the data flow on the system busses, is critical to drawing conclusions about optimizations, interactions, bus accesses, and performance in other modes of operation.

Is it crucial for system analysis to recognize events that arise from interactions between the cores and busses? Observing a single core is often insufficient; events coming from several cores have to be considered together. To address this challenge, cross triggers must be used, which combine events from different sources and make them available system-wide.

What level of complexity (in the interactions between all SoC components) is needed during debug? A debug system with complex cross triggering can be difficult for a user to manage. The debugger, as the user interface to the complex debug hardware, must support users in their work of finding mistakes or performance bottlenecks. It has to hide the complexity that comes with multi-core debugging. We must not forget that the user's task is not to cope with the debug hardware itself but with the faulty system.

A debug system designed for these SoC debug requirements was developed by Infineon for its TriCore microcontroller family and is being extended for compatibility with the new OCP-IP debug specification. This article discusses the Infineon Multi-Core Debug Solution (MCDS) as an example of integrating multi-core debug functionality into OCP-compliant SoCs and shows how useful on-chip debug hardware is in the daily business of system developers.

On-chip Multi-Core Debug Goes OCP
The OCP-IP Debug Working Group (DWG) was formed in 2005 to address the definition of debug resources and their integration, in order to enable comprehensive debug of OCP-based systems. In January 2008 the Open Core Protocol Debug Specification [1] was released as the result of the DWG's work. The specification does not aim to define debug blocks for each core or bus. Instead, it is a guideline of debug concepts and signal models that address simple to complex debug of OCP-based systems. It is up to IP vendors to follow these guidelines and build their own debug IP blocks. One example of a complete on-chip debug solution for multi-core systems is MCDS. Before we elaborate on it, however, the OCP-IP debug socket is outlined in greater detail.

The OCP debug socket defines an optional OCP port, providing a debug interface, which can be added to cores and IP blocks that support or need debug access. The debug interface provides sets of signals for basic debug capabilities. These basic signals can be divided into four groups:

  1. Debug control: Defines independent reset and debug enable signals.
  2. JTAG interface: Defines signals for JTAG protocol.
  3. Debugger interface: Defines a set of debug interfaces that address system level debug of run control and debugger tools interfaces.
  4. Cross-trigger interface: Defines signals for distribution of debug events and for system level control in a multi-core SoC.
In addition to these signals, more extensive debug interfaces may be implemented for specific IP cores. Specific functionality designed into the target system can support specific system requirements such as power islands or secure subsystems, or provide more advanced debug features using analysis-based IP blocks for performance monitoring, time stamping, etc.

To address multi-core debug control requirements, the OCP debug socket defines a cross-triggering socket interface that can be implemented by all OCP debug blocks belonging to the debug environment of a multi-core SoC. The distribution of cross triggers between all debug blocks is managed by a dedicated cross-trigger block. A cross-trigger manager implemented within this block monitors and compares conditions to generate real-time triggers. These triggers can then be used to control event actions such as configuration, breakpoints, and trace collection at specific points in the system. More complex implementations can be programmed to trigger on specific values or sequences, such as address regions and data read or write cycle types.
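
The cross-trigger idea can be illustrated with a small software model. This is only a sketch under stated assumptions: the class CrossTriggerBlock, its methods, and all event and trigger names are hypothetical and are not taken from the OCP debug specification, which defines signals rather than a software API. The model fires a trigger as soon as all events required for it are active and then notifies every subscribed debug block.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Set, Tuple

# (debug block name, event name); all names used here are hypothetical.
Event = Tuple[str, str]


class CrossTriggerBlock:
    """Toy model of a cross-trigger block, not an implementation of the spec."""

    def __init__(self) -> None:
        self._conditions: Dict[str, FrozenSet[Event]] = {}
        self._listeners: Dict[str, List[Callable[[str], None]]] = {}
        self._active: Set[Event] = set()

    def define_trigger(self, name: str, required: Iterable[Event]) -> None:
        # The trigger fires when ALL required events are active at once.
        self._conditions[name] = frozenset(required)
        self._listeners.setdefault(name, [])

    def subscribe(self, name: str, action: Callable[[str], None]) -> None:
        # A debug block registers the action it performs on this trigger.
        self._listeners.setdefault(name, []).append(action)

    def report(self, event: Event, active: bool = True) -> List[str]:
        # A debug block reports that one of its events changed state.
        if active:
            self._active.add(event)
        else:
            self._active.discard(event)
        fired = [name for name, required in self._conditions.items()
                 if required <= self._active]
        for name in fired:
            for action in self._listeners[name]:
                action(name)  # distribute the trigger system-wide
        return fired


# Usage: halt both cores when a watched address is written during a function.
halted: List[str] = []
xtrig = CrossTriggerBlock()
xtrig.define_trigger("halt_all", [("core0", "inside_function"),
                                  ("bus", "write_to_watched_address")])
xtrig.subscribe("halt_all", lambda trigger: halted.append("core0"))
xtrig.subscribe("halt_all", lambda trigger: halted.append("core1"))
first = xtrig.report(("core0", "inside_function"))
second = xtrig.report(("bus", "write_to_watched_address"))
```

In real hardware the conditions are evaluated in parallel and in real time; the sequential loop here only mirrors the logical behavior.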

The OCP debug specification does not define external debug interfaces between an OCP on-chip debug environment and external components (probes, debuggers, etc.). These interfaces are addressed by other industry working groups (JTAG, NEXUS, MIPI, etc.).

Multi-Core Debug Solution (MCDS)
MCDS consists of configurable IP building blocks, which provide trace compression, trace qualification, time stamping, and complex cross-target triggering. It also enables measurement of several performance indicators in parallel with time-stamped trace results.

Figure 1 shows the MCDS subsystem, consisting of the MCDS kernel and on-chip trace memory (TMEM). In this example, communication between the on-chip debug environment and the debug tool is established via JTAG. It is also possible to use other interfaces, such as a NEXUS auxiliary port [2]. Each debug target (processor core, bus) is connected to the MCDS through an adaptation logic block. The design of such a block may be target specific. Each block adapts the target's custom interface to a generic standardized interface that is used by MCDS. It also synchronizes signals from the target side to the clock domain of the MCDS if the two are in different clock domains.

Figure 1. Block diagram of the MCDS subsystem.

The specific architecture of the MCDS kernel depends on the number and type of debug targets. It consists of Observation Blocks (OB), a Multi Core Cross Connect (MCX), and a Debug Memory Controller (DMC). The MCX is connected to all OBs and the DMC. It is responsible for distributing programmable cross triggers and provides a central time stamp for all trace messages. This way, all trace messages from different sources can be written to the trace stream time-aligned and in the correct chronological order. Additionally, the MCX provides a number of counters, which can be used to count events and trigger an action after an event has occurred n times or a certain time period has elapsed. Altogether, the MCX provides the functionality to observe a system with multiple processor cores, where interactions between them take place and complex conditions have to be evaluated to recognize a certain event.
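
The two MCX functions described above, time-aligned trace merging and event counting, can be sketched in a few lines. This is a simplified software model, not the MCDS hardware: the message layout (timestamp, source, payload), the function merge_trace_streams, and the class EventCounter are assumptions made for illustration. Each per-source stream is assumed to be sorted already by the shared central time stamp.

```python
import heapq
from typing import Iterable, List, Tuple

# (central time stamp, source name, payload); this layout is hypothetical.
TraceMessage = Tuple[int, str, str]


def merge_trace_streams(*streams: Iterable[TraceMessage]) -> List[TraceMessage]:
    """Merge per-source streams into one chronologically ordered stream.

    Every input stream must already be sorted by time stamp, which holds
    when all sources use the same central time base.
    """
    return list(heapq.merge(*streams, key=lambda message: message[0]))


class EventCounter:
    """Signals a trigger each time an event has occurred n times."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.count = 0

    def event(self) -> bool:
        self.count += 1
        if self.count == self.n:
            self.count = 0  # restart counting for the next trigger
            return True
        return False


# Usage with two hypothetical sources, a core and a bus observer.
core_stream = [(10, "TC", "branch"), (30, "TC", "call")]
bus_stream = [(20, "SPB", "write")]
merged = merge_trace_streams(core_stream, bus_stream)

counter = EventCounter(n=3)
fired = [counter.event() for _ in range(4)]
```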

Each target signal within the SoC is connected to its dedicated Observation Block (OB). Within this block, trace qualification and trace message generation take place. Each Observation Block may contain several custom Trace Units of different types. The number and types of these Trace Units depend on the debug target and on the information that is to be derived.

To start and stop trace recording, generate cross triggers, and control the targets, a trace qualification logic is implemented as shown in Figure 2. The AND/OR matrix scheme applies to all trace qualification blocks contained in the OBs and in the MCX. The trigger input values can be evaluated directly or negated. In addition, edge or level sensitivity can be selected.

Figure 2. Trace qualification.
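
The AND/OR matrix scheme can be modeled as an OR over several AND terms, where each term selects trigger inputs either directly or negated. The following sketch assumes this sum-of-products reading of the scheme; the function evaluate_matrix, the class RisingEdge, and the input names are hypothetical and do not reflect the actual MCDS register layout.

```python
from typing import Dict, List, Tuple

# One AND term: (input name, negate) pairs that must all be satisfied.
AndTerm = List[Tuple[str, bool]]


def evaluate_matrix(inputs: Dict[str, bool], and_terms: List[AndTerm]) -> bool:
    """Return True if at least one AND term is satisfied (OR of ANDs)."""
    return any(
        # An input satisfies its slot when its value differs from 'negate'.
        all(inputs[name] != negate for name, negate in term)
        for term in and_terms
    )


class RisingEdge:
    """Turns a level-sensitive input into an edge-sensitive one."""

    def __init__(self) -> None:
        self.previous = False

    def __call__(self, level: bool) -> bool:
        edge = level and not self.previous
        self.previous = level
        return edge


# Qualify if (inside function AND write hit) OR (external trigger AND NOT inside function).
terms: List[AndTerm] = [
    [("inside_function", False), ("write_hit", False)],
    [("external_trigger", False), ("inside_function", True)],
]
case_a = evaluate_matrix(
    {"inside_function": True, "write_hit": True, "external_trigger": False}, terms)
case_b = evaluate_matrix(
    {"inside_function": True, "write_hit": False, "external_trigger": True}, terms)

edge = RisingEdge()
edges = [edge(level) for level in (False, True, True, False, True)]
```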

From the perspective of the debug tool the MCDS has to be programmed for a certain debug / trace task by writing configuration information into a set of memory mapped registers. These registers control the AND/OR matrices of each OB, the DMC and a number of trigger sources (e.g. address/data comparators).

Engineers' Daily Work
Multi-core SoC debug environments may become very complex when a high degree of observability is needed. Debug blocks for each core or bus can be programmed to monitor certain conditions or events (executed instructions within address ranges, bus transactions initiated by a certain bus master, etc.). Cross triggers have to be programmed to distribute events between debug blocks in order to monitor the interaction of cores or busses. The more debug blocks are involved and the more system parameters they can observe, the more complex the programming of the debug environment becomes. Therefore it is crucial to have powerful software support to deal with it.

Returning to the debug example in the introduction, the on-chip debug environment provided by MCDS makes it easy to find the culprit routine that incorrectly writes to the variable. What follows is a step-by-step illustration of an engineer's debug procedure. The example assumes the following setup.

The target is a dual-core SoC. One core is called TC and the other is called PCP. The PCP core is responsible for all interrupt service routines, whereas the TC core performs the main tasks. Both cores share the memory and transfer data via the SPB bus.

The debug environment for the dual-core SoC implements an MCDS with four debug blocks. Two blocks observe the cores. Another block monitors the activities on the SPB bus. The fourth block, the MCX, distributes all cross triggers between the three observing blocks.

The debug tool used to configure and control debug and analysis tasks is the Universal Emulation Configurator (UEC) by pls Development Tools [3]. UEC was designed to support the complex programming of MCDS and can be adapted for other on-chip trace and debug environments.

Figure 3. UEC graphical trace task configuration.

Figure 3 shows the UEC graphical editor, which is used to configure the trace task for the MCDS. Predefined configuration blocks can be dragged into the editor from a library, enabling a complete trace task to be assembled. As shown, some of these blocks define signals that become active when a certain condition becomes true. This example uses signals for the entry address (Enter) and exit address (Leave) of the function, as well as for detecting the write access to the variable loopControl.

Other blocks help to assemble state machines (level_0 and level_1 are the two states in this example) and to trigger actions for trace control (record executed instructions of both cores, record data transferred via the SPB to the memory location of loopControl) as well as for run control. In the latter case, the complete system (all cores) or only a single core can be halted depending on a single event or a more complex triggering condition (not used in this example). The action trigger trace sets a marker in the recorded trace stream. This marker identifies the exact point in time within the trace stream at which the forbidden write access occurs. Further analysis of the recorded instructions and their corresponding core information around this marker reveals the specific error conditions and transactions. An extract of the recorded trace data with the critical part is shown in Figure 4.

Figure 4. Recorded trace data: (1) the forbidden write access caused by PCP code, (2) the PCP code that caused the forbidden write.
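
The logic of this trace task can be approximated in software to show what the hardware evaluates. The sketch below rests on assumptions: the exact transitions configured in UEC are not spelled out in the text, so it is assumed that level_0 means "outside the function", that Enter switches to level_1, that Leave switches back, and that a write to loopControl during level_1 sets a marker. The function find_forbidden_writes and the signal name WriteLoopControl are hypothetical.

```python
from typing import Iterable, List, Tuple

# (source that caused the event, signal name); names are hypothetical.
TraceEvent = Tuple[str, str]


def find_forbidden_writes(events: Iterable[TraceEvent]) -> List[Tuple[int, str]]:
    """Return (position, source) markers for writes seen inside the function."""
    state = "level_0"  # assumed meaning: the observed function is not running
    markers: List[Tuple[int, str]] = []
    for position, (source, signal) in enumerate(events):
        if state == "level_0":
            if signal == "Enter":
                state = "level_1"  # function entered, writes are now forbidden
        else:
            if signal == "Leave":
                state = "level_0"
            elif signal == "WriteLoopControl":
                markers.append((position, source))  # the "trigger trace" action
    return markers


# Usage: only the write between Enter and Leave is marked.
recorded = [
    ("PCP", "WriteLoopControl"),  # allowed, function not active
    ("TC", "Enter"),
    ("PCP", "WriteLoopControl"),  # forbidden, function is executing
    ("TC", "Leave"),
    ("PCP", "WriteLoopControl"),  # allowed again
]
markers = find_forbidden_writes(recorded)
```

Unlike this offline loop, the on-chip state machine evaluates the signals while the system runs at full speed, so the application code needs no instrumentation.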

We have seen that multi-core systems, with the complex interactions between their processor cores, are a big challenge for system debugging. This article has discussed standardized on-chip debug environments and the sophisticated tool support that is crucial to cope with such challenges. Together they are the key to mastering the complexity of finding bugs in multi-core systems.


References
  1. Open Core Protocol International Partnership: Open Core Protocol Debug Specification 1.0.
  2. Nexus 5001 Forum: The Nexus 5001 Forum Standard for a Global Embedded Processor Debug Interface.
  3. Jens Braunes: Easy Configuration of Complex On-Chip Emulators. In ECE Magazine, March 2006, pp. 26-28.
About the Author:
Jens Braunes is a software architect at pls Development Tools. Mr. Braunes received his diploma in computer science from Dresden University of Technology in Germany. Information about the company can be found at

