    The keys to success in multicore application development
    Embedded.com
    Multicore processors are becoming ubiquitous in embedded processing. But as these processors grow more complex, the application developer needs to understand many important architectural details in order to partition applications properly across multiple processing elements.

    These processing elements can include multiple heterogeneous or homogeneous CPUs as well as function acceleration blocks and complex peripheral subsystems. Multicore processors are complex systems, and adopting them successfully requires the following:

    1. System configuration and partitioning, to achieve the best overall performance of the application
    2. System virtualization, to abstract the complexity from the developer and provide flexibility in the solution model
    3. System visualization, to understand the system performance and profile as data flows through the cores, accelerators, peripherals, and communication interconnect.

    Multicore in Networking Applications
    As an example, we will consider the networking space where multicore processing is growing. In the embedded networking area, a network processor is a processor which has a feature set specifically targeted at the networking application domain. These processors are software programmable devices because they are used in many different domains, including:

    * Routers and switches
    * Firewalls
    * Intrusion detection devices
    * Intrusion prevention devices
    * Network monitoring systems.

    Networking applications require both control and data plane processing (Figure 1 below). Data plane processing consists of both ingress and egress processing. Ingress processing requires high performance because packets vary in length and protocol, and every packet must be parsed, classified, checked for denial-of-service attacks and other security threats, and possibly edited or modified in various ways.

    All this must be done at line rate (the data rate of the raw bit stream of a communication link), so performance is key to ingress processing. Egress processing is easier and mainly consists of traffic management functions.
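
To get a feel for what line-rate ingress processing implies, the following minimal sketch computes the worst-case packet rate and the time available per packet. It assumes an Ethernet link, which the article does not specify: each frame is taken to occupy an extra 20 bytes on the wire (8 bytes of preamble and start delimiter plus a 12-byte inter-frame gap). The function names and the 10 Gb/s link speed are illustrative only.

```python
# Assumption: Ethernet framing. Every frame on the wire is accompanied by
# 8 bytes of preamble/start-of-frame delimiter and a 12-byte inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 8 + 12


def packets_per_second(link_bps: float, frame_bytes: int) -> float:
    """Maximum number of frames per second a link can deliver."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def per_packet_budget_ns(link_bps: float, frame_bytes: int) -> float:
    """Time in nanoseconds between back-to-back frame arrivals."""
    return 1e9 / packets_per_second(link_bps, frame_bytes)


if __name__ == "__main__":
    link = 10e9  # illustrative 10 Gb/s link
    for size in (64, 512, 1518):
        rate = packets_per_second(link, size)
        budget = per_packet_budget_ns(link, size)
        print(f"{size:5d}-byte frames: {rate:14.0f} pkt/s, {budget:8.1f} ns each")
```

Under these assumptions, the smallest frames set the budget: parsing, classification, and security checks for one packet must complete, or be pipelined across several processing elements, within the inter-arrival time of minimum-size frames.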

    Control plane processing manages the state of the network elements, including route selection, capability signaling, etc. It can be performed with standard RISC-based processing elements, and CPU MIPS are the key focus in this area.

    Figure 1. The Network Processing domain consists of data plane and control plane requirements

    So the key question is how to translate this user domain into the device domain. In other words, how do we partition the application onto a device so that it meets both the functional and the non-functional (e.g. performance and QoS) requirements?

    System configuration and partitioning
    Embedded systems are designed for efficiency in performance, power and memory. In many embedded systems, the most significant computational requirements are driven by a relatively small number of algorithms, which can be identified using common profiling techniques.

    These algorithms can then be optimized using software techniques or converted to hardware acceleration using design automation tools. The "accelerators" can then be efficiently interfaced to the offloaded processor, significantly increasing overall system performance.
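
How much the overall system gains from accelerating such a hot spot can be estimated with Amdahl's law. The sketch below is not from the article; the function name and the example figures (a hot spot taking 80% of run time, accelerated 10x) are illustrative assumptions.

```python
def overall_speedup(hot_fraction: float, accel_factor: float) -> float:
    """Amdahl's law: system speedup when a fraction of the run time
    (hot_fraction) is made accel_factor times faster."""
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    if accel_factor <= 0.0:
        raise ValueError("accel_factor must be positive")
    # The untouched part still takes (1 - hot_fraction) of the original time.
    return 1.0 / ((1.0 - hot_fraction) + hot_fraction / accel_factor)


if __name__ == "__main__":
    # Illustrative numbers: profiling shows one algorithm uses 80% of the
    # cycles, and a hardware accelerator runs it 10 times faster.
    print(f"{overall_speedup(0.8, 10.0):.2f}x overall")
```

The remaining 20% of the work caps the gain at 5x no matter how fast the accelerator is, which is why profiling to find the dominant algorithms comes first.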

    Figure 2 below is an example of an embedded processing system using these techniques. This device has eight e500 Power Architecture processing cores and acceleration blocks used to manage several important system functions including pattern matching, encryption, buffer management, queue management, and frame management.

    If you map this back to the control and data plane processing requirements mentioned earlier, the partitioning becomes clearer. The data plane processing requirements can be mapped to the acceleration blocks and some of the CPU cores, and the control plane requirements can be handled by the remaining e500 cores. Since there are eight cores, they can be partitioned as necessary to perform a combination of data plane and control plane processing.

    But a number of complicating factors make this easier said than done. For example, it may be necessary to run a lightweight OS such as an RTOS on the data plane cores because of performance and QoS requirements.

    A heavyweight OS such as Linux may be required to handle the complicated processing on the control plane. There may even be a requirement to run two or more operating systems on a single core if we want to preserve an earlier system configuration and quickly migrate this legacy system to a multicore processor.

    Figure 2. Multicore processor with 8 processing elements and acceleration blocks
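
The partitioning described above can be written down as a simple static plan. The following sketch is purely illustrative: the type names, the `partition_cores` helper, and the 6/2 split between data plane and control plane cores are hypothetical and do not correspond to any vendor configuration format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Plane(Enum):
    DATA = "data"
    CONTROL = "control"


@dataclass(frozen=True)
class CoreAssignment:
    core_id: int
    plane: Plane
    os_name: str  # OS image that runs on this core


def partition_cores(total_cores: int, data_plane_cores: int,
                    data_os: str = "RTOS",
                    control_os: str = "Linux") -> List[CoreAssignment]:
    """Assign the first data_plane_cores cores to the data plane and
    the remaining cores to the control plane."""
    if not 0 <= data_plane_cores <= total_cores:
        raise ValueError("data_plane_cores must be within 0..total_cores")
    plan = []
    for core_id in range(total_cores):
        if core_id < data_plane_cores:
            plan.append(CoreAssignment(core_id, Plane.DATA, data_os))
        else:
            plan.append(CoreAssignment(core_id, Plane.CONTROL, control_os))
    return plan


if __name__ == "__main__":
    # Hypothetical split for an eight-core device: six data plane cores
    # running an RTOS, two control plane cores running Linux.
    for assignment in partition_cores(8, 6):
        print(assignment)
```

In practice the right split depends on measured traffic and control load, so a plan like this is a starting point to be revised after profiling.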

    Embedded processing has been adopting various forms of parallelism for many years. Bit-level parallelism has, of course, been addressed using larger and larger word sizes: 8-bit, 16-bit, 32-bit, 64-bit, etc.

    Instruction level parallelism (ILP) has been addressed by adding more execution units and then using the compiler to do the hard work of managing the data dependencies and scheduling the parallel instructions on the execution units.

    Data parallelism has been addressed using technologies like Single Instruction Multiple Data (SIMD), which is implemented with hardware support together with software libraries and compiler technology.
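
The idea behind data parallelism can be illustrated without any special hardware. The sketch below packs four 8-bit values into one 32-bit word and adds all four lanes with a single sequence of word-wide operations, a technique often called "SIMD within a register". It is a software emulation written for illustration, not how a hardware SIMD unit or any particular library is implemented; all names are hypothetical, and inputs are assumed to be non-negative 32-bit values.

```python
LANE_LOW_BITS = 0x7F7F7F7F   # low 7 bits of each 8-bit lane
LANE_HIGH_BITS = 0x80808080  # top bit of each 8-bit lane


def add_bytes_packed(a: int, b: int) -> int:
    """Add four 8-bit lanes at once; each lane wraps around modulo 256."""
    # Adding only the low 7 bits means no carry can cross into the next lane.
    low_sum = (a & LANE_LOW_BITS) + (b & LANE_LOW_BITS)
    # Restore each lane's top bit: a7 XOR b7 XOR carry-in from the low bits.
    return low_sum ^ ((a ^ b) & LANE_HIGH_BITS)


def add_bytes_scalar(a: int, b: int) -> int:
    """Reference version: process the four lanes one at a time."""
    result = 0
    for shift in (0, 8, 16, 24):
        lane = (((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)) & 0xFF
        result |= lane << shift
    return result


if __name__ == "__main__":
    x, y = 0x01FF7F10, 0x0101017F
    print(hex(add_bytes_packed(x, y)), hex(add_bytes_scalar(x, y)))
```

The packed version performs one operation sequence for all four values, while the scalar version repeats the work per lane; hardware SIMD units apply the same principle with wider registers and dedicated instructions.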
