FPGA-based rapid prototyping of ASIC, ASSP, and SoC designs

by Juergen Jaeger, Synopsys Inc., TechOnline India - October 21, 2009

There are a variety of verification options available to engineers, each with its own advantages and disadvantages. In order to achieve high simulation speeds, it is necessary to use some form of hardware-assisted verification. This article discusses techniques and considerations for using FPGA-based prototypes, which provide very high speed with low cost.

ASIC designs continue to increase in size, complexity, and cost (for the purpose of these discussions, the term ASIC is assumed to encompass ASSP and SoC devices). At the same time, aggressive competition makes today's electronics markets extremely sensitive to time-to-market pressures. Furthermore, market windows are continually narrowing; in the case of consumer markets, for example, a "typical" ASIC design cycle is in the order of 9 to 18 months, while the window of opportunity for the introduction of a product using this device can be as little as 2 to 4 months.

Failing to have a product available at the beginning of the intended market window may result in significantly reduced revenue (or a complete loss of revenue and investment if the window is missed in its entirety). These factors have dramatically increased the pressure for ASIC designs to be "right-first-time" with no re-spins. In turn, this has driven the demand for fast, efficient, and cost-effective verification at both the chip and system levels.

Alternative Verification Technologies
There are a variety of verification options available to engineers, including software simulation, hardware simulation acceleration, hardware emulation, and FPGA-based prototypes. Each approach has its advantages and disadvantages.

Software simulators have the advantages of being relatively inexpensive and of providing very high visibility into the design. The disadvantage is that they are very slow when simulating a large ASIC design. Even when running on a high-end workstation, it is possible to achieve equivalent simulation speeds of only a few Hz (that is, a few cycles of the main system clock for each second of real time). Thus, the software simulation of a large ASIC could take days, weeks, or (potentially) months. In practice, this means that detailed software simulations can be performed on only small portions of the design. It also makes software development and hardware-software co-verification impractical on a simulator. However, if a "window of interest" (a problem occurring around a specific time) can be identified by some other means, the software simulator can be used to perform a detailed analysis of the entire design around this temporal window.

In order to achieve high simulation speeds, it is necessary to use some form of hardware-assisted verification (HAV), of which there are three distinct categories as follows:

  • Acceleration: Hardware-based acceleration solutions typically involve arrays of special-purpose processor chips or FPGAs. A key consideration with regard to this form of acceleration is that it is simply geared to speeding the simulation of the ASIC in isolation; that is, this form of verification does not verify the device in the context of the system. Another concern is that such an accelerator can be very expensive; and this problem is exacerbated by the fact that each unit can be accessed by only one (or very few) developers at a time. Furthermore, the achievable acceleration is a function of the ratio between test bench activity and DUT (design under test) activity and typically provides an increase of only 2x to 10x over software simulation.

  • Emulation: Hardware-based emulation solutions also typically involve arrays of special-purpose processor chips or FPGAs. The advantage of emulation (as compared to acceleration) is that these representations are integrated into the system-level environment. The disadvantage is that they can achieve simulation speeds in the order of only 1 MHz, which is at least three orders of magnitude slower than the actual ASIC hardware, and which is simply not sufficient for many verification environments. And, once again, these units can be very expensive (millions of dollars per seat) and can be accessed by only one (or very few) developers at a time.

  • FPGA-based Prototypes: In many cases, it is necessary to verify the design "at-speed." In the case of a video processing chip, for example, part of the verification may involve evaluating the subjective quality of the video output stream. The solution is to create a hardware prototype of the ASIC design using one or more FPGAs. One important benefit of this approach is the ability to run external interfaces at full speed. As a functionally equivalent version of the ASIC, FPGA-based prototypes enable both chip and system-level testing. In addition to providing real-time simulation speeds in the order of 10 MHz to 100 MHz, such prototypes are relatively inexpensive, thereby allowing them to be provided to multiple developers and also to be deployed to multiple development sites. Due to their superior performance and affordability, FPGA-based rapid prototypes are ideal as pre-silicon software development platforms. The main problem with conventional FPGA-based prototypes is lack of visibility into the design; this issue is addressed by the Confirma platform, which is discussed later in this paper.
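To put the speed figures quoted above in perspective, the following back-of-the-envelope sketch converts them into wall-clock time for a single test. The workload of 10 million clock cycles and the value of 5 Hz chosen for "a few Hz" are illustrative assumptions, not measurements.

```python
# Clock rates in Hz, taken from the ranges quoted in the text.
# 5 Hz for software simulation is an assumed stand-in for "a few Hz".
RATES_HZ = {
    "software simulation": 5.0,
    "acceleration (10x)": 50.0,
    "emulation": 1e6,
    "FPGA prototype (low end)": 10e6,
    "FPGA prototype (high end)": 100e6,
}

WORKLOAD_CYCLES = 10_000_000  # hypothetical test length in design clock cycles


def wall_clock_seconds(cycles, rate_hz):
    """Real time needed to execute `cycles` design clocks at `rate_hz`."""
    return cycles / rate_hz


def humanize(seconds):
    """Render a duration in the largest unit that fits."""
    for unit, size in (("days", 86400), ("hours", 3600), ("minutes", 60)):
        if seconds >= size:
            return f"{seconds / size:.1f} {unit}"
    return f"{seconds:.2f} seconds"


for platform, rate in RATES_HZ.items():
    elapsed = wall_clock_seconds(WORKLOAD_CYCLES, rate)
    print(f"{platform:28s} {humanize(elapsed)}")
```

Under these assumptions the same test takes weeks in software simulation, seconds on an emulator, and a fraction of a second on an FPGA-based prototype.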

    For the purposes of these discussions, we will concentrate on FPGA-based prototypes, which provide very high speed with low cost.

    Multi-FPGA Implementation
    The verification of larger ASIC designs requires the use of multiple FPGAs. There are a number of considerations that have to be taken into account when it comes to taking the RTL intended for an ASIC implementation and partitioning it across multiple FPGAs. For example, ASIC-centric constructs such as gated clocks have to be translated into their FPGA equivalents; ASIC memories have to be converted into FPGA and/or on-board memories; and so forth.
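As an illustration of the gated-clock translation mentioned above, the following behavioral sketch compares a register driven by a gated clock (the ASIC style) with one driven by a free-running clock and a clock enable (the usual FPGA style). It is a Python model rather than HDL, the function names are hypothetical, and it assumes the enable is stable for the whole clock cycle, as a latch-based clock gate would guarantee.

```python
def run_gated_clock(enables, data):
    """ASIC style: the register is clocked by (clk AND enable)."""
    q, trace = 0, []
    prev_gclk = 0
    for en, d in zip(enables, data):
        for clk in (0, 1):              # low phase, then high phase of one cycle
            gclk = clk & en             # the gated clock net
            if gclk and not prev_gclk:  # rising edge of the gated clock
                q = d
            prev_gclk = gclk
        trace.append(q)
    return trace


def run_clock_enable(enables, data):
    """FPGA style: free-running clock, enable drives the flip-flop enable pin."""
    q, trace = 0, []
    for en, d in zip(enables, data):
        if en:      # the clock edge occurs every cycle; the enable decides capture
            q = d
        trace.append(q)
    return trace


enables = [1, 0, 1, 1, 0]
data = [3, 7, 9, 4, 8]
print(run_gated_clock(enables, data))
print(run_clock_enable(enables, data))
```

Both models produce the same register contents cycle by cycle, which is the property an automatic conversion has to preserve.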

    Another consideration is that it may be necessary to replicate portions of logic in order to overcome Input/Output (I/O) limitations or to achieve performance goals. Implementing these tasks by hand is resource-intensive, time-consuming, and prone to error. Furthermore, it results in two separate code streams that can lose synchronization, thereby resulting in functional differences between the FPGA-based prototype and the ASIC it is intended to represent.

    In order to address these issues, it is necessary to be able to take an existing ASIC design and auto-interactively partition it across multiple FPGAs (Figure 1). Such a tool must be capable of working with Verilog, SystemVerilog, VHDL, and mixed-language designs. Also, the tool must automatically convert any ASIC-specific constructs (including any DesignWare instantiations) into equivalent FPGA structures.

    Figure 1. A tool is required to auto-interactively partition the design.
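The partitioning problem described above can be sketched as a capacity-constrained assignment followed by a check of the inter-FPGA signal count. The block names, sizes, net widths, and the first-fit-decreasing heuristic are all illustrative assumptions; this is not the algorithm used by any particular tool.

```python
def partition(blocks, capacity):
    """First-fit-decreasing: place the largest blocks first.

    blocks   -- dict of block name -> size (for example, LUT count)
    capacity -- usable size of one FPGA
    Returns a dict of block name -> FPGA index.
    """
    used = []         # resources consumed on each FPGA so far
    assignment = {}
    for name, size in sorted(blocks.items(), key=lambda kv: (-kv[1], kv[0])):
        if size > capacity:
            raise ValueError(f"{name} does not fit in a single FPGA")
        for idx, consumed in enumerate(used):
            if consumed + size <= capacity:
                break
        else:                       # no existing FPGA has room: add one
            used.append(0)
            idx = len(used) - 1
        used[idx] += size
        assignment[name] = idx
    return assignment


def pins_per_fpga(nets, assignment):
    """Count the signals that cross between devices, per FPGA."""
    pins = [0] * (max(assignment.values()) + 1)
    for src, dst, width in nets:
        a, b = assignment[src], assignment[dst]
        if a != b:
            pins[a] += width
            pins[b] += width
    return pins


blocks = {"cpu": 60, "dsp": 50, "dma": 30, "uart": 10}   # hypothetical sizes
nets = [("cpu", "dsp", 64), ("cpu", "dma", 32),
        ("dsp", "dma", 16), ("uart", "cpu", 8)]          # (source, sink, bits)

assignment = partition(blocks, capacity=100)
print(assignment)
print(pins_per_fpga(nets, assignment))
```

If the pin count for a device exceeds its I/O budget, the options named in the text apply: move blocks, replicate logic, or multiplex signals over the available pins.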

    Instrumentation and Debug
    Once partitioning has been performed at the RTL level, another tool is required to instrument the design by quickly and easily identifying any signals that are required to be sampled or used as triggers as illustrated in Figure 2.

    Figure 2. Adding instrumentation into the design RTL.

    Such a tool must be capable of inserting instrumentation and enabling debugging within the RTL source code. In addition to embedding advanced sample and triggering capabilities directly in the RTL, it should also be capable of using assertions as triggers.
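The sample-and-trigger idea can be pictured with a small behavioral model: a fixed-depth buffer records the selected signals every cycle and stops when a trigger condition, here the failure of an assertion, becomes true. The class name, signal names, and the assertion itself are hypothetical and chosen only for illustration.

```python
from collections import deque


class Probe:
    """Circular sample buffer that freezes when its trigger fires."""

    def __init__(self, signals, trigger, depth=4):
        self.signals = signals
        self.trigger = trigger               # predicate over a dict of signal values
        self.buffer = deque(maxlen=depth)    # oldest samples fall off the front
        self.triggered_at = None

    def sample(self, cycle, values):
        if self.triggered_at is not None:
            return                           # already frozen
        self.buffer.append((cycle, {s: values[s] for s in self.signals}))
        if self.trigger(values):
            self.triggered_at = cycle


def fifo_never_overflows(values):
    """Hypothetical design assertion: the FIFO holds at most 8 entries."""
    return values["fifo_level"] <= 8


# Trigger on violation of the assertion.
probe = Probe(["fifo_level", "wr_en"],
              trigger=lambda v: not fifo_never_overflows(v))

for cycle in range(20):                      # stand-in for the running design
    probe.sample(cycle, {"fifo_level": cycle, "wr_en": 1, "unused": 0})

print("triggered at cycle", probe.triggered_at)
for cycle, captured in probe.buffer:
    print(cycle, captured)
```

The buffer ends up holding the cycles that lead to the failure, which is the information needed to start debugging.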

    Advanced FPGA Synthesis
    Traditional synthesis technology is failing to address the needs of today's extremely large and complex FPGA designs implemented in devices at the 65 nm technology node and below. The problem is that conventional FPGA synthesis engines are based on ASIC-derived techniques such as floorplanning and in-place optimization (IPO) using proximity-based timing models.

    These techniques are largely obsolete even in the ASIC world, and ASIC-derived physically-aware synthesis algorithms are simply not appropriate for use with the regular architectures and pre-defined routing resources presented by FPGAs. The end result is that traditional FPGA synthesis approaches require multiple time-consuming iterations between front-end synthesis and downstream place-and-route tools so as to achieve convergence and timing closure.

    The solution to creating a physical synthesis solution that can truly handle the complexities associated with FPGA architectures is to approach the problem from a radically different viewpoint. The way this works is to characterize all of the tracks in the FPGA (including entry points, end points, and internal exit points) and then to build a "map" of all of these tracks. In the software world this type of map is referred to as a graph, which is why this technique may be referred to as graph-based physical synthesis (Figure 3).

    Figure 3. Graph-based physical synthesis is required to handle the complexities associated with today's extremely complex FPGA architectures.

    In addition to the tracks themselves, this map also includes details as to which LUT pins have access to which types of track, any differences in input-to-output delays through each LUT, and the size and locations of any hard macros in the device. Instead of looking for proximity, the graph-based physical synthesis engine focuses on speed using an interconnect-centric approach. Starting with the most critical paths and working its way down to the least critical paths (thereby ensuring that the fastest routes are available for the most critical paths), the graph-based physical synthesis engine will select tracks and their associated entry points and exit points; from these tracks it will derive placements; from the tracks and placements it will derive accurate delays; and it will then optimize and iterate as required.
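The "most critical paths first" ordering described above can be sketched with a shortest-path search over a small routing graph. The graph, the delays, and the net list are invented for illustration, and the sketch covers only track selection; deriving placements and refining delays, as the text describes, is beyond its scope. It is not a description of any vendor's engine.

```python
import heapq


def fastest_route(graph, src, dst, used):
    """Dijkstra over tracks not yet claimed; returns (delay, path) or None."""
    queue = [(0.0, src, [src])]
    seen = set()
    while queue:
        delay, node, path = heapq.heappop(queue)
        if node == dst:
            return delay, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, track_delay in graph.get(node, {}).items():
            if (node, nxt) not in used and nxt not in seen:
                heapq.heappush(queue, (delay + track_delay, nxt, path + [nxt]))
    return None


def route_by_criticality(graph, nets):
    """Route nets in order of increasing slack, so critical nets pick first."""
    used, result = set(), {}
    for name, src, dst, _slack in sorted(nets, key=lambda n: n[3]):
        found = fastest_route(graph, src, dst, used)
        if found is None:
            raise RuntimeError(f"no free route for {name}")
        delay, path = found
        used.update(zip(path, path[1:]))     # claim every track on the path
        result[name] = (delay, path)
    return result


# Hypothetical routing graph: node -> {neighbor: track delay in ns}
graph = {"A": {"X": 1.0, "Y": 2.0}, "X": {"B": 1.0}, "Y": {"B": 2.0}}
# (net name, source, sink, timing slack in ns); lower slack is more critical
nets = [("relaxed", "A", "B", 3.0), ("critical", "A", "B", -0.5)]

routes = route_by_criticality(graph, nets)
print(routes)
```

The critical net claims the fast tracks through X, and the relaxed net is left with the slower route through Y.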

    The end result is a single-pass, push-button synthesis step requiring zero (or very few) iterations with the downstream place-and-route engines. Furthermore, based on the analysis of more than 200 real-world designs, it has been shown that graph-based physical synthesis can provide a 5 to 20% performance improvement in terms of the overall clock speed of the system.

    FPGA-based Prototypes
    Following synthesis, the optimized, placed, and routed netlists are passed to an FPGA-based rapid prototyping system as illustrated in Figure 4.

    Figure 4. FPGA-based prototypes.

    In fact, it may be necessary to use multiple platforms depending on the task being performed. One technique that facilitates front-end architectural exploration and front-end verification is the use of Transaction-Level Models (TLMs). In this case, the actions of the system are defined as high-level transactions such as "initiate a memory read" or "trigger an interrupt," as opposed to bit-twiddling RTL where the actions of every low-level signal have to be defined and simulated in excruciating detail.

    The solution-of-choice for this level of abstraction is to use an FPGA-based rapid prototyping system that features a programmable interconnect architecture for greater automation and which provides emulation-like capabilities optimized for transaction-based verification.
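The difference between the two levels of description can be sketched as follows. At the transaction level a memory read is a single call; at the signal level the same read is several clock cycles of pin activity. The class, the req/ack bus, and the number of wait states are hypothetical and serve only to show the contrast.

```python
class MemoryModel:
    """Transaction-level view: one method call per bus operation."""

    def __init__(self):
        self.cells = {}
        self.log = []

    def write(self, addr, data):
        self.log.append(("write", addr, data))
        self.cells[addr] = data

    def read(self, addr):
        data = self.cells.get(addr, 0)
        self.log.append(("read", addr, data))
        return data


def expand_read_to_signals(addr, data, wait_states=2):
    """Signal-level view of the same read on a hypothetical req/ack bus."""
    cycles = [{"req": 1, "ack": 0, "addr": addr, "rdata": None}
              for _ in range(wait_states)]
    cycles.append({"req": 1, "ack": 1, "addr": addr, "rdata": data})
    cycles.append({"req": 0, "ack": 0, "addr": None, "rdata": None})
    return cycles


mem = MemoryModel()
mem.write(0x40, 0xBEEF)
value = mem.read(0x40)
signals = expand_read_to_signals(0x40, value)
print(f"1 transaction -> {len(signals)} clock cycles of pin activity")
```

A transaction-based testbench exchanges the compact form with the prototype and leaves the expansion into pin activity to hardware, which is why it runs faster than a signal-level testbench.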

    As soon as the architecture of the system has been tied down and the RTL has been captured, it is important for the software developers to be able to commence work on their embedded code and application software. This requires hardware speeds, but it also requires an affordable solution that can be quickly and easily deployed to multiple software developers.

    The solution-of-choice for this task is to use an off-the-shelf single- or multi-FPGA motherboard combined with standard and/or custom-made daughter boards. Such high-speed interface and expansion daughter boards enable prototypes to be quickly and easily customized to cover a wide range of applications.

    The Confirma Verification Platform
    The answer to all of the requirements we've discussed thus far is the Confirma verification platform, which provides a tightly-integrated, easy-to-use, and comprehensive at-speed verification environment that dramatically accelerates the functional verification of ASIC, ASSP, and SoC designs (Figure 5).

    Figure 5. The Confirma Verification Platform.

    The three major components of the Confirma Platform are as follows:

  • The Certify implementation software, which takes an existing ASIC design and partitions it into multiple FPGAs. Certify also includes Synplify Premier graph-based physical synthesis, which provides rapid timing closure and a 5 to 20% timing improvement over conventional synthesis solutions.

  • The Identify Pro software, featuring TotalRecall technology. This provides designers with full visibility into complex FPGA-based ASIC/ASSP prototypes enabling them to find bugs at hardware speed and debug the cause of error in a familiar simulation environment. Identify Pro software works in a complementary manner to other verification methodologies, such as assertion-based verification and simulation, significantly improving overall productivity.

  • The CHIPit and HAPS Rapid Prototyping Systems. The CHIPit family features a programmable interconnect architecture for greater automation and provides emulation-like capabilities optimized for transaction-based verification. Meanwhile, the affordable HAPS family of rapid prototyping boards is ideal for wide deployment so as to facilitate early embedded software development.

    Of particular interest is the use of TotalRecall visibility enhancement technology. Errors in a design may manifest themselves only when the design is running with the rest of the system in a real-world environment. In other cases, the bugs may occur infrequently or in a non-deterministic manner. This is a major problem for conventional FPGA-based prototyping systems due to the lack of visibility into the design, because the designer may not be able to track down and isolate a bug that appears only "every-now-and-again."

    The solution to this visibility problem is Identify Pro with TotalRecall technology (for which Synopsys holds U.S. Patent #6,904,576, "Method And System For Debugging Using Replicated Logic"). TotalRecall technology provides 100% visibility into the FPGA-based prototype while still allowing the FPGAs to run at real-time hardware speeds.

    Once a bug has been detected while running at real-time hardware speeds, users are immediately taken into their familiar software simulation environment with an initialized design and a testbench that will guide them directly to the bug.
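One way to picture a replicated-logic scheme of the kind the patent title suggests is sketched below. A copy of the logic runs a fixed number of cycles behind the original, fed through an input buffer; when the trigger fires, the copy holds an earlier state and the buffer holds the stimulus leading to the failure, which together form an initialized design and a testbench for replay in software. The toy design, the buffer depth, and all names are assumptions made for illustration and do not describe the actual product implementation.

```python
from collections import deque


def step(state, inp):
    """Toy design under test: an accumulator that wraps at 256."""
    return (state + inp) % 256


def run_with_replica(inputs, depth, bug):
    """Run the 'hardware'; a replica lags `depth` cycles behind via an input FIFO."""
    state = replica = 0
    fifo = deque()
    for cycle, inp in enumerate(inputs):
        state = step(state, inp)
        fifo.append(inp)
        if len(fifo) > depth:
            replica = step(replica, fifo.popleft())
        if bug(state):
            # Freeze: the replica holds the state from `depth` cycles ago,
            # and the FIFO holds the inputs that lead up to the failure.
            return cycle, replica, list(fifo)
    return None


def replay(initial_state, stimulus):
    """'Software simulation' with full visibility of every intermediate state."""
    states, state = [], initial_state
    for inp in stimulus:
        state = step(state, inp)
        states.append(state)
    return states


capture = run_with_replica([10, 20, 30, 40, 50, 60, 70], depth=3,
                           bug=lambda s: s > 200)
cycle, snapshot, stimulus = capture
print("bug at cycle", cycle, "snapshot", snapshot, "stimulus", stimulus)
print("replayed states", replay(snapshot, stimulus))
```

The replay starts from the captured snapshot and reproduces the failing state on its final step, without rerunning the test from reset.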

    The advantages of the TotalRecall approach are manifold. For example, in addition to providing total access to all of the design's internal signals, users also have total access to all of the contents of the design's internal memory blocks. And, when using TotalRecall technology, even an intermittent bug that occurs deep into the verification can be easily trapped, isolated, and quickly evaluated. The end result is that TotalRecall technology provides the visibility associated with software simulation combined with the extreme real-time hardware speeds associated with conventional FPGA-based prototypes.

    Summary
    ASIC/ASSP/SoC designs continue to increase in size, complexity, and cost. At the same time, aggressive competition makes today's electronics markets extremely sensitive to time-to-market pressures. There is increasing pressure for ASIC designs to be "right-first-time" with no re-spins. In turn, this has driven the demand for fast, efficient, and cost-effective verification at both the chip and system levels.

    There are a variety of verification options available to engineers, including software simulation, hardware simulation acceleration, hardware emulation, and FPGA-based prototypes. The latter offer very high speed at a relatively low cost and — in the case of Confirma FPGA-based prototypes — high visibility into the internals of the design (Table 1).

    Table 1. Comparison of verification technologies.

    The Confirma verification platform provides a tightly-integrated, easy to use, and comprehensive at-speed verification environment that dramatically accelerates functional verification of ASICs, ASSPs, and SoCs.

    The Confirma platform comprises a number of industry-leading technologies. These include the Certify multi-FPGA implementation software with Synplify Premier graph-based physical synthesis; the Identify Pro software, featuring TotalRecall technology; and the FPGA-based high-capacity CHIPit and HAPS Rapid Prototyping Systems.

    The Confirma platform is ideal for ASIC/ASSP/SoC design and verification teams who leverage FPGA-based prototypes to improve their time-to-market and to avoid costly device re-spins. With its programmable interconnect architecture, which provides emulation-like capabilities optimized for transaction-based verification, the CHIPit family of Rapid Prototyping Systems facilitates front-end architectural exploration and front-end verification.

    Meanwhile, the modular HAPS (High-performance ASIC Prototyping System) helps hardware design engineers to quickly track down the last few hard-to-find hardware bugs, and it allows the software development engineers to commence their work earlier in the design cycle. Furthermore, the CHIPit and HAPS Rapid Prototyping Systems both support the integration and verification of hardware and software well ahead of chip fabrication.

    Juergen Jaeger is director of product marketing for the Confirma rapid prototyping platform at Synopsys Inc.
