In the nanometer technologies used for automotive SoCs, most defects on silicon are due to timing issues. Thus, at-speed coverage requirements in automotive designs are stringent, and engineers expend considerable effort to achieve higher at-speed coverage. The principal challenge is to achieve silicon of the desired quality with high yield at the lowest possible cost. In this article we discuss the problems associated with over-testing and under-testing in at-speed testing, which can result in yield issues, and provide a few suggestions that can help to overcome them.
The primary objective of at-speed testing is to detect any timing failure that may occur on silicon at its operating frequency. The most important element of an at-speed test scheme is the logic that generates controllable clock pulses at the same frequency as required for functional operation. The preferred way to supply controlled clock pulses is from the tester (ATE) through the input pads, as this reduces complexity and minimizes the additional test logic that needs to be built into the design.
However, this scheme will have frequency limitations because pads generally cannot support very high frequency clocks. So on-chip phase-locked loops (PLLs) and oscillators are used to provide clock pulses. Free running clocks from these sources cannot be used directly, however, because first we have to shift vectors through scan chains at slow frequency (shift frequency), capture at functional frequency, and then flush out data at shift frequency. We need controllable pulses while capturing at functional frequency, which can be achieved by using the chopper logic. A typical clock architecture with at-speed clocking is shown in Figure 1.
Figure 1: A typical clock architecture with at-speed clocking
For any SoC, STA (Static Timing Analysis) sign-off is integral to validating the timing performance. Timing sign-off ensures that the silicon will operate at the desired functional frequency. The same logic applies to at-speed testing as well. STA sign-off must be done for at-speed mode along with the functional modes because the clock path might be different in at-speed mode, and the added test control logic needs to be timed as well. The chopper logic, for instance, is not used in normal functional mode, so its timing requirements are covered only if they are met in at-speed mode.
Ideally speaking, closing timing in at-speed mode should not be a problem if the change in clocking is done in the common path, such as at the start of the clock path, so that the change is common to both launching and capturing flops and hence does not affect the setup and hold timing of the design. The test control logic generally works at slow frequency or is static, so meeting its timing is not very difficult.
Typical SoC clocking scheme
However, modern SoC designs are not that simple. High performance and low leakage requirements result in designs having various clock sources within a single SoC, such as PLLs, oscillators, clock dividers, etc. Depending upon the architecture, there can be a number of IO interfaces operating on an external clock running at a few MHz, such as SPI, JTAG, I2C, etc. As a result, different parts of the SoC can operate at different frequencies.
Here’s where things get complex. The clocking solutions (chopper logic) discussed earlier for at-speed clocking are not sufficient for complex chips operating at different frequencies. In at-speed testing, these complexities raise problems known as Under-testing and Over-testing, which then lead to the need for optimal testing.
Over-testing happens when logic is tested at a higher frequency in at-speed mode than its frequency of operation in functional mode. Referring to Figure 2, over-testing happens if the pll_clock is provided to low frequency modules like the watchdog and RTC during at-speed mode. One key reason for such an approach is the simplicity of the test clock path, as it requires only minimal change in the functional logic. In our example, we just need to bypass all divided clocks/RC osc clocks/external clocks with the scan clock, which in turn is controlled by the PLL clock.
Figure 2: Memories and flash are operating on a divided PLL clock while the platform is working at the real pll_clock. The internal RC oscillator is feeding clocks to blocks like the RTC (Real Time Counter) and watchdog timer, which require very slow frequency clocks. Blocks like display masters have both an IPS interface and a camera interface. The IPS interface generally works at system frequency while the camera logic works on a slower clock provided from the outside world. IO interfaces like SPI and JTAG work at a few MHz. Thus the overall configuration of an SoC requires multiple blocks working at multiple frequencies.
Under-testing happens when any logic is tested at a slower frequency in at-speed mode than its intended frequency of operation. This scenario generally arises when it is not possible to supply a test clock of exactly the same frequency as in functional mode, and closing the design at a higher frequency is not possible either, due to large data path delays or technology constraints. In this case we are forced to supply a clock of lower frequency.
Thus it is necessary to test the silicon for defects at exactly the same frequency as the functional frequency. Any deviation will lead to issues of either over-testing or under-testing:
- Closing the designs on higher frequencies for at-speed testing, when functional logic is intended to work at slower frequencies, will affect area and power of the overall design. In case of timing critical designs, the at-speed testing tool will use high drive strength cells and even may require low Vt cells to meet these frequency targets.
- Even if the timing of the design is closed at the higher frequency, at the cost of power and area, we could be unnecessarily pessimistic in our yield calculation, because there can be unrealistic yield fallout during at-speed testing. For example, consider a design with two clock domains, domain1 @ 120MHz and domain2 @ 80MHz, where we close timing at 120MHz flat for the whole design to simplify the clocking architecture in at-speed mode. ATPG patterns for both domains will then be generated @ 120MHz. Suppose that, due to process variability, domain1 works fine at 120MHz on silicon but domain2 works only up to 110MHz. Such dies fail the at-speed patterns and are treated as defective parts. Though the chip is good enough for its functional requirements (domain2 only needs 80MHz), we declare the die faulty based on the at-speed pattern failure, and this reduces our yield.
- In the case of under-testing, at-speed patterns will not guarantee that the chip will actually work at the intended frequency. Since bad dies can pass at-speed tests, the original purpose of at-speed testing to filter out bad dies can be defeated. In this case we will be over-optimistic in our yield calculation.
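The two failure modes described in the bullets above can be illustrated with a small sketch. This is only a toy model, not a production test flow: it assumes each clock domain of a die can be summarized by a single maximum working frequency, and the function name `screen_domain` and its result labels are hypothetical.

```python
def screen_domain(fmax_mhz, functional_mhz, test_mhz):
    """Classify the at-speed test outcome for one clock domain of one die.

    fmax_mhz:       highest frequency the domain actually works at on silicon
    functional_mhz: frequency the domain must reach in functional mode
    test_mhz:       capture frequency applied by the at-speed patterns
    (All three are illustrative inputs, assumed for this sketch.)
    """
    passes_test = fmax_mhz >= test_mhz            # what the tester observes
    is_good = fmax_mhz >= functional_mhz          # what the application needs
    if passes_test and is_good:
        return "good die, passes"
    if not passes_test and not is_good:
        return "bad die, rejected"
    if is_good:
        # Functionally fine, but rejected because it was tested too fast.
        return "yield loss (over-testing)"
    # Functionally too slow, but accepted because it was tested too slowly.
    return "test escape (under-testing)"


# domain2 from the example above: needs 80MHz, works up to 110MHz, tested at 120MHz
print(screen_domain(fmax_mhz=110, functional_mhz=80, test_mhz=120))
```

The first mixed case corresponds to pessimistic yield numbers, the second to over-optimistic ones.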
Having understood the drawbacks, we will focus on the reasons for the presence of over-testing and under-testing in any SoC:
Simplicity of clock architecture
Given so many clock sources in functional mode, the easiest approach is to provide only a few controllable test clocks in at-speed mode.
Figure 3: The easiest and simplest test clock solution is to mux the PLL clock with the external clock even for at-speed mode, a case of over-testing.
Let us take an example of a DSPI module. The IP works on 2 clocks, an external clock of 15 MHz and a functional PLL clock of 120MHz for internal logic. As shown in Figure 3, the easiest and simplest test clock solution is to mux the PLL clock with the external clock even for at-speed mode, a case of over-testing.
Frequency dividers in design
In the case of clock dividers, the original source clock is used in all the test modes and is muxed with the divided clock, as shown in Figure 4.
Figure 4: The original source clock is used in all the test modes and is muxed with divided clock
This is a common scenario in designs that have a lot of dividers which cannot be used in at-speed testing, because these dividers are not controllable during testing (phase determination). So the easiest approach to simplify things in at-speed testing is to provide an undivided clock in test mode, which results in over-testing.
Timing exceptions such as multicycle paths
In designs, various timing exceptions in the form of multicycle paths, false paths, case analysis, etc. are used, for example when signal propagation requires more than a single clock cycle during functional operation. These exceptions are valid in at-speed mode as well and hence should be ported appropriately to at-speed mode, also in the form of an SDC (Synopsys Design Constraints) file. However, current ATPG tools have limitations in understanding some of these constraints, especially multicycle paths. When parsing the SDC file, the tool ignores multicycle paths and does not create any pattern for them. For example, if we have a multicycle of 2 from one register to another register, it will simply mask any pattern that checks capture between these two registers.
So this means that none of the multicycle paths are tested in at-speed testing, resulting in under-testing. On the other hand, if these exceptions are not made part of the SDC file, the timing check will happen in a single clock cycle, whereas functionally this path works in two clock cycles, which is a typical case of over-testing. Overall it is a big concern because complex designs generally have lots of multicycle exceptions, which can lead to either over-testing or under-testing if we follow conventional methods.
Optimal At-speed testing
So far we have seen that neither over-testing nor under-testing is desirable, so we need a methodology to ensure that the actual benefits of at-speed testing are realized without compromising design QOR. The idea is that at-speed testing should not impose any significant area/power overhead on the design, while at the same time ensuring that the design is checked at its intended functional frequency, neither more nor less.
Listed here are a few guidelines/techniques to ensure that at-speed testing is done in the correct way:
* Identify the different frequency domains in your design in func mode. This is an important step because the earlier you identify the frequency domains, the better you will know at-speed testing requirements. Thorough analysis of the clock architecture can help define optimal at-speed clocking. For example, when starting a project, generally not much emphasis is given to external IO interface frequency targets, which can later impact defining at-speed clocking strategy for these interfaces.
* Define at-speed mode timing constraints along with functional mode constraints generation. Any timing critical paths in at-speed mode can be addressed at the start of the design cycle itself. Making changes at early stages is always easier.
* Identify cases of under-testing and over-testing explicitly, because these issues often crop up only during final STA runs, or go unnoticed altogether if timing is met. With some scripting, the maximum frequency of every register in functional mode can be compared with its frequency in at-speed mode. Divide the registers into three categories: flops having the same frequency in both modes – good to go; flops having a lower frequency in at-speed mode than in func mode – a case of under-testing; flops having a higher frequency in at-speed mode than in func mode – a case of over-testing.
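The register comparison described in the last guideline could be scripted roughly as follows. This is a minimal sketch that assumes the per-register clock frequencies have already been extracted from the STA tool into two dictionaries; the function name, the register names, and the extra "not_clocked_in_atspeed" bucket are illustrative additions, not part of any tool's output.

```python
def classify_registers(func_mhz, atspeed_mhz, tolerance_mhz=0.0):
    """Compare each register's clock frequency in functional and at-speed mode.

    func_mhz / atspeed_mhz: dicts mapping register name -> max clock frequency
    in MHz for that mode (assumed to come from STA reports).
    """
    report = {
        "ok": [],
        "under_tested": [],
        "over_tested": [],
        "not_clocked_in_atspeed": [],  # extra bucket, assumed for this sketch
    }
    for reg, f_func in func_mhz.items():
        f_test = atspeed_mhz.get(reg)
        if f_test is None:
            report["not_clocked_in_atspeed"].append(reg)
        elif abs(f_test - f_func) <= tolerance_mhz:
            report["ok"].append(reg)            # same frequency: good to go
        elif f_test < f_func:
            report["under_tested"].append(reg)  # tested slower than it runs
        else:
            report["over_tested"].append(reg)   # tested faster than it runs
    return report


# Hypothetical register names and frequencies, for illustration only.
functional = {"platform/core_reg": 120.0, "dspi/shift_reg": 15.0, "flash/ctrl_reg": 60.0}
at_speed = {"platform/core_reg": 120.0, "dspi/shift_reg": 120.0, "flash/ctrl_reg": 40.0}
for category, regs in classify_registers(functional, at_speed).items():
    print(category, regs)
```

A non-zero `tolerance_mhz` can be used if small frequency deviations between the two modes are considered acceptable.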
Once these cases are identified, thorough analysis should be done and all the architecture possibilities should be explored to provide clock of exactly same frequency as in functional mode.
To tackle timing exceptions, the solution lies in converting a multicycle path to a single-cycle path by testing at lower frequency. The concept is simple. Suppose a design works at 200MHz and has a few paths of a multicycle of 2. Timing these paths at 200MHz with a multicycle of 2 is equivalent to testing these paths at 100MHz in a single cycle. In at-speed testing, test the logic in two passes. In the first pass a capture clock of 200MHz will be provided to test all single-cycle paths and all multicycle paths will be masked. In the second pass, capture clock of 100MHz will be provided to test all multicycle paths only. The same concept can be applied with higher multicycles.
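The equivalence used above can be checked with a few lines of arithmetic. The sketch below considers only the setup-side delay budget of a path (the number of cycles allowed multiplied by the clock period); the function names are hypothetical.

```python
import math


def path_budget_ns(clock_mhz, multicycle=1):
    """Time allowed for a path: `multicycle` clock periods, in nanoseconds."""
    return multicycle * 1000.0 / clock_mhz


def equivalent_single_cycle_mhz(clock_mhz, multicycle):
    """Capture frequency at which a single cycle gives the same budget."""
    return clock_mhz / multicycle


# A multicycle-2 path at 200MHz has the same budget as a single-cycle path at 100MHz.
budget_multicycle = path_budget_ns(200, multicycle=2)
budget_single = path_budget_ns(equivalent_single_cycle_mhz(200, 2))
print(budget_multicycle, budget_single, math.isclose(budget_multicycle, budget_single))
```

The same relation gives the capture frequency for higher multicycles, for example a multicycle of 3 on a 240MHz clock maps to 80MHz.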
Doing at-speed testing in multiple passes will also solve the problem of over-testing/under-testing to a large extent. As discussed above, it is sometimes not possible to provide a clock of the exact frequency to all domains simultaneously, but we can do so by configuring the capture clock at a different frequency in each pass. Generally, in designs where a PLL is used for the system clock, we have the flexibility to configure the PLL at some discrete frequencies.
The same approach of testing in multiple passes can be used to address the problem of over-testing associated with frequency dividers. The difference is that in the case of multicycle paths, the flops can be masked, but in the case of dividers, we need controllability of divided clocks so that the clock can be gated in scan mode.
Figure 5: During at-speed testing in pass 1 when system clock is at 200MHz, clock gating logic will gate the clock to the domain that is functionally operating at 100MHz (through divide by 2 logic).
As shown in Figure 5, during at-speed testing in pass 1, when the system clock is at 200MHz, the clock gating logic will gate the clock to the domain that functionally operates at 100MHz (through divide by 2 logic). But in pass 2, when the system clock is set at 100MHz, the enable of the clock gating cell will be driven to logic 1. This ensures that the logic is now tested at its intended frequency of 100MHz.
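The per-pass behavior of the clock gating enable can be summarized as a simple rule: the divided domain receives capture pulses only in the pass whose system clock matches that domain's functional frequency. The sketch below models just this decision; the function name and arguments are hypothetical, and the real enable would be driven by test control logic rather than software.

```python
import math


def divided_domain_clock_enable(pass_clock_mhz, source_mhz, divide_by):
    """Return True when the clock gate of a divided domain should be open.

    pass_clock_mhz: system (undivided) capture clock used in this test pass
    source_mhz:     functional frequency of the undivided source clock
    divide_by:      functional division ratio of the domain
    """
    domain_functional_mhz = source_mhz / divide_by
    return math.isclose(pass_clock_mhz, domain_functional_mhz)


# Divide-by-2 domain of a 200MHz system clock, as in Figure 5.
for pass_clock in (200, 100):
    print(pass_clock, divided_domain_clock_enable(pass_clock, source_mhz=200, divide_by=2))
```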
Using the above guidelines, most of the issues related to over-testing and under-testing should be resolved. But in case one is forced to choose between under-testing and over-testing, the decision depends on the application of the SoC. In automotive designs, where human safety is the prime concern, the safe approach of over-testing should be chosen, whereas in power-hungry designs, such as those in the networking and wireless domains, under-testing is the better choice. Even in these cases, every effort should be made to keep the deviation from the desired frequency as small as possible.
Let us take one example of a design with multiple frequency clocks, which will help to understand the concept:
Suppose a design is working at 240MHz and we have multicycles of 2, 3, 4, etc. for various paths. There are also some interfaces working on external clocks of 100MHz and 60MHz. To avoid any kind of over-testing or under-testing, test it in a number of passes in at-speed mode. Configure the PLL at 240, 120, 80 and 60MHz and test all logic at its actual functional speed.
* 1st Pass: @240MHz – All single-cycle paths (mask the 100MHz and 60MHz interfaces; the rest of the SDC is the standard one)
* 2nd Pass: @120MHz – Paths with a multicycle of 2 (remove multicycle-2 exceptions from the SDC) + 100MHz interface logic (over-testing minimized)
* 3rd Pass: @80MHz – Paths with a multicycle of 3 (remove all multicycle exceptions of 2 and 3)
* 4th Pass: @60MHz – Paths with a multicycle of 4 + 60MHz interface logic (remove all multicycle exceptions)
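The pass plan above can be derived mechanically, as the following sketch shows. It assumes that the PLL can be configured to the base frequency divided by each multicycle value, and that each slower interface is tested in the pass with the lowest capture frequency that is not below its own frequency, so it is never under-tested and over-testing is kept minimal. The function name and interface labels are hypothetical.

```python
def plan_passes(base_mhz, multicycles, interface_mhz):
    """Build a list of at-speed passes, fastest first.

    base_mhz:      functional system frequency (single-cycle paths)
    multicycles:   multicycle values present in the design, e.g. [2, 3, 4]
    interface_mhz: dict of interface name -> functional frequency in MHz
    """
    factors = sorted({1, *multicycles})
    passes = [
        {"capture_mhz": base_mhz / n, "multicycle": n, "interfaces": []}
        for n in factors
    ]
    for name, freq in interface_mhz.items():
        # Only passes that do not under-test this interface are candidates.
        candidates = [p for p in passes if p["capture_mhz"] >= freq]
        if not candidates:
            raise ValueError(f"no pass is fast enough for {name} ({freq}MHz)")
        # Pick the slowest candidate to minimize over-testing.
        best = min(candidates, key=lambda p: p["capture_mhz"])
        best["interfaces"].append(name)
    return passes


for p in plan_passes(240, [2, 3, 4], {"if_100mhz": 100, "if_60mhz": 60}):
    print(p)
```

With these inputs the 100MHz interface lands in the 120MHz pass and the 60MHz interface in the 60MHz pass, matching the list above.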
With SoCs becoming functionally more complex and technology moving to lower nodes, good yield is proving to be an important concern for any design company. Yield directly impacts the profit-loss equation, and serious efforts are required to fix the causes of low yield. At-speed testing is an important yardstick to measure the quality of silicon, and thus we should target high coverage numbers. But at the same time, we should run our at-speed patterns only at the appropriate clock frequency, because testing at the wrong clock frequency will lead to problems of over-testing or under-testing. Neither of these conditions is good for design QOR or for yield estimation.
Test clocking should be given the same importance as functional clocking, and an effort should be made to provide each domain with a clock of the same frequency as it receives in functional mode. At the same time, considerable attention should be given to multicycle paths because they typically constitute a significant share of the timing paths in any design. At-speed testing in multiple passes is the solution for testing multicycle paths at speed, and the same multiple-pass methodology can be used to solve other cases of over-testing and under-testing. In conclusion, using the above recommendations and methodologies we can achieve more accurate at-speed coverage, which in turn will ensure minimum yield fallout without compromising design QOR.
About the authors:
Rajiv Mittal works at Advanced Micro Devices (AMD) as Senior Member of Technical Staff in ASIC/Layout Design Team in India. Earlier he worked at Freescale Semiconductor as Staff Design Engineer. With more than 11 years of experience he has worked in a wide range of SoC and ASIC design domains, mainly in physical design activities across a number of process technology domains, ranging from 130nm to 40nm.
Amol Agarwal has worked at Freescale Semiconductor as Senior Design Engineer for five years, on the physical design team where he has been involved in several block-level and chip-level designs in technology ranging from 250nm to 55