Relieving the pain of parasitic extraction
The future is here: phone, web browser, email, photo and video, all in one device at your fingertips, simultaneously. The evolution of IC design is in part driven by the demand for more memory with higher performance. Advanced process technologies enable more functionality, higher performance, and portability in chip design through smaller device sizes (Figure 1). These innovations pose interesting design challenges, which include new parasitic extraction issues that are affecting nanometer memory designs.
Figure 1: Minimum device feature size trends.
Embedded memories already occupy most of an SoC’s die area. The ITRS roadmap predicts that memory functions (the number of bits on a single chip) will double every two years. Performance is the key requirement for leading-edge memory design. Memories must meet exacting specifications for fast data transfer and low power consumption.
Areas of concern for memory design at nanometer nodes include storage capacitance with reduced feature size, low resistance for bit and word lines to ensure desired speed, improved bit density, and lower production cost. Higher densities increase the interactions between interconnect lines, as well as those between interconnects and devices. New device and interconnect design techniques used at 28 nm introduce more complex coupling effects that are difficult to extract accurately. Geometries are becoming more three-dimensional in structure, and circuits are more sensitive to 3D parasitic effects.
Transistor-level simulation including very accurate parasitic effects is critical to the memory design process to help designers converge on an optimal design that has a high certainty of meeting targeted specifications, without costly overdesign.
More accurate solutions are required that do not increase cycle time and that fit into existing design flows. New 3D parasitic extraction technology delivers attofarad accuracy as well as high performance and capacity, at all stages of memory design, from bit cell design to full chip sign-off, ensuring a robust design that will work to specification when it is manufactured.
Limitations of current parasitic extraction technologies for memory
Traditional parasitic extraction solutions reliably model interconnect parasitics, but design closure is getting more difficult at advanced nodes because of new process effects and variability that affect functionality, performance, and reliability. Some of the parasitic modeling challenges in advanced silicon processes include:
• Local interconnect
• Multi-bias vias
• Diffusion resistance
• Elevated source/drain
• SOI support
• TSV-based 3D-IC verification.
Rule-based extractors are able to extract multimillion-net designs, but they rely on heuristic models and are inherently limited in accuracy: up to 10% error for total capacitance and as much as 15% or more for coupling capacitance at 45 nm. These errors can account for 5–10% error in simulation results. Designers have been compensating for these limitations by using larger guard bands for timing and signal integrity, aware that such overdesign tends to negatively affect chip performance and die size.
Until now, field solvers were restricted to cell characterization or selected net extraction. By the nature of their algorithms, they are not capable of delivering the performance or capacity needed for anything larger than a few geometries or nets. This makes them completely unsuitable for extraction of real designs.
Some field solvers employ statistical methods which increase the tools’ capacity but still are not efficient to use for large designs. They also are problematic for device-level extraction accuracy, because the user has to manually define the device regions to prevent extracting capacitive effects that are already included in the device model. This is an error-prone process.
Although statistical tools can be tuned to a tighter error tolerance, they inherently produce outliers. If the specified tolerance of 1% is read as one standard deviation of a normally distributed error, then for a design with one million nets, as many as 3,000 nets (about 0.27%) will have an error greater than 3% (Figure 2). There is a significant chance that a critical net is misrepresented. Likewise, limiting the scope of extraction to a few known critical nets may miss important parasitic effects on other nets.
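The outlier count quoted above can be reproduced with a short calculation. The sketch below is illustrative only: it assumes the extraction error is zero-mean and normally distributed and that the stated tolerance corresponds to one standard deviation, which the article does not state explicitly. The function name is hypothetical.

```python
import math


def expected_outliers(num_nets: int, sigma_pct: float, threshold_pct: float) -> float:
    """Expected number of nets whose absolute error exceeds threshold_pct.

    Assumption (not from the article): errors are zero-mean and normally
    distributed with standard deviation sigma_pct.
    """
    if sigma_pct <= 0:
        raise ValueError("sigma_pct must be positive")
    z = threshold_pct / sigma_pct
    # Two-sided tail probability P(|X| > z * sigma) for a normal distribution.
    tail_probability = math.erfc(z / math.sqrt(2.0))
    return num_nets * tail_probability


# One million nets, 1% tolerance (one sigma), outliers beyond 3% (three sigma).
print(round(expected_outliers(1_000_000, 1.0, 3.0)))
```

Under these assumptions the result is about 2,700 nets, consistent with the "as many as 3,000" figure in the text.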
Figure 2: Although statistical tools can be tuned to a tighter error tolerance, they inherently produce outliers.
As a result of this situation, designers have been living with an extraction discontinuity and have had to employ different extraction tools and analytical methods to model the parasitic effects in their designs. The level of abstraction imposed by traditional tools forces designers to build in extra design margins, which cancels out the benefits of moving to a smaller node in the first place. IC designers are demanding extraction solutions that deliver accuracy to within 5% to evaluate the effects of parasitics on circuit timing and functionality, without compromising performance.
A faster, attofarad-accurate bit cell design
Memory cell designers must consider different techniques to balance requirements for performance with those for stability and power dissipation. Traditional bit cell configurations have displayed stability problems and increased power consumption at scaled-down dimensions. While building in design margins is a solution, extra operating performance and robustness are still left on the table.
New bit cell transistor configurations and scaling, along with innovative design methods such as bit line balancing, are able to solve the power consumption issue and improve the tolerance to process variation. Another benefit of scaled-down device sizes is the decrease in write delay of the memory cell. With smaller device sizes at advanced process nodes, device capacitance is decreasing linearly as well.
Experimental data on the device capacitance contribution shows that at 45 nm, about 35% of parasitic capacitance is caused by device capacitance. At 28 nm, that ratio decreases to 28%. Interconnect parasitic capacitance, however, decreases at a slower rate, so the remaining share grows from about 65% to about 72%. This means that simulation results are increasingly dominated by interconnect parasitics. For a single cell, as much as 40–50% of the timing can be attributed to parasitics.
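A first-order sketch can show why the shrinking device share makes extraction accuracy matter more. It is a simplification, not a substitute for simulation: it assumes that everything other than device capacitance is interconnect capacitance, that delay scales linearly with total capacitance, and that only the interconnect part is mis-extracted. The 10% error value is the rule-based total-capacitance figure quoted earlier for 45 nm, reused here for both nodes purely for illustration.

```python
def interconnect_share(device_share: float) -> float:
    """Fraction of capacitance not attributed to devices.

    Assumption: the remainder is entirely interconnect capacitance.
    """
    if not 0.0 <= device_share <= 1.0:
        raise ValueError("device_share must be between 0 and 1")
    return 1.0 - device_share


def delay_error_fraction(device_share: float, interconnect_cap_error: float) -> float:
    """Relative delay error if delay is proportional to total capacitance
    and only the interconnect capacitance carries an extraction error."""
    return interconnect_share(device_share) * interconnect_cap_error


# Device shares taken from the text; the 10% capacitance error is illustrative.
for node, device_share in (("45 nm", 0.35), ("28 nm", 0.28)):
    share = interconnect_share(device_share)
    error = delay_error_fraction(device_share, 0.10)
    print(f"{node}: interconnect share {share:.0%}, delay error {error:.1%}")
```

In this simplified model the same extraction error translates into a larger delay error at 28 nm than at 45 nm, because more of the capacitance sits in the interconnect.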
To enable cutting-edge design, designers are demanding attofarad accuracy so that they can predict exactly how their design will perform. Up until now, this degree of accuracy has only been attainable with a reference-level field solver. A problem with that, however, is that to accurately characterize a bit cell, it cannot be extracted in isolation. It must be placed in context with other cells to simulate the close interactions of devices and interconnect in an actual design.
This has been a painful process for designers striving to get “the right answer.” One option has been to extract a single cell with a reference-level field solver. But this misses the parasitic effects of other bit cells on the cell, and extracting more than one bit cell at a time is painfully slow if not impossible with these tools. Another option is to manually place the cells in small arrays and extract the parasitics. However, it is very time consuming to build and model the various configurations.
One solution is to model bit cell configurations with a field solver by specifying boundary conditions. This allows the designer to extract a single bit cell accurately by virtually placing rotated and/or mirrored copies of the cell in an array around it. Employing a fast field solver that delivers attofarad accuracy and supports these modeling techniques enables designers to radically speed up their characterization process and realize a design that performs to specification. For example, a bit cell was extracted by Calibre xACT 3D, a fast field solver, in 4 seconds, whereas the reference-level solver took 4.5 hours (Table 1). The total capacitance of the nets in the bit cell extracted by Calibre xACT 3D differed by less than 1% from the golden reference results (Table 2).
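The idea of virtually surrounding a cell with rotated or mirrored copies can be pictured with a small sketch. It only enumerates neighbour placements and does not represent how any particular field solver specifies boundary conditions. It assumes a common array style in which every other column is flipped horizontally and every other row is flipped vertically; the function and type names are hypothetical.

```python
from typing import List, NamedTuple


class Placement(NamedTuple):
    col: int               # column offset from the centre cell
    row: int               # row offset from the centre cell
    flip_horizontal: bool  # mirrored about the vertical axis
    flip_vertical: bool    # mirrored about the horizontal axis


def surrounding_placements(rings: int = 1) -> List[Placement]:
    """Virtual neighbour placements around a centre cell at offset (0, 0).

    Assumption: odd columns are mirrored about the vertical axis and odd
    rows about the horizontal axis, so adjacent cells share their edges.
    """
    if rings < 1:
        raise ValueError("rings must be at least 1")
    placements = []
    for row in range(-rings, rings + 1):
        for col in range(-rings, rings + 1):
            if col == 0 and row == 0:
                continue  # the centre cell is the one being extracted
            placements.append(Placement(col, row, col % 2 != 0, row % 2 != 0))
    return placements


for placement in surrounding_placements(rings=1):
    print(placement)
```

With one ring this yields the eight neighbours of a 3 x 3 array, which is the kind of context a designer would otherwise have to build by hand.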
A high-performance memory design approach
Memory designers must use different techniques to improve speed, power, and stability. These mainly focus on minimizing latency. Latency is largely related to access time; that is, the amount of time it takes to complete a memory read or write operation. Word line and bit line delays contribute significantly to longer access times, which directly affect performance.
Published work has shown the significant impact of interconnect process variations, such as metal width and thickness, as well as of cell-to-bit-line parasitics.
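The dependence of word line and bit line delay on extracted parasitics can be sketched with the Elmore delay of a uniform RC ladder. This is a textbook first-order estimate, not the analysis a SPICE-level simulator performs, and the resistance and capacitance values below are hypothetical placeholders rather than data from any real process.

```python
def elmore_delay(r_total: float, c_total: float, segments: int, c_load: float = 0.0) -> float:
    """Elmore delay (seconds) at the far end of a uniform RC line.

    The line is split into `segments` equal sections, each a series
    resistor followed by a capacitor to ground. `c_load` is an optional
    lumped load at the far end.
    """
    if segments < 1:
        raise ValueError("segments must be at least 1")
    r_seg = r_total / segments
    c_seg = c_total / segments
    # The capacitor at node k is charged through k upstream resistors.
    delay = sum(k * r_seg * c_seg for k in range(1, segments + 1))
    return delay + r_total * c_load


# Hypothetical bit line: 1 kOhm total resistance, 100 fF total capacitance.
nominal = elmore_delay(1e3, 100e-15, segments=100)
# Same line with the capacitance over-extracted by 10%.
pessimistic = elmore_delay(1e3, 110e-15, segments=100)
print(f"nominal {nominal * 1e12:.2f} ps, with +10% C {pessimistic * 1e12:.2f} ps")
```

Because the estimate is linear in capacitance, a given percentage error in extracted line capacitance shows up as the same percentage error in this delay term.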
The design of the control logic to generate signals such as the system clock is also critical for fast read and write operation. Many tradeoffs need to be made to minimize process variations, but it is essential that the design is extracted and analyzed with as much accuracy as possible to reap the benefits of advanced process technology and avoid overconstraining the design.
Because memories are getting increasingly large and occupying much of an SoC’s real estate, inaccurate characterization of the memory building blocks can lead to compounded problems in the full chip. A problem in the bit cell can become magnified at the chip level. Missed parasitic effects in these critical paths can result in severe memory operation failures.
Timing analysis simulation is performed at different phases of the design cycle. SPICE-accurate tools deliver the most accurate estimation of the actual performance of a memory design. This is a highly iterative process because accurate parasitics need to be extracted at all stages of the design. Simulations of the design’s building blocks detect sensitivities to process variations and their effect on timing and delay. In this way, problems due to parasitics can be found and corrected in time.
Being able to use one tool for bit cell characterization all the way through full-chip extraction is valuable because it applies the same accuracy at all stages and for all blocks of the design. Figure 3 shows that a fast field solver can extract even large memory designs efficiently. The parasitic effects of the close interaction of interconnect and device geometries, as well as device-to-device and cell-to-cell interactions, need to be modeled very accurately to guarantee a high-performance memory design.
Figure 3: Fast field solver performs well on even large memory designs.
Plug and play—productivity improvement in the design environment
Being able to extract parasitics for an entire memory with field-solver accuracy has long been a desire of memory designers. But tool capacity has been the traditional bottleneck, on both the extraction and the simulation side. Full-chip memory netlists including parasitics are extremely large, which represents a challenge for SPICE-level simulators.
Multiple use models are available to deal with the large amounts of data more effectively, and new tools speed up the verification cycle. Fast field solvers are able to capitalize on the multi-CPU computing environments already in use for physical verification tasks such as DRC and LVS, and they scale well. Different methods are available to reduce the netlist size and the number of elements that the simulator needs to process. The field solver must be able to reduce the parasitic components in the netlist in such a way that accuracy is not compromised.
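One generic way to reduce the element count is to fold very small coupling capacitors into the grounded capacitance of the nets they connect, which keeps each net's total capacitance unchanged while removing elements. The sketch below illustrates that general idea only; it is not a description of the reduction algorithms in any commercial tool, and the net names, values, and threshold are hypothetical.

```python
from typing import Dict, Tuple

Coupling = Dict[Tuple[str, str], float]  # (net_a, net_b) -> capacitance
Ground = Dict[str, float]                # net -> capacitance to ground


def ground_small_couplings(
    coupling: Coupling, ground: Ground, threshold: float
) -> Tuple[Coupling, Ground]:
    """Replace coupling capacitors below `threshold` with grounded ones.

    Each removed coupling value is added to the ground capacitance of both
    nets, so the total capacitance seen by every net is preserved. Crosstalk
    between the two nets is no longer modelled, which is why the threshold
    has to be chosen conservatively.
    """
    new_coupling: Coupling = {}
    new_ground: Ground = dict(ground)
    for (net_a, net_b), cap in coupling.items():
        if cap < threshold:
            new_ground[net_a] = new_ground.get(net_a, 0.0) + cap
            new_ground[net_b] = new_ground.get(net_b, 0.0) + cap
        else:
            new_coupling[(net_a, net_b)] = cap
    return new_coupling, new_ground


# Hypothetical values in attofarads.
coupling = {("bl", "blb"): 120.0, ("bl", "wl"): 3.0}
ground = {"bl": 500.0, "blb": 480.0, "wl": 900.0}
reduced_coupling, reduced_ground = ground_small_couplings(coupling, ground, threshold=5.0)
print(reduced_coupling)
print(reduced_ground)
```

The trade-off is explicit in the threshold: a larger value removes more elements but discards more coupling information, so whether accuracy is preserved depends on how sensitive the circuit is to the couplings that were folded away.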
New parasitic reduction algorithms that are efficient yet preserve the accuracy of the netlist are now available in software such as Calibre xACT 3D. The netlists must be in standard formats to fit into existing production design environments and to incorporate this advanced parasitic extraction solution for memories. This is not just a productivity improvement measure. Linking to the LVS connectivity extraction database is an important aspect for accurate parasitic modeling. During LVS, device recognition is performed and layout specific device parameters are extracted.
An integrated 3D extractor can use this data directly and can model interconnect and device area parasitics correctly. Without this native handshaking, it can be difficult to ensure that parasitics already accounted for in the device model are not double counted and that no parasitic effects are missed.
Every phase of the memory design flow benefits from field-solver accuracy, from bit cell characterization all the way to the chip-design level. New field-solver technology is available now and can be easily integrated into popular design and simulation environments. Reduction techniques enable faster simulation. Foundries are now delivering production-ready rule files used to model advanced process variability effects. Only field solvers can model these effects accurately to enable cutting-edge designs that meet performance and stability requirements without affecting time-to-market.
1. “2010 Update,” International Technology Roadmap for Semiconductors (ITRS), 15 February 2011.
2. Lin, Sheng, Yong-Bin Kim, and Fabrizio Lombardi. “A 32 nm SRAM Design for Low Power and High Stability,” IEEE International MWSCAS 2008, August 10–13, 2008, Knoxville, Tennessee, pp. 422–425.
3. Teene, A., B. Davis, R. Castagnetti, J. Brown, and S. Ramesh. “Impact of Interconnect Process Variations on Memory Performance and Design,” ISQED 2005 Proceedings of the 6th International Symposium on Quality of Electronic Design.
About the author:
Claudia Relyea is a technical marketing engineer at Mentor Graphics Corp. She holds a BS in electrical engineering and has an extensive background in analog design. For the past three years, Relyea has focused on physical verification and parasitic extraction.