How best to reduce power on future ICs?

by R. Colin Johnson, TechOnline India - February 23, 2012

Here are the top five ways to reduce power on future ICs. They are already in development, and collectively they hold the promise of solving the problem for good within the decade.

Excessive power consumption has become the chief roadblock to further scaling of semiconductors, threatening to stall advancement in all electronics sectors—everything from further miniaturizing mobile devices to revving supercomputers.

While the causes are rooted in the immutable laws of physics and chemistry, engineers have devised a novel set of innovations that are mitigating the problem today and that promise to reinvigorate the chip industry tomorrow.

Embrace co-design

Electronic design automation tools can optimize for low power by enabling teams to co-design for it from the very beginning. In fact, the developers of lowest-power processors and systems-on-chip in the industry achieved their advantage not only by optimizing architectures and materials, but also by co-designing packaging, power sources, RF circuitry and software to minimize power without diminishing performance or inflating cost.

"Building low power requires a holistic approach across technology, design methodology, chip architecture and software," said David Greenhill, director of design technology and EDA at Texas Instruments (Dallas).

TI has set the bar for low-power devices by optimizing each subsystem using pioneering techniques, such as building its own process technologies to balance off-mode leakage with active-current performance, or using voltage and frequency scaling to define a variety of power-saving operating modes.

"The first step is knowing the goal of the product from a performance and power perspective. Once those goals are determined, the process can be designed to provide the required performance without exceeding the device's power budget," said Randy Hollingsworth, 28-nanometer platform manager at TI.

EDA tools have been key to consistently achieving these lower-power goals, but sometimes they require a few iterations around the design loop, since estimates of power consumption with conventional EDA tools are only accurate near the end of the design cycle. For future ICs, power consumption estimates need to be accurate as early as possible in the design cycle.

Providers of a few specialized tools have picked up that baton. Atrenta Inc. (San Jose, Calif.), for instance, makes a tool called Spyglass Power that performs power consumption estimation, reduction and verification using the standard register-transfer level (RTL) descriptions that are available from every major EDA tool very early in the design cycle.

"Today, engineers want to estimate power very early in the design process," said Peter Suaris, Atrenta's senior director of engineering. "You can no longer wait until the end of the design cycle to estimate power consumption; you need to co-design for power at the RTL level, and make changes in your design to conserve power right from the beginning."

Atrenta reckons that its specialized power conservation tools can estimate the final power budget within 20 percent, while its power reduction tools can shave up to 50 percent off the energy consumed by the final design.



Atrenta's tool can estimate power consumption very early, here pinpointing potential hot spots before the beginning of the design cycle. (Source: Atrenta)

Lower the operating voltage

Scaling chips to smaller size has traditionally enabled power savings by lowering the operating voltage too. For instance, Samsung says its latest 20-nm "green memory" chips achieve a 67 percent power savings by reducing their operating voltage from 1.5 volts to 1.35 V.

Processor and logic circuitry can be run at even lower voltages than memory, but reductions of their operating voltages below 1 V require hard-to-come-by improvements in the semiconductor processes themselves. IBM, Intel, Samsung, TI, TSMC and every other semiconductor manufacturer is constantly improving its processes to operate at lower voltages, but progress has slowed over the past few generations.

The main sticking point is that the threshold voltage at which transistors turn on is becoming inconsistent from wafer to wafer because of process variations that were insignificant at larger scales. And since off-state leakage current for a given voltage varies wildly at different thresholds, the ideal chip would actually use a custom supply voltage fine-tuned to its vagaries.
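To see why threshold variation matters so much, a common first-order textbook model has off-state current falling by a factor of 10 for every "subthreshold swing" (in millivolts per decade) of threshold voltage. The sketch below uses that model only; the current at threshold, the 100-mV/decade swing and the 0.30-V nominal threshold are illustrative assumptions, not figures from any manufacturer named here.

```python
import math


def off_current(i_at_threshold, v_th, swing_mv_per_decade):
    """First-order leakage at zero gate bias.

    Leakage drops one decade for every `swing_mv_per_decade` millivolts
    of threshold voltage (a smaller swing means a steeper slope).
    """
    decades = v_th * 1000.0 / swing_mv_per_decade
    return i_at_threshold * 10.0 ** (-decades)


def threshold_for_leakage(i_at_threshold, i_off_target, swing_mv_per_decade):
    """Threshold voltage (volts) needed to hold leakage at a target value."""
    return swing_mv_per_decade * math.log10(i_at_threshold / i_off_target) / 1000.0


if __name__ == "__main__":
    I_TH = 1e-7  # hypothetical drain current at threshold, in amps
    # A +/-50 mV spread around an assumed 0.30 V nominal threshold.
    for v_th in (0.25, 0.30, 0.35):
        leak = off_current(I_TH, v_th, 100.0)
        print(f"Vth = {v_th:.2f} V -> Ioff = {leak:.2e} A")
```

Under these assumptions a 100-mV spread in threshold translates into a tenfold spread in leakage, and a steeper slope (a smaller swing value) lets the same leakage target be met at a lower threshold.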

Intel claims to have a better solution—one that it has spent almost a decade perfecting. Intel's 3-D FinFET transistor architecture, which it calls tri-gate, wraps three metal gates around a transistor's channel in 3-D, thus permeating it with the gates' electrical field. The technique nullifies process variations that have blocked operating voltages below 1 V.

"We have already demonstrated that our tri-gate structures can reduce operating voltage into the 0.7-volt range, and we can still go lower," said Intel senior fellow Mark Bohr. "These are fully depleted transistors with a steeper subthreshold slope, allowing them to shut off faster with less leakage and to turn on at a lower threshold voltage."

Semiconductor makers with deep pockets are looking to emulate Intel's 3-D architecture, but a few startups are working on new types of planar processes aimed at restarting voltage scaling for semiconductor makers without the time and money to perfect 3-D. SuVolta Inc. (Los Gatos, Calif.), for instance, has invented an ultralow-voltage planar process for standard CMOS lines.

Instead of using 3-D gates to deplete the transistors, SuVolta uses an undoped channel (with doped threshold and guard bands) that sidesteps variations in doping. The deeply depleted channel process can be implemented on standard planar CMOS lines.

"By using our planar deeply depleted channel process, we have demonstrated that supply voltages can be reduced to 0.6 volt today and even lower tomorrow," said Scott Thompson, chief technology officer at SuVolta.

The company's first licensee is Fujitsu Semiconductor (Tokyo), which will go into mass production later this year. Further announcements of major licensing deals are promised later in 2012.



By going to an undoped transistor channel (center, white, above a lightly doped threshold region, light green, and heavily doped screening region, dark green), SuVolta's planar CMOS process promises to put semiconductor voltage scaling back on track after years of stagnation. (Source: SuVolta)

Scale performance

In general terms, the lower the supply voltage and clock speed, the lower the power consumption. Performance, however, can suffer too. As a result, the latest microcontrollers and SoCs have resorted to smart power management units, which adapt the operating voltage and clock speed to match the workload.
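The relationship can be made concrete with the standard first-order model for CMOS switching power, P = a x C x V^2 x f, in which power falls with the square of the supply voltage and linearly with clock speed. The sketch below is a rough illustration only: the 1-nF effective capacitance, the clock rates and the two operating points are hypothetical, and leakage power is ignored.

```python
def dynamic_power(c_eff_farads, v_supply, f_clock_hz, activity=1.0):
    """Switching-power estimate in watts: activity * C * V^2 * f."""
    return activity * c_eff_farads * v_supply ** 2 * f_clock_hz


def relative_power(v_new, v_old, f_new=1.0, f_old=1.0):
    """Ratio of dynamic power after scaling voltage and clock, all else equal."""
    return (v_new / v_old) ** 2 * (f_new / f_old)


if __name__ == "__main__":
    # Hypothetical block: 1 nF of effective switched capacitance.
    full = dynamic_power(1e-9, 1.2, 1.0e9)  # 1.2 V at 1 GHz
    slow = dynamic_power(1e-9, 0.9, 0.5e9)  # 0.9 V at 500 MHz
    print(f"full speed: {full:.3f} W, scaled back: {slow:.3f} W")
    print(f"ratio: {relative_power(0.9, 1.2, 0.5e9, 1.0e9):.3f}")
```

Because voltage enters as a square, even a modest reduction in supply voltage pays off more than the same percentage reduction in clock speed, which is why power managers try to lower both together.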

"The basic idea of power management is to scale the supply voltage and clock speeds of different parts of a chip separately, to match their workload at any given time and to turn off circuitry that is not being used," said Tyson Tuttle, CTO at Silicon Laboratories Inc. (Austin, Texas).

Power management units are usually implemented as state-machine blocks that can selectively lower both voltage and clock speed for noncritical functions. But as more transistors are crammed onto chips at advanced semiconductor nodes, the concept of "dark silicon," wherein most of a chip is powered down until needed, may be the harbinger of future semiconductors.
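As a rough sketch of the dark-silicon idea, the chip can be modeled as a list of special-purpose blocks, only some of which are lit at any one time, with the lit set checked against a power budget. The block names, wattages and the `light_up` helper are hypothetical, chosen for illustration rather than taken from any real SoC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    active_watts: float  # hypothetical power draw when the block is lit


def light_up(blocks, needed, budget_watts):
    """Power only the needed blocks; everything else stays dark.

    Returns (lit names, dark names, total watts). Raises ValueError if
    the needed blocks alone would exceed the power budget.
    """
    lit = [b for b in blocks if b.name in needed]
    total = sum(b.active_watts for b in lit)
    if total > budget_watts:
        raise ValueError(f"needed blocks draw {total:.2f} W, budget is {budget_watts:.2f} W")
    dark = [b.name for b in blocks if b.name not in needed]
    return [b.name for b in lit], dark, total


if __name__ == "__main__":
    soc = [Block("cpu", 1.0), Block("video", 0.8), Block("modem", 0.6), Block("crypto", 0.3)]
    lit, dark, watts = light_up(soc, {"video", "modem"}, budget_watts=2.0)
    print(f"lit: {lit}, dark: {dark}, drawing {watts:.1f} W")
```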

"At advanced nodes beyond, say, 22 nanometers, SoCs are going to have many more transistors than can be turned on at the same time," said Ely Tsern, chief technologist at Rambus Inc. (Sunnyvale, Calif.) "The concept of dark silicon is to create many special-purpose functions on a chip but only turn on the ones that are needed at any one time, with the rest staying dark, doing nothing."

Intel is leading the way in on-chip power management by carefully monitoring the temperature of its cores at all times, allowing both overclocking (turbo mode) to boost performance and underclocking to save power.
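The general shape of such temperature-driven clocking can be sketched as a small control rule: step the clock multiplier up when the core is busy and cool, step it down when the core is hot or idle. This is only an illustration of the idea, not Intel's algorithm; the thresholds, multiplier limits and the `next_multiplier` function are all assumptions.

```python
def next_multiplier(current, temp_c, busy,
                    turbo_max=14, floor=4, hot_c=90.0, cool_c=70.0):
    """Pick the next clock multiplier from temperature and load.

    All limits are hypothetical: `turbo_max` is the highest overclocked
    step, `floor` the lowest underclocked step.
    """
    if temp_c >= hot_c:
        return max(floor, current - 1)      # overheating: back off
    if not busy:
        return max(floor, current - 1)      # idle: underclock to save power
    if temp_c < cool_c:
        return min(turbo_max, current + 1)  # busy with headroom: step toward turbo
    return current                          # busy but warm: hold steady


if __name__ == "__main__":
    multiplier = 10
    # Hypothetical trace of (temperature in deg C, busy?) samples.
    for temp, busy in [(60, True), (65, True), (85, True), (92, True), (70, False)]:
        multiplier = next_multiplier(multiplier, temp, busy)
        print(f"{temp} C, busy={busy} -> multiplier {multiplier}")
```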

But not all power management functions can be economically moved onto a chip. In fact, the most intelligent power management schemes split the task between on-chip and external power management units. "There is always going to be a need for external power management, because what you can bring onto the chip is limited in terms of power density," said Ashraf Lotfi, CTO and co-founder of Enpirion Inc. (Hampton, N.J.)

Enpirion specializes in producing free-standing power management units, which can accept commands from a processor to lower its voltage as it enters sleep mode, for instance, then ramp it back up when it awakes.



Intel's turbo mode overclocks cores for bursts of speed during heavy workloads, then monitors their temperature and slows them down when they begin to overheat. (Source: Intel)



Adopt 3-D/optical interconnect

Shortening the length and lowering the resistance of interconnection lines can vastly reduce the power consumption on ICs by enabling smaller driver transistors. The traditional way to shorten them is to add layers of metallization, with the result that some chips today have as many as 10 metal layers.

The latest innovation in interconnection layering, however, is the 3-D through-silicon-via (TSV), which allows memory chips to be stacked on top of processors. The technique reduces the length of interconnects to the distance between chips, rather than requiring power-hungry driver transistors and long printed-circuit board interconnection lines. The economic impact of TSVs, however, has been daunting, prompting most chip makers to delay their implementation.

"While it is true that TSVs reduce power by shortening the wire length, they are a very costly solution," said TI's Greenhill. "To become economical, TSVs need to be an enabler for other gaps, such as interface performance, to justify their cost."

One company intimately familiar with TSVs' cost/performance trade-off is Xilinx Inc. (San Jose, Calif.), which is delivering the first commercial chips to use TSVs. The company's cost-effective approach not only lowers power but also boosts performance compared with soldering separate devices on pc boards. It also lowers the BOM costs for Xilinx's customers, said Ephrem Wu, senior director of FPGAs at Xilinx.

Xilinx sidesteps soldering separate FPGAs onto boards by using a silicon interposer that can interconnect four high-density FPGAs inside a single package.

The approach boosts performance while lowering power to 19 watts, compared with 112 W for a conventional pc board solution.

Another bleeding-edge technique is to use optical transceivers. IBM Corp.'s Power7 supercomputer, for instance, uses on-board photonic interconnects made from conventional optical components. Future ICs will likely use specialized optical solutions from Kotura (Monterey Park, Calif.) and others that have transferred photonic functions onto tiny optical chips that can be bonded to processors and memory chips.

"Our low-power silicon-germanium devices integrate the lenses, filters, modulators and all the other optical components you need onto a single chip," said Arlon Martin, vice president of marketing at the company.

Kotura's silicon photonics process allows it to integrate the optical transceivers from a cigarette-pack-sized, $10,000 conventional unit into a streamlined, iPhone-sized $500 package that uses four to 20 times less power. Kotura has also demonstrated that its SiGe transceivers can send optical signals through the air between stacked CMOS dice, essentially creating a high-speed, low-power optical data channel between stacked chips in lieu of pc board traces.


Xilinx uses a silicon interposer by TSMC to interconnect four of its FPGAs inside the package, thereby lowering the power from 112 watts to 19 W. (Source: Xilinx)

Try new materials

Going to higher-mobility materials will also reduce power. Magnetic materials are already being added to standard CMOS lines, and "miracle" materials like carbon nanotubes and graphene are on the horizon.

TI added magnetic materials to its CMOS lines in order to manufacture embedded microcontrollers with ferroelectric RAM. Licensed from Ramtron International Corp. (Colorado Springs, Colo.), FRAMs are more convenient than flash memories, since they are nonvolatile but also random access.

"Our nonvolatile FRAMs are more efficient to read or write in terms of energy consumption compared with flash," said Baher Haroun, CTO of the wireless business unit at TI.

Enpirion, too, has introduced magnetic materials into its CMOS line, with which it plans to start manufacturing integrated inductors and transformers on its power management chips in 2012. Today, inductors and transformers cannot be economically integrated onto chips that have to operate at high frequencies, but Enpirion's proprietary magnetic material aims to solve that problem.

"We have combined different metal alloys together to allow our magnetic material to operate at very high frequencies while remaining energy efficient," said Enpirion's Lotfi.

Meanwhile, the Semiconductor Research Corp. (Research Triangle, N.C.) recently funded research at IBM and Columbia University to integrate inductors onto processors. The consortium claims that the integrated inductors will allow on-chip regulation to throttle supply voltage on a nanosecond timescale, enabling workload matching that cuts energy consumption by up to 20 percent.

Other materials to be added to CMOS lines in the near future include indium gallium arsenide. Intel plans to use InGaAs to supercharge the channel on future tri-gate transistors, a move it claims will allow operating voltages to drop as low as 0.5 V.

In the long term, however, carbon nanotubes and the planar version, graphene, are likely to become the materials of choice for ultralow-power future devices.

Graphene interconnects have outperformed copper in the lab at Georgia Tech. IBM has demonstrated that low-power, ultrahigh-speed transistors can be fabricated using either carbon nanotubes or graphene. And TI recently demonstrated that graphene can be fabricated at the wafer scale.

Intel, for its part, has investigated the use of carbon-based materials for their higher electron mobility but has concluded that they are not yet ready for prime time.

"Carbon-based interconnects, using nanotubes or graphene, have very intriguing properties," said Intel's Bohr. "But while the bulk material has lower resistance, contacting it does not have low resistance. However, it is a very promising material, so I expect we will see much more research on it in the coming years."



Enpirion's on-chip inductors are manufactured on silicon wafers using a proprietary manufacturing process and a unique magnetic alloy formula.




Sidebar: Smarter power management schemes

As a processor manufacturer that also makes its own external power management ICs, Freescale Semiconductor Inc. has the luxury of being able to optimize the partitioning of power management tasks between internal and external units.

"There's a lot of secret sauce involved in finely tuning power management functions for a particular application processor," said Rajeev Kumar, marketing manager for i.MX processors. "To reduce the complexity and cost of our external power management ICs, we have sucked into our latest processors the functions of dynamic voltage and frequency scaling, which matches power consumption to what an application is doing at any one time."

Freescale's i.MX-6, for instance, not only does voltage conversions on-chip to reduce the number of external power supply voltages needed, but also dynamically matches the frequency of up to four ARM Cortex-A9 cores by adjusting their supply voltage between 0.9 and 1.2 volts, or even by shutting down unused cores altogether. The on-chip power management functions also eliminate the need for external components to handle peripheral circuits and hardware accelerators, by turning them on and off as needed under program control.
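Dynamic voltage and frequency scaling of this kind is often described as a table of operating points from which the power manager picks the slowest one that still meets demand. The sketch below illustrates that general scheme and is not Freescale's implementation: the voltages span the 0.9-to-1.2-V range cited above, but the frequencies and the `select_operating_point` function are hypothetical.

```python
# (frequency in MHz, supply voltage in volts); frequencies are assumptions.
OPERATING_POINTS = [(400, 0.9), (800, 1.0), (1000, 1.1), (1200, 1.2)]


def select_operating_point(required_mhz, points=OPERATING_POINTS):
    """Return the lowest (MHz, V) point that satisfies the demand.

    Returns None when nothing is demanded, meaning the core can be shut
    down; demand above the top point is capped at the fastest point.
    """
    if required_mhz <= 0:
        return None
    for freq_mhz, volts in sorted(points):
        if freq_mhz >= required_mhz:
            return (freq_mhz, volts)
    return max(points)


if __name__ == "__main__":
    # Hypothetical per-core demand, in MHz, for a four-core part.
    for core, demand in enumerate([0, 300, 900, 2000]):
        print(f"core {core}: demand {demand} MHz -> {select_operating_point(demand)}")
```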

"By adding more on-chip power management functions, we have been able to reduce the need for external controllers down to a single power-management IC," said Michael Jennings, manager of power management ICs at Freescale. "And even though different applications have very different power requirements, our i.MX-6 reference designs show engineers how to significantly reduce their bill of materials to a single configurable and programmable power management IC.

"By scaling the power output as a single high-current voltage for single cores, or as up to four lower-current supplies for our dual- and quad-core processors, separate programmable voltages can be supplied to each core, while an array of switches and built-in regulators supply all the different voltages and currents needed for peripherals."

For Freescale's recently announced QorIQ dual-threaded e6500 Power Architecture Advanced Multiprocessing (AMP) processors with 24 virtual cores, a cascading power manager enables a fourfold performance increase, without a meltdown, by never running all cores at full throttle. The processor's automatic turbo mode selectively runs the cores doing the most critical tasks at higher frequencies; the cascading power manager ensures the power budget is not exceeded by throttling back the cores doing less critical tasks.
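One way to picture such a scheme is as a loop that starts every core at turbo and then throttles the least critical cores first until the chip fits its power budget. The sketch below is an assumption-laden illustration, not Freescale's cascading power manager: the power levels, priorities and the `throttle_to_budget` function are all hypothetical.

```python
# Hypothetical watts per frequency level; index 0 is slowest, 3 is turbo.
LEVEL_WATTS = [0.5, 1.0, 2.0, 3.0]


def throttle_to_budget(priorities, budget_watts, level_watts=LEVEL_WATTS):
    """Assign each core a frequency level so total power fits the budget.

    `priorities` maps core name -> priority (higher means more critical).
    Every core starts at turbo; cores are then stepped down one at a
    time, least critical first, until the budget is met.
    """
    levels = {name: len(level_watts) - 1 for name in priorities}

    def total_watts():
        return sum(level_watts[level] for level in levels.values())

    for name in sorted(priorities, key=lambda n: priorities[n]):
        while total_watts() > budget_watts and levels[name] > 0:
            levels[name] -= 1

    if total_watts() > budget_watts:
        raise ValueError("budget cannot be met even with every core at its lowest level")
    return levels


if __name__ == "__main__":
    plan = throttle_to_budget({"c0": 3, "c1": 2, "c2": 1, "c3": 0}, budget_watts=8.0)
    print(plan)
```

In this toy policy the two most critical cores keep running at turbo while the other two absorb all the throttling; a real policy could just as well spread the slowdown more evenly.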

"Cascading power management is a workload balancing technique" that's policy driven, said John Dixon, marketing lead for QorIQ AMP processors. "Designers pick the policies that appropriately mitigate the increase in energy consumption per their specific end equipment needs."

To support the formation of appropriate policies for power management, Freescale has invented a "drowsy" mode whereby cores can be powered down for energy conservation, while keeping their registers and local memory values intact so they can be quickly powered back up when needed.

"At lighter workloads, it is not efficient to allocate tasks to all cores equally," said Freescale senior SoC architect Ben Eckermann. "Instead, we allocate work to fewer cores and let the other cores enter drowsy mode using an autonomous self-balancing dynamic allocation without burdening the programmer with complicated allocation and monitoring tasks."
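The allocation Eckermann describes can be sketched as packing the workload onto as few cores as can carry it and letting the rest go drowsy. The sketch below illustrates that general idea only; the unit of "load," the per-core capacity and the `plan_cores` function are hypothetical, and keeping one core awake at zero load is an added assumption.

```python
import math


def plan_cores(total_load, n_cores, capacity_per_core=1.0):
    """Pack work onto the fewest cores; the remaining cores go drowsy.

    Returns (list of per-core states, load carried by each awake core).
    One core always stays awake so that new work can be accepted.
    """
    if total_load < 0:
        raise ValueError("load cannot be negative")
    if total_load > n_cores * capacity_per_core:
        raise ValueError("load exceeds the capacity of all cores combined")
    awake = max(1, math.ceil(total_load / capacity_per_core))
    states = ["awake"] * awake + ["drowsy"] * (n_cores - awake)
    return states, total_load / awake


if __name__ == "__main__":
    for load in (0.0, 0.4, 1.5, 4.0):
        states, per_core = plan_cores(load, n_cores=4)
        print(f"load {load}: {states}, {per_core:.2f} per awake core")
```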


Article Courtesy: EE Times
