Ultra-low power digital SoCs and memories

Further evolution of applications in robotics, autonomous vehicles, biomedical wearables and so on relies on ultra-low energy consumption of the electronic circuits they encompass. Ever-increasing leakage and technological variability are the critical phenomena. In addition, the way the design abstraction layers, needed to deal with ever-growing complexity, are structured causes energy overhead. Designers are thus forced to rethink their design strategies. The classical strategy of adding margins to keep the design targets within the required windows leads to prohibitive oversizing, with all the extra energy and area cost that comes with it. Over the last decade MICAS has been exploring design strategies for digital processors and memories to deal with new technology properties on the one hand and ever more demanding application requirements on the other.


Research challenges

Error detection and correction (EDAC) based timing for digital circuits

Timing closure today is based on (statistical) static timing analysis. Basically, this means that the delay of the critical path is required to stay below the clock period minus the sequencing overhead in the worst case. When all timing derating (process, supply voltage, temperature, aging, local variability) is added up, this leads to oversized designs with huge amounts of hold buffers and an overrated power supply voltage. This can partially be mitigated by adding replica timing detectors that drive a voltage and frequency adaptation system. Replicas, however, are ineffective against time-varying or intra-die variability. In-situ error detection and correction can take the effectiveness of voltage and frequency scaling much further, as our recent publications have shown. In MICAS we investigate several techniques for EDAC. A first focus is the effectiveness of the techniques. For this, different kinds of timing detection are considered, e.g., end-point-based or activity-based detection. A second focus is the automatic insertion of EDAC circuits. The goal is to make EDAC insertion a fully automated step in the design flow.
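The margin stack-up described above can be made concrete with a small numerical sketch. It assumes, purely for illustration, that each derating source acts as a multiplicative factor on the nominal critical path delay; the factor values, the `Derating` class and the `setup_slack` helper are hypothetical and not taken from any MICAS design flow.

```python
from dataclasses import dataclass


@dataclass
class Derating:
    """Hypothetical multiplicative delay derates (1.0 means no penalty)."""
    process: float = 1.10
    voltage: float = 1.08
    temperature: float = 1.05
    aging: float = 1.04
    local_variability: float = 1.07

    def total(self) -> float:
        # Worst-case sign-off assumes all penalties occur at the same time.
        return (self.process * self.voltage * self.temperature
                * self.aging * self.local_variability)


def setup_slack(clock_period_ns: float,
                sequencing_overhead_ns: float,
                nominal_path_delay_ns: float,
                derate: float = 1.0) -> float:
    """Slack = clock period - sequencing overhead - derated critical path delay."""
    return clock_period_ns - sequencing_overhead_ns - nominal_path_delay_ns * derate


if __name__ == "__main__":
    d = Derating()
    print(f"stacked derate:   {d.total():.3f}")
    print(f"nominal slack:    {setup_slack(10.0, 1.0, 7.0):.2f} ns")
    print(f"worst-case slack: {setup_slack(10.0, 1.0, 7.0, d.total()):.2f} ns")
```

With these assumed numbers, a path that has 2 ns of slack at nominal conditions violates timing by roughly 0.7 ns once all derates are stacked, so the design would have to be upsized or the supply voltage raised to sign off.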

Overview of completion detection timing error detection: a late signal going from D0 to D2 causes a detected error
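As a rough behavioural illustration of in-situ timing error detection, the sketch below flags every cycle in which the data at an end point still toggles shortly after the capturing clock edge. It is a generic detection-window model under stated assumptions, not the specific completion detection circuit of the figure; the function name, the window length and the arrival times are hypothetical.

```python
def detect_timing_errors(arrival_times_ns, clock_period_ns, window_ns):
    """Return the cycle indices whose data transition is flagged as late.

    arrival_times_ns: per-cycle arrival time of the last data transition,
                      measured from the launching clock edge (assumed input).
    A transition is flagged when it lands after the capturing edge but
    still inside the detection window that follows that edge.
    """
    errors = []
    for cycle, arrival in enumerate(arrival_times_ns):
        lateness = arrival - clock_period_ns
        if 0.0 < lateness <= window_ns:
            errors.append(cycle)
    return errors


if __name__ == "__main__":
    # Cycle 1 is slightly late and gets flagged; cycle 3 is later than
    # the window covers and would go unnoticed in this simple model.
    print(detect_timing_errors([8.0, 10.4, 9.9, 11.5],
                               clock_period_ns=10.0, window_ns=1.0))
```

The undetected last cycle shows why, in such a model, the window length has to cover the largest delay excursion that the voltage and frequency scaling loop is allowed to provoke.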

Advanced SRAM architectures

In many modern systems the memories are the dominant energy consumer. This is further aggravated by technological leakage and variability, which drives the quest for ever more efficient memory circuits. In MICAS we focus on SRAM. We conceive advanced memory matrix architectures accompanied by ultra-low leakage periphery. Contrary to popular belief, these memory matrices are not most efficient at the lowest voltage, as SRAMs are leakage-dominated circuits. This quest will continue in the coming years: pushing down the leakage with circuit optimizations that mitigate variability.
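Why a leakage-dominated memory is not most efficient at the lowest voltage can be seen in a toy energy model. The sketch below assumes a dynamic energy proportional to V squared, an access time following a simple alpha-power law, and a constant leakage current; all parameter values are hypothetical, the units are arbitrary, and the model says nothing quantitative about real SRAM macros.

```python
def energy_per_access(v, c_eff=1.0, leak_factor=0.05, v_t=0.3, alpha=1.5):
    """Toy energy (arbitrary units) of one access at supply voltage v.

    All parameters are illustrative assumptions:
    c_eff       effective switched capacitance
    leak_factor leakage current times a delay constant
    v_t, alpha  threshold voltage and exponent of the alpha-power delay law
    """
    if v <= v_t:
        raise ValueError("model only valid above the threshold voltage")
    dynamic = c_eff * v ** 2
    access_time = v / (v - v_t) ** alpha      # slows down sharply near v_t
    leakage = leak_factor * v * access_time   # leakage power times access time
    return dynamic + leakage


def minimum_energy_voltage(v_min=0.35, v_max=1.0, step=0.01):
    """Grid search for the supply voltage with the lowest energy per access."""
    n = int(round((v_max - v_min) / step)) + 1
    voltages = [v_min + i * step for i in range(n)]
    return min(voltages, key=energy_per_access)


if __name__ == "__main__":
    v_opt = minimum_energy_voltage()
    print(f"minimum-energy voltage: {v_opt:.2f} V")
    print(f"energy at 0.35 V: {energy_per_access(0.35):.3f}")
    print(f"energy at optimum: {energy_per_access(v_opt):.3f}")
```

In this model the optimum lands around 0.45 V rather than at the lowest voltage on the grid: below that point the access becomes so slow that the leakage integrated over the access time outweighs the savings in dynamic energy.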

In-memory computing for machine learning

Machine learning becomes more ubiquitous every day. Neural-network-based algorithms run in server farms as well as on smartphones. They can even be found in the most advanced hearing aids. No wonder that the call for energy efficiency sounds ever louder. To reduce energy in ML, a closer marriage between computing logic and local storage is an attractive option. However, the trade-off between area, energy and performance for these emerging in-memory compute systems is, and will be for a long time, subject to research. Are we going for digital or analog in-memory compute? How does this question map onto a precision axis? Higher accuracies will require digital precision, but how much memory do we actually implement? The answers to these research questions depend on the neural network and thus on the application at hand. Furthermore, not only inference must be considered: real-time learning will also enter the game. Advanced circuit research at MICAS will have to provide the answers.
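The precision question can be illustrated with a deliberately simplified comparison between an exact digital multiply-accumulate and an analog-style accumulation whose result is read out through an ADC of limited resolution. The model assumes that readout quantization is the only error source (no noise, mismatch or nonlinearity); the function names, the `full_scale` range and the operand values are hypothetical.

```python
def digital_mac(weights, activations):
    """Exact integer dot product, as a digital in-memory MAC would produce."""
    return sum(w * a for w, a in zip(weights, activations))


def analog_mac(weights, activations, adc_bits, full_scale):
    """Dot product read out through an idealized ADC (assumed model).

    The accumulated value is clipped to [-full_scale, +full_scale] and
    rounded to the nearest of 2**adc_bits - 1 uniform steps.
    """
    exact = digital_mac(weights, activations)
    step = 2 * full_scale / (2 ** adc_bits - 1)
    clipped = max(-full_scale, min(full_scale, exact))
    code = round((clipped + full_scale) / step)
    return code * step - full_scale


if __name__ == "__main__":
    w = [1, -2, 3, 4]
    x = [5, 6, -7, 8]
    print("digital:", digital_mac(w, x))
    for bits in (4, 6, 8):
        print(f"analog, {bits}-bit ADC:", analog_mac(w, x, bits, full_scale=64))
```

In this idealized setting the readout error is bounded by half an ADC step as long as the sum stays within the full-scale range, so every extra bit of required accuracy translates directly into a finer, and typically more expensive, readout.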

Processor and co-processor design for ULP systems

Embedded systems need a high degree of programmability while keeping their energy footprint down. This calls for embedded processors customized towards specific classes of workloads. MICAS is active both in extending RISC-V cores with custom accelerators and in adapting processors and their periphery towards ULP operation. Unconventional compute architectures, such as coarse-grained reconfigurable arrays and other non-von Neumann structures, are also part of the exploration landscape.

Current research topics

Activity-independent variability resilience for complex ultra-low voltage digital ICs
Ultra-low power digital SoCs and memories
Clara Nieto Taladriz Moreno | Wim Dehaene
HW-algorithm co-design for Bayesian inference of probabilistic machine learning
Ultra-low power digital SoCs and memories, Hardware-efficient AI and ML
Shirui Zhao | Marian Verhelst
Automated in-situ monitoring for variability resilient and energy efficient digital circuits
Ultra-low power digital SoCs and memories
Clara Nieto Taladriz Moreno | Wim Dehaene
Optimized deployment of AI algorithms on rapidly-changing heterogeneous multi-core compute platforms
Ultra-low power digital SoCs and memories, Hardware-efficient AI and ML
Josse Van Delm | Marian Verhelst
High-throughput high-efficiency SRAM for neural networks
Ultra-low power digital SoCs and memories, Hardware-efficient AI and ML
Wim Dehaene and Marian Verhelst
Ultrasound wave based body area networks
Analog and power management circuits, Ultra-low power digital SoCs and memories, Biomedical circuits and sensor interfaces
Wim Dehaene and Marian Verhelst
SRAM Macro Design
Ultra-low power digital SoCs and memories
Bob Vanhoof | Wim Dehaene

Innovative chips

BRUTUS
Technology: 22nm FD-SOI
Published: Transactions on Circuits and Systems I: Regular papers
Application: Ultra low leakage power SRAM in a RISC-V microcontroller
CAESAR
Technology: 22nm FD-SOI
Published: Solid State Circuits Letters
Application: Ultra low leakage SRAM using replica monitoring
Low power variability resilient and energy efficient design
Technology: 22nm CMOS
Published: IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Application: Low-power digital circuits
Wide supply range 0.17 pJ operation multiply-accumulate unit
Technology: 28nm UTBB FD-SOI
Published: ASSCC’14
Application: Low-power microcontroller for e.g. the internet-of-things

Top publications

  1. Rooseleer B., et al., “A 65 nm, 850 MHz, 256 kbit, 4.3 pJ/access, Ultra Low Leakage Power Memory Using Dynamic Cell Stability and a Dual Swing Data Link”, IEEE Journal of Solid-State Circuits, vol. 44, no. 7, pp. 1784-1796.
  2. Reyserhove H., Dehaene W., “Margin Elimination Through Timing Error Detection in a Near-Threshold Enabled 32-bit Microcontroller in 40 nm CMOS”, IEEE Journal of Solid-State Circuits, vol. 53, no. 7, 2018, pp. 2101-2113.
  3. Reynders N., Dehaene W., “Variation-Resilient Building Blocks for Ultra-Low-Energy Sub-Threshold Design”, IEEE Transactions on Circuits and Systems II, vol. 59, no. 12, 2012, pp. 898-902.
  4. Uytterhoeven R., “Design Margin Reduction Through Completion Detection in a 28-nm Near-Threshold DSP Processor”, IEEE Journal of Solid-State Circuits, 2021.
Get in touch with our lead researchers

Interested in working together?

Wim Dehaene
Academic staff