Optimized deployment of AI algorithms on rapidly changing heterogeneous multi-core compute platforms

Josse Van Delm | Marian Verhelst
Ultra-low power digital SoCs and memories, Hardware-efficient AI and ML

As the demand for intelligent devices continues to rise, the need for specialized programmable machine-learning hardware becomes increasingly apparent. However, the highly specific nature of this hardware makes it challenging to program and deploy. To address this issue, multi-level machine learning (ML) compilers such as Apache TVM and MLIR-based compilation flows have emerged as effective solutions, capable of automatically generating optimized ML compute kernels for a wide range of hardware, including CPUs, GPUs, and custom NPUs.
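
To illustrate what such a compiler flow automates, the sketch below uses Apache TVM's tensor-expression (te) API to describe a small matrix-multiply kernel and compile it for a CPU target; retargeting the same description to a GPU or a custom accelerator is, in principle, a matter of changing the target and schedule. This is a minimal sketch following TVM's public tutorials, not part of this project's own flow; the matrix sizes are arbitrary, and exact module names and scheduling APIs differ between TVM releases.

```python
import numpy as np
import tvm
from tvm import te

# Describe a small matrix multiplication C = A @ B symbolically.
M, K, N = 128, 128, 128
A = te.placeholder((M, K), name="A", dtype="float32")
B = te.placeholder((K, N), name="B", dtype="float32")
k = te.reduce_axis((0, K), name="k")
C = te.compute((M, N), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k), name="C")

# A default schedule; TVM's auto-tuners would search for a better one per target.
s = te.create_schedule(C.op)

# Compile for a generic CPU target; swapping the target string retargets the kernel.
func = tvm.build(s, [A, B, C], target="llvm", name="matmul")

# Run the generated kernel and check it against NumPy.
dev = tvm.cpu(0)
a = tvm.nd.array(np.random.rand(M, K).astype("float32"), dev)
b = tvm.nd.array(np.random.rand(K, N).astype("float32"), dev)
c = tvm.nd.array(np.zeros((M, N), dtype="float32"), dev)
func(a, b, c)
np.testing.assert_allclose(c.numpy(), a.numpy() @ b.numpy(), rtol=1e-4)
```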

Despite their efficacy, porting these compilers to new hardware remains a complex and time-consuming task that requires extensive knowledge of both the target hardware and the software algorithms. This research leverages multi-level compiler technology such as MLIR to better integrate the hardware and software development flows, enabling a tighter coupling between compilers and the ML compute hardware they target.

The ultimate goal of this research is to improve the utilization of new compute hardware, making machine learning more efficient and more widely accessible. Through tighter integration of the hardware and software development flows, we aim for more efficient hardware, more efficient compilers, and ultimately more efficient ML compute, with the potential to bring machine learning to a much wider range of scenarios.

Get in touch
Josse Van Delm
PhD student
Marian Verhelst
Academic staff

Other research topics in Ultra-low power digital SoCs and memories and Hardware-efficient AI and ML

Anda: Unlocking Efficient LLM Inference with a Variable-Length Grouped Activation Data Format
Hardware-efficient AI and ML
Man Shi, Arne Symons, Robin Geens, and Chao Fang | Marian Verhelst
Massive parallelism for combinatorial optimisation problems
Hardware-efficient AI and ML
Toon Bettens and Sofie De Weer | Wim Dehaene and Marian Verhelst
Carbon-aware Design Space Exploration for AI Accelerators
Hardware-efficient AI and ML
Jiacong Sun | Georges Gielen and Marian Verhelst
Decoupled Control Flow and Memory Orchestration in the Vortex GPGPU
Hardware-efficient AI and ML
Giuseppe Sarda | Marian Verhelst
Automated Causal CNN Scheduling Optimizer for Real-Time Edge Accelerators
Hardware-efficient AI and ML
Jun Yin | Marian Verhelst
A Scalable Heterogeneous Multi-accelerator Platform for AI and ML
Hardware-efficient AI and ML
Ryan Antonio | Marian Verhelst
Uncertainty-Aware Design Space Exploration for AI Accelerators
Hardware-efficient AI and ML
Jiacong Sun | Georges Gielen and Marian Verhelst
Activity-independent variability resilience for complex ultra-low voltage digital ICs
Ultra-low power digital SoCs and memories
Clara Nieto Taladriz Moreno | Wim Dehaene
Integer GEMM Accelerator for SNAX
Hardware-efficient AI and ML
Xiaoling Yi | Marian Verhelst
Improving GPGPU microarchitecture for future AI workloads
Hardware-efficient AI and ML
Giuseppe Sarda | Marian Verhelst
SRAM-based digital in-memory compute macro in 16nm
Hardware-efficient AI and ML
Weijie Jiang | Wim Dehaene
Scalable large array nanopore readouts for proteomics and next-generation sequencing
Analog and power management circuits, Hardware-efficient AI and ML, Biomedical circuits and sensor interfaces
Sander Crols | Filip Tavernier and Marian Verhelst
Design space exploration of in-memory computing DNN accelerators
Hardware-efficient AI and ML
Pouya Houshmand and Jiacong Sun | Marian Verhelst
Multi-core architecture exploration for layer-fused deep learning acceleration
Hardware-efficient AI and ML
Arne Symons | Marian Verhelst
HW-algorithm co-design for Bayesian inference of probabilistic machine learning
Ultra-low power digital SoCs and memories, Hardware-efficient AI and ML
Shirui Zhao | Marian Verhelst
Design space exploration for machine learning acceleration
Hardware-efficient AI and ML
Arne Symons | Marian Verhelst
Enabling Fast Exploration of the Depth-first Scheduling Space for DNN Accelerators
Hardware-efficient AI and ML
Arne Symons | Marian Verhelst
Automated in-situ monitoring for variability resilient and energy efficient digital circuits
Ultra-low power digital SoCs and memories
Clara Nieto Taladriz Moreno | Wim Dehaene
High-throughput high-efficiency SRAM for neural networks
Ultra-low power digital SoCs and memories, Hardware-efficient AI and ML
Wim Dehaene and Marian Verhelst
Ultrasound wave based body area networks
Analog and power management circuits, Ultra-low power digital SoCs and memories, Biomedical circuits and sensor interfaces
Wim Dehaene and Marian Verhelst
Heterogeneous Multi-core Systems-on-Chip for Ultra-Low-Power Machine Learning Applications at the Edge
Hardware-efficient AI and ML
Pouya Houshmand, Giuseppe Sarda, and Ryan Antonio | Marian Verhelst
SRAM Macro Design
Ultra-low power digital SoCs and memories
Bob Vanhoof | Wim Dehaene

Want to work with us?

Get in touch or discover the ways we can collaborate.