Hardware-algorithm Co-design and Accelerator Architecture Exploration for hybrid DNN and DSP Workloads

Jun Yin, Marian Verhelst

Hardware-efficient AI and ML

Research Goal: Hybrid algorithm structures that combine suitable digital signal processing (DSP) pre-/post-processing with lightweight deep neural network (DNN) models have proved beneficial in many application domains. Such hybrid ML models typically give up model regularity in exchange for a reduced memory and computational footprint. To facilitate the hardware-algorithm co-design of these hybrid DSP/DNN algorithms targeting resource-constrained embedded platforms, early-design-stage modeling and design space exploration are necessary to find optimal hardware architectures for efficient deployment.

This research is being carried out in a joint project with Bosch within the EU Marie-Curie Project I-SPOT, within the application domain of automotive acoustic perception.

Gap in SotA: Although many dedicated homogeneous or heterogeneous accelerator systems have been proposed in the adjacent domains of natural language processing and indoor acoustic reasoning, there is little research on either hardware or algorithmic solutions for outdoor automotive scenarios. Rapid prototyping and modeling are therefore required to enable iterative optimization across the fast-paced algorithm design and hardware domains. Several domain-specific toolchains exist that target rapid reconfigurable hardware generation and for-loop-based dataflow optimization. Yet the end-to-end support of these tools for hybrid DNN-DSP workloads still requires heavy manual tweaking.

Progress Updates: The first stage of research explored the typical complexity and workflow of outdoor acoustic applications built as hybrid DSP+DNN pipelines. We started from a CNN-based model that uses SRP-PHAT features to perform robust sound source localization in noisy and reverberant environments. This allowed us to jointly evaluate algorithm accuracy and hardware overhead in a combined DSP-DNN design space. Across multiple system parameter cases, we compressed the hybrid algorithm to reduce computational complexity by 10.32% to 73.71% and DNN weights by 59.77% to 94.66% relative to the baseline, while remaining competitive with state-of-the-art accuracy.
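
To give a rough sense of the DSP front-end such a hybrid pipeline relies on, the sketch below computes a GCC-PHAT cross-correlation between two microphone signals with NumPy. SRP-PHAT aggregates pairwise correlations of this kind over the microphone pairs of an array. This is only an illustrative sketch, not the implementation used in this project: the function name `gcc_phat`, its parameters, and the synthetic test signal are assumptions made for the example.

```python
import numpy as np


def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (seconds) of `sig` relative to `ref` via GCC-PHAT.

    Hypothetical helper for illustration; not taken from the project code.
    """
    n = len(sig) + len(ref)              # zero-pad to avoid circular wrap-around
    sig_f = np.fft.rfft(sig, n=n)
    ref_f = np.fft.rfft(ref, n=n)

    cross = sig_f * np.conj(ref_f)       # cross-power spectrum
    cross /= np.abs(cross) + 1e-12       # PHAT weighting: keep only the phase

    cc = np.fft.irfft(cross, n=n)        # generalized cross-correlation

    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that the lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs, cc


if __name__ == "__main__":
    fs = 16000                            # assumed sampling rate in Hz
    rng = np.random.default_rng(0)
    ref = rng.standard_normal(1024)
    # Synthetic second channel: the same noise delayed by 5 samples.
    sig = np.concatenate((np.zeros(5), ref[:-5]))
    tau, _ = gcc_phat(sig, ref, fs)
    print(f"estimated delay: {tau * fs:.0f} samples")
```

The FFT-based correlation is one reason this kind of front-end stresses an accelerator differently from the convolution layers that follow it, which is the irregularity the co-design flow has to account for.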

As the next step, we are integrating finer-grained algorithmic scheduling into the workflow, together with hardware overhead estimation, so that the flow can search for the optimal accelerator architecture configuration for DSP/DNN systems.

Get in touch

Jun Yin

PhD student

Marian Verhelst

Academic staff

Publications about this research topic

Jun Yin and Marian Verhelst; "CNN-based Robust Sound Source Localization with SRP-PHAT for the Extreme Edge"; Accepted to Transactions on Embedded Computing Systems.
