Research goals: Modern compute platforms for machine learning are moving from single-core architectures towards heterogeneous multi-core systems, which combine richer datapath blocks, more complex memory hierarchies and more flexible interconnects. One example of such a heterogeneous system for AI applications is KU Leuven’s Diana chip, which combines a RISC-V processor, a purely digital accelerator and an analog In-Memory Computing core. Although the hardware becomes more efficient, the growing design freedom and the complex interactions within multi-core systems make run-time performance stochastic rather than deterministic. To rapidly explore system design choices under different sources of variation, an uncertainty-aware design space exploration (DSE) framework is crucial for estimating hardware-level trade-offs.
Gap in the SotA: State-of-the-art DSE frameworks are all based on deterministic cost models and overlook the uncertainties that arise within a system, such as process, voltage and temperature (PVT) variations, run-time memory conflicts and diverse workload-dependent dataflows. These unrealistic assumptions create a discrepancy between model estimates and real chip performance, which prevents researchers from accurately understanding the impact of uncertainty on hardware design.
Result: The project first evaluates the impact of sparsity uncertainty on system behavior and performance. By modeling the covariance dependence across different AI accelerator layers and inferences, the sparsity distribution can be accurately extracted from a small sample set. The developed analytical and stochastic model then accurately links hardware performance to the sparsity uncertainty. Case studies show that the performance variation can reach up to 50% per layer and 12% per inference.
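Illustration: The idea of propagating correlated sparsity uncertainty through a cost model can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the project's actual model: the layer sizes, sparsity means, standard deviations, correlation matrix, number of processing elements and the ideal zero-skipping latency formula are all hypothetical. Per-layer sparsity is drawn from a multivariate normal distribution through a Cholesky factor of the covariance matrix, and the relative spread of latency is reported per layer and per inference.

```python
import math
import random
import statistics

# Hypothetical three-layer workload (all numbers are illustrative assumptions).
MACS = [1.2e8, 2.4e8, 0.6e8]   # dense MAC count per layer
MEAN = [0.5, 0.6, 0.7]         # mean sparsity per layer
STD = [0.08, 0.10, 0.06]       # sparsity standard deviation per layer
CORR = [[1.0, 0.6, 0.3],
        [0.6, 1.0, 0.5],
        [0.3, 0.5, 1.0]]       # assumed cross-layer correlation
COV = [[CORR[i][j] * STD[i] * STD[j] for j in range(3)] for i in range(3)]


def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_sparsity(mean, chol, rng):
    """Draw one correlated sparsity vector, clipped to the range [0, 0.95]."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    out = []
    for i, m in enumerate(mean):
        value = m + sum(chol[i][k] * z[k] for k in range(i + 1))
        out.append(min(max(value, 0.0), 0.95))
    return out


def layer_cycles(macs, sparsity, num_pes=256):
    """Assumed cost model: zero operands are skipped perfectly on num_pes PEs."""
    return [m * (1.0 - s) / num_pes for m, s in zip(macs, sparsity)]


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.fmean(values)


def run(num_inferences=2000, seed=0):
    """Return (per-layer CVs, per-inference CV) of the simulated latency."""
    rng = random.Random(seed)
    chol = cholesky(COV)
    per_layer = [layer_cycles(MACS, sample_sparsity(MEAN, chol, rng))
                 for _ in range(num_inferences)]
    totals = [sum(row) for row in per_layer]
    layer_cv = [coefficient_of_variation([row[i] for row in per_layer])
                for i in range(len(MACS))]
    return layer_cv, coefficient_of_variation(totals)


if __name__ == "__main__":
    layer_cv, total_cv = run()
    for index, cv in enumerate(layer_cv):
        print(f"layer {index}: latency CV = {cv:.3f}")
    print(f"whole inference: latency CV = {total_cv:.3f}")
```

In this toy setting the relative variation of the total inference latency cannot exceed that of the most variable layer, because summing over layers averages out part of the per-layer spread. This is qualitatively in line with the smaller per-inference figure reported above, although the sketch makes no attempt to reproduce those numbers.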