Research goals: Modern compute platforms for machine learning are moving from single-core architectures towards heterogeneous multi-core systems, which consist of richer datapath blocks, more complex memory hierarchies, and more flexible interconnects. One example of such a heterogeneous system for AI applications is KU Leuven's Diana chip, which combines a RISC-V processor, a purely digital accelerator, and an analog In-Memory Computing core. Although this hardware is more efficient, the increasing degree of design freedom and the complex interactions within multi-core systems make run-time performance highly stochastic and no longer deterministic. To rapidly explore system choices under different sources of variation, an uncertainty-aware design space exploration (DSE) framework is crucial for estimating trade-offs at the hardware level.
Gap in the SotA: State-of-the-art DSE frameworks are all based on deterministic cost models and overlook the uncertainties arising within a system, such as PVT variations, run-time memory conflicts, and diverse workload-dependent dataflows. These unrealistic assumptions create a discrepancy between the model estimates and real chip performance, which prevents researchers from accurately understanding the impact of uncertainty on the hardware.
Result: The project first evaluates the impact of sparsity uncertainty on system behavior and performance. By constructing the covariance dependence across different layers and inferences, the sparsity distribution is accurately extracted from a small sampling set. The developed analytical and stochastic model then links hardware performance to sparsity uncertainty. Case studies show that the performance variation can reach up to 50% per layer and 12% per inference.
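The workflow above — fitting a correlated sparsity distribution from a small sample set, then propagating it through a performance model — can be illustrated with a minimal sketch. All concrete numbers here (layer count, covariance structure, the linear cycles-vs-sparsity latency model) are hypothetical placeholders, not the project's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 4 network layers whose activation sparsities
# are correlated across inferences (nearby layers behave similarly).
n_layers = 4
true_mean = np.array([0.5, 0.6, 0.4, 0.7])
true_cov = 0.02 * np.exp(
    -np.abs(np.subtract.outer(np.arange(n_layers), np.arange(n_layers)))
)

# Small sampling set: only 50 profiled inferences.
samples = np.clip(
    rng.multivariate_normal(true_mean, true_cov, size=50), 0.0, 1.0
)

# Extract the sparsity distribution (mean + layer covariance)
# from the small sample set.
mean_hat = samples.mean(axis=0)
cov_hat = np.cov(samples, rowvar=False)

# Illustrative linear latency model (an assumption, not the project's
# model): cycles scale with dense work times the non-zero fraction.
dense_cycles = np.array([1e6, 2e6, 1.5e6, 0.5e6])

def inference_cycles(sparsity):
    return dense_cycles * (1.0 - sparsity)

# Monte Carlo propagation: draw sparsities from the fitted distribution
# and observe the induced per-layer and per-inference cycle variation.
draws = np.clip(
    rng.multivariate_normal(mean_hat, cov_hat, size=10_000), 0.0, 1.0
)
per_layer = inference_cycles(draws)   # shape (10000, n_layers)
per_inf = per_layer.sum(axis=1)       # total cycles per inference

layer_cv = per_layer.std(axis=0) / per_layer.mean(axis=0)
inf_cv = per_inf.std() / per_inf.mean()
print("per-layer variation (CV):", np.round(layer_cv, 3))
print("per-inference variation (CV):", round(inf_cv, 3))
```

Note that the per-inference variation is smaller than the worst per-layer variation, since summing over layers averages out part of the fluctuation — mirroring the gap between the 50% per-layer and 12% per-inference figures, though the magnitudes here depend entirely on the assumed covariance.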