Deep learning models power advances across computer vision, natural language processing, and other domains, but their substantial computational and energy demands create deployment challenges. Cloud-based solutions consume significant energy and raise privacy concerns, while edge deployment must contend with limited compute resources and strict power budgets. This dissertation addresses these challenges through comprehensive hardware-software co-design. The work spans from hardware performance modeling to functional chip implementation, developing adaptable and energy-efficient architectures for deep learning acceleration. It presents hardware performance assessment frameworks, sparse neural processors, and specialized accelerators for emerging AI workloads, including transformers and large language models. Together, these optimization techniques enable sophisticated AI capabilities on resource-constrained platforms, paving the way for seamless integration into everyday environments.
19 June 2025, 15:00 – 17:00
ESAT Aula C, B91.300