TinyFormer accepted by IEEE TCAS-I: efficient sparse transformer design and deployment on tiny devices

👏 Paper title: TinyFormer: Efficient Sparse Transformer Design and Deployment on Tiny Devices.
TinyFormer brings transformer models into tiny-device scenarios such as MCU-based embedded and IoT systems. These platforms have severe storage and memory constraints, making it challenging to design and deploy modern transformer architectures directly.
The framework combines three components: SuperNAS, which searches for a suitable supernet; SparseNAS, which selects a sparse single-path model from that supernet; and SparseEngine, which deploys the selected model efficiently. By co-optimizing architecture, sparsity, and inference execution, TinyFormer enables transformer inference under strict MCU budgets and improves sparse inference speed while preserving accuracy.
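To make the idea of selecting a model under strict MCU budgets concrete, here is a minimal sketch in Python. It is not TinyFormer's actual search procedure: the `Candidate` class, the `select_candidate` function, the budget values, and every number in the example are hypothetical, chosen only to illustrate picking the most accurate candidate that fits both a storage and a memory limit.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """A hypothetical model candidate with estimated deployment costs."""
    name: str
    accuracy: float          # estimated validation accuracy in [0, 1]
    storage_bytes: int       # flash needed to store the (sparse) weights
    peak_memory_bytes: int   # peak RAM needed during inference


def select_candidate(
    candidates: Iterable[Candidate],
    storage_budget: int,
    memory_budget: int,
) -> Optional[Candidate]:
    """Return the most accurate candidate that fits both budgets, or None."""
    feasible = [
        c for c in candidates
        if c.storage_bytes <= storage_budget
        and c.peak_memory_bytes <= memory_budget
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c.accuracy)


# Made-up candidates and budgets, for illustration only.
candidates = [
    Candidate("dense-large", 0.97, 4_000_000, 900_000),
    Candidate("sparse-medium", 0.96, 900_000, 300_000),
    Candidate("sparse-small", 0.93, 400_000, 150_000),
]
best = select_candidate(candidates, storage_budget=1_000_000, memory_budget=320_000)
print(best.name if best else "no candidate fits")
```

In this toy setup the dense model is rejected purely on resource grounds, which is the kind of constraint a hardware-aware search has to respect before accuracy is even compared.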
TinyFormer is built around the full deployment path rather than only model compression. It searches for architectures that fit tiny devices, selects sparse structures that reduce inference cost, and provides an engine that can actually execute the resulting model efficiently.
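A small sketch shows why sparsity only pays off when the execution engine is built for it. The example below uses the generic compressed sparse row (CSR) layout, which is an assumption made for illustration and not a description of SparseEngine's actual storage format or kernels. The function names `dense_to_csr` and `csr_matvec` are hypothetical.

```python
from typing import List, Sequence, Tuple


def dense_to_csr(
    matrix: Sequence[Sequence[float]],
) -> Tuple[List[float], List[int], List[int]]:
    """Convert a dense matrix to CSR: (values, column indices, row pointers)."""
    values: List[float] = []
    col_indices: List[int] = []
    row_ptr: List[int] = [0]
    for row in matrix:
        for j, w in enumerate(row):
            if w != 0:
                values.append(w)
                col_indices.append(j)
        # row_ptr[i + 1] marks where row i ends in `values`.
        row_ptr.append(len(values))
    return values, col_indices, row_ptr


def csr_matvec(
    values: Sequence[float],
    col_indices: Sequence[int],
    row_ptr: Sequence[int],
    x: Sequence[float],
) -> List[float]:
    """Multiply a CSR matrix by a vector, touching only non-zero weights."""
    y: List[float] = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_indices[k]]
        y.append(acc)
    return y


# A 3x4 weight matrix in which 9 of the 12 entries are zero.
weights = [
    [0, 2, 0, 0],
    [1, 0, 0, 3],
    [0, 0, 0, 0],
]
x = [1, 2, 3, 4]

values, col_indices, row_ptr = dense_to_csr(weights)
y = csr_matvec(values, col_indices, row_ptr, x)
print(len(values), y)
```

Only the three non-zero weights are stored and multiplied here, but the index arrays add their own storage and lookup cost. That trade-off is why a sparse model needs a matching execution path to turn fewer weights into faster inference.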
This matters because transformers are increasingly useful for sensing and sequence tasks, but their memory and compute demands often exceed what microcontrollers can support. TinyFormer helps bridge that gap by treating model design and hardware-aware deployment as a single problem.