RTAS 2021 paper on memory-efficient graph neural network execution for edge platforms

👏 Paper title: Optimizing Memory Efficiency of Graph Neural Networks on Edge Computing Platforms.

Graph neural networks are attractive for edge intelligence, but their feature tensors and neighborhood aggregation patterns can exceed the limited memory budget of embedded and edge platforms. This paper focuses on reducing the peak memory footprint of GNN execution so that graph workloads can run more reliably on constrained devices.

The proposed feature decomposition method splits feature processing into smaller pieces while preserving the semantics of the GNN computation. By lowering transient memory pressure during inference, it lets resource-limited platforms run larger graph models or larger graph inputs without costly memory upgrades.
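The idea can be illustrated with a minimal sketch. It assumes that the pieces are slices of the feature dimension and that aggregation is a plain sum over incoming edges; both are simplifications made here for illustration, and the function names `aggregate_full` and `aggregate_decomposed` are hypothetical, not taken from the paper. The sketch counts how many message values are buffered at once as a stand-in for transient memory.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Edge = Tuple[int, int]  # (source node, destination node)


def aggregate_full(x: Matrix, edges: List[Edge], num_nodes: int) -> Tuple[Matrix, int]:
    """Sum-aggregate neighbor features, buffering all E x F messages at once."""
    num_feats = len(x[0])
    # One copied feature row per edge: this E x F buffer is the transient peak.
    messages = [list(x[src]) for src, _ in edges]
    out = [[0.0] * num_feats for _ in range(num_nodes)]
    for (_, dst), msg in zip(edges, messages):
        for f in range(num_feats):
            out[dst][f] += msg[f]
    return out, len(messages) * num_feats


def aggregate_decomposed(
    x: Matrix, edges: List[Edge], num_nodes: int, chunk: int
) -> Tuple[Matrix, int]:
    """Same aggregation, processed `chunk` feature columns at a time."""
    num_feats = len(x[0])
    out = [[0.0] * num_feats for _ in range(num_nodes)]
    peak = 0
    for start in range(0, num_feats, chunk):
        stop = min(start + chunk, num_feats)
        # Only an E x (stop - start) message buffer exists in this iteration.
        messages = [x[src][start:stop] for src, _ in edges]
        peak = max(peak, len(messages) * (stop - start))
        for (_, dst), msg in zip(edges, messages):
            for offset, value in enumerate(msg):
                out[dst][start + offset] += value
    return out, peak


if __name__ == "__main__":
    # Toy graph (hypothetical data): 4 nodes, 8 features per node, 5 edges.
    x = [[float(n * 10 + f) for f in range(8)] for n in range(4)]
    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    full_out, full_peak = aggregate_full(x, edges, 4)
    dec_out, dec_peak = aggregate_decomposed(x, edges, 4, chunk=2)
    print(full_out == dec_out, full_peak, dec_peak)
```

On this toy graph both functions return the same output, while the number of buffered message values drops from 40 to 10. Sum aggregation treats every feature column independently, which is why slicing along that dimension leaves the result unchanged.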

The key insight is that memory efficiency can be improved without changing the high-level GNN task. Because feature computation is decomposed, the system never materializes the large intermediate tensors all at once, so peak memory depends on the pieces that are live at a given moment rather than on the full intermediate result.
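A back-of-the-envelope estimate shows the scale of the effect. All numbers are hypothetical and not results from the paper: a graph with E = 10^6 edges, F = 256 features per node stored as 4-byte floats, an implementation that buffers one message per edge, and an assumed piece width of c = 32 feature columns. Only the transient per-edge buffer is counted, not the input features or the output.

```latex
\begin{align*}
M_{\text{full}}  &= E \cdot F \cdot 4\,\text{B} = 10^{6} \cdot 256 \cdot 4\,\text{B} = 1.024 \times 10^{9}\,\text{B} \approx 1.02\,\text{GB},\\
M_{\text{piece}} &= E \cdot c \cdot 4\,\text{B} = 10^{6} \cdot 32 \cdot 4\,\text{B} = 1.28 \times 10^{8}\,\text{B} = 128\,\text{MB},\\
\frac{M_{\text{full}}}{M_{\text{piece}}} &= \frac{F}{c} = 8.
\end{align*}
```

Under these assumptions the transient buffer shrinks by the factor F/c, which is the kind of reduction that can decide whether a workload fits on a memory-limited device.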

This is particularly relevant for edge platforms, where memory capacity is often a harder constraint than raw compute. The work gives GNN deployment a more practical path on embedded and mobile devices that cannot simply scale memory with model size.