General matrix multiplication (GEMM) is ubiquitous in applications such as signal processing, machine learning, and computer vision. Conventional GEMM hardware architectures based on binary computing exhibit low area and energy efficiency as they scale, due to the spatial nature of binary number representation and computing. Unary computing, on the other hand, can be performed by extremely simple processing units, often just a single logic gate, but no efficient unary GEMM architecture currently exists.
In this paper, we present uGEMM, an area- and energy-efficient unary GEMM architecture enabled by novel arithmetic units. The proposed design relaxes the constraints previously imposed on input bit streams (low correlation or long stream length) and achieves superior area and energy efficiency over existing unary systems. Furthermore, the output bit streams exhibit higher accuracy and faster convergence, which facilitates energy-accuracy scaling on resource-constrained systems.
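As an illustration of the remark that unary computation can reduce to a single logic gate, the following is a minimal sketch of the textbook AND-gate multiplier, assuming a unipolar encoding in which a value in [0, 1] is the fraction of 1s in a randomly generated bit stream. It is not the uGEMM arithmetic unit; the function names, stream length, and seed are hypothetical choices made for the example.

```python
import random


def to_bitstream(value, length, rng):
    """Encode a value in [0, 1] as a unipolar bit stream.

    Assumption: each bit is 1 with probability `value`, drawn independently.
    """
    return [1 if rng.random() < value else 0 for _ in range(length)]


def from_bitstream(bits):
    """Decode a unipolar bit stream as the fraction of 1s it contains."""
    return sum(bits) / len(bits)


def unary_multiply(a_bits, b_bits):
    """Multiply two unipolar streams with one AND gate per bit position."""
    return [a & b for a, b in zip(a_bits, b_bits)]


rng = random.Random(0)   # fixed seed so the run is repeatable
length = 4096            # longer streams give a closer estimate
a, b = 0.5, 0.75

a_bits = to_bitstream(a, length, rng)
b_bits = to_bitstream(b, length, rng)
product = from_bitstream(unary_multiply(a_bits, b_bits))

print(f"estimated {product:.4f}, exact {a * b:.4f}")
```

The estimate approaches the exact product only when the two streams are weakly correlated and sufficiently long, which are the input constraints that the proposed design aims to relax.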