An Advanced Hardware Accelerator for Gradient Descent for Deep-learning

An advanced, scalable hardware accelerator for mini-batch gradient descent, targeting deep-learning applications.

Deep neural networks are widely used to analyze and extract useful information from the large amounts of data generated every day. A neural network operates in two modes: inference and training. Training is the more computationally demanding of the two, as it involves solving a large-scale optimization problem, typically with the back-propagation algorithm (backprop), which is based on gradient descent. Large batch sizes are desirable for fast convergence, but they are extremely inefficient on a CPU or GPU, in both performance and power, because of poor cache utilization and synchronization overhead.
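To make the workload concrete, here is a minimal software sketch of mini-batch gradient descent. It is an illustration only, not the accelerator's algorithm from [3]: it assumes a single-weight linear model with a mean-squared-error loss, so the gradient is written out analytically instead of being computed by backprop. The function name `minibatch_gd`, its parameters, and the toy data are all hypothetical.

```python
import random


def minibatch_gd(xs, ys, batch_size=4, lr=0.05, epochs=2000, seed=0):
    """Fit y ~ w*x + b by mini-batch gradient descent on mean squared error.

    Assumption: a toy linear model stands in for a deep network.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        # Reshuffle so each epoch sees differently composed mini-batches.
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = 0.0
            grad_b = 0.0
            # Accumulate the gradient over the samples of one mini-batch.
            for i in batch:
                err = (w * xs[i] + b) - ys[i]
                grad_w += 2.0 * err * xs[i]
                grad_b += 2.0 * err
            # One parameter update per mini-batch, using the averaged gradient.
            n = len(batch)
            w -= lr * grad_w / n
            b -= lr * grad_b / n
    return w, b


# Hypothetical toy data generated from y = 2x + 1 (no noise).
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
ys = [2.0 * x + 1.0 for x in xs]
w, b = minibatch_gd(xs, ys)
print(round(w, 3), round(b, 3))
```

The inner loop over one batch is the part that grows with batch size: every sample's contribution must be accumulated before the single weight update, which is where cache and synchronization costs show up on a CPU or GPU.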

This project proposes building a dedicated acceleration IP that efficiently performs RAM-to-RAM calculations in a pipelined fashion, thereby substantially offloading machine-learning software applications. The IP is described in detail in [3] and provides an APB interface [2] for external CPU access (configuration, results, etc.).


  1. Mini-batch gradient descent –
  2. AMBA APB –
  3. Rasoori, Sandeep, and Venkatesh Akella. “Scalable Hardware Accelerator for Mini-Batch Gradient Descent.” Proceedings of the 2018 on Great Lakes Symposium on VLSI. ACM, 2018.

Previous knowledge required

  • 044262 – Logic Design

For more information see: