Design and Implementation of a Hardware Accelerator for Deep Convolutional Auto-Encoder

An advanced, scalable hardware accelerator for a deep Convolutional Auto-Encoder (CAE), targeting deep-learning applications. Integrating a CAE hardware accelerator offers advantages in resource occupation, operation speed, and power consumption, indicating great potential for applications in digital signal processing.
This project suggests building a dedicated acceleration IP that efficiently performs RAM-to-RAM calculations in a pipelined fashion and thereby dramatically offloads machine-learning software applications.

Deep neural networks are widely used in a large number of applications to analyze and extract useful information from the vast amounts of data generated every day.

An autoencoder is a type of artificial neural network used to learn efficient data codings in an unsupervised manner. The aim of an autoencoder is to learn a representation (encoding) for a set of data, typically for dimensionality reduction, by training the network to ignore signal “noise”.

“Autoencoding” is a data compression algorithm where the compression and decompression functions are data-specific, lossy, and learned automatically from examples rather than engineered by a human. Additionally, in almost all contexts where the term “autoencoder” is used, the compression and decompression functions are implemented with neural networks.
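
To make this concrete, the following is a minimal sketch of such a learned compressor in Keras, in the spirit of [2]. The 784-dimensional input (e.g. flattened 28x28 images), the 32-dimensional code, and the layer choices are illustrative assumptions, not part of this project's specification.

    # Minimal dense autoencoder sketch (illustrative sizes, see note above).
    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    encoding_dim = 32                  # size of the compressed representation
    inputs = keras.Input(shape=(784,))
    encoded = layers.Dense(encoding_dim, activation="relu")(inputs)   # compression
    decoded = layers.Dense(784, activation="sigmoid")(encoded)        # decompression

    autoencoder = keras.Model(inputs, decoded)
    autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

    # Training reconstructs the input from itself (unsupervised):
    x = np.random.rand(1000, 784).astype("float32")   # placeholder data
    autoencoder.fit(x, x, epochs=1, batch_size=256)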

When it comes to images (as inputs), it makes sense to use convolutional neural networks (convnets) as encoders and decoders. In practical settings, autoencoders applied to images are always convolutional autoencoders, as they simply perform much better.
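
As an illustration, below is a sketch of a convolutional autoencoder in the style of [2], again assuming 28x28 single-channel inputs with illustrative filter counts: the encoder halves the spatial resolution twice with max-pooling, and the decoder restores it with upsampling.

    # Convolutional autoencoder sketch (assumed 28x28x1 inputs).
    from tensorflow import keras
    from tensorflow.keras import layers

    inputs = keras.Input(shape=(28, 28, 1))

    # Encoder: conv + pool stages reduce 28x28 -> 14x14 -> 7x7.
    x = layers.Conv2D(16, (3, 3), activation="relu", padding="same")(inputs)
    x = layers.MaxPooling2D((2, 2), padding="same")(x)
    x = layers.Conv2D(8, (3, 3), activation="relu", padding="same")(x)
    encoded = layers.MaxPooling2D((2, 2), padding="same")(x)   # 7x7x8 code

    # Decoder: conv + upsampling stages restore 7x7 -> 14x14 -> 28x28.
    x = layers.Conv2D(8, (3, 3), activation="relu", padding="same")(encoded)
    x = layers.UpSampling2D((2, 2))(x)
    x = layers.Conv2D(16, (3, 3), activation="relu", padding="same")(x)
    x = layers.UpSampling2D((2, 2))(x)
    decoded = layers.Conv2D(1, (3, 3), activation="sigmoid", padding="same")(x)

    conv_autoencoder = keras.Model(inputs, decoded)
    conv_autoencoder.compile(optimizer="adam", loss="binary_crossentropy")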

The suggested acceleration IP is inspired by a state-of-the-art paper [4], which proposes a novel hardware implementation of a convolutional auto-encoder (CAE) that simplifies the hardware design and reduces the resource requirements. In addition, the IP provides an APB interface [3] for external CPU access (configuration, results, etc.).
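
For illustration only, the sketch below shows how CPU software might drive such an APB-mapped IP through memory-mapped registers: point it at source and destination RAM buffers, start it, and poll for completion. The base address and the CTRL/SRC_ADDR/DST_ADDR/STATUS register map are hypothetical placeholders, not the actual register map of this IP.

    # Hypothetical register map -- placeholders, not this IP's actual layout.
    import mmap, os, struct

    IP_BASE  = 0x40000000   # hypothetical APB base address
    CTRL     = 0x00         # hypothetical: bit 0 = start
    SRC_ADDR = 0x04         # hypothetical: source RAM buffer address
    DST_ADDR = 0x08         # hypothetical: destination RAM buffer address
    STATUS   = 0x0C         # hypothetical: bit 0 = done

    fd = os.open("/dev/mem", os.O_RDWR | os.O_SYNC)
    regs = mmap.mmap(fd, 4096, offset=IP_BASE)

    def write_reg(off, val):
        regs[off:off + 4] = struct.pack("<I", val)

    def read_reg(off):
        return struct.unpack("<I", regs[off:off + 4])[0]

    write_reg(SRC_ADDR, 0x10000000)   # input buffer (placeholder address)
    write_reg(DST_ADDR, 0x10100000)   # output buffer (placeholder address)
    write_reg(CTRL, 1)                # kick off the RAM-to-RAM pipeline
    while read_reg(STATUS) & 1 == 0:  # poll until the accelerator is done
        pass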

References

  1. Auto-Encoder – https://en.wikipedia.org/wiki/Autoencoder
  2. Building AutoEncoders – https://blog.keras.io/building-autoencoders-in-keras.html
  3. AMBA APB – http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ihi0024c/index.html
  4. An FPGA Implementation of a Convolutional Auto-Encoder – https://www.mdpi.com/2076-3417/8/4/504

Previous knowledge required

  • 044262 – Logic Design

Design goals and challenges

  • Learning the basics of Verilog, an RTL coding language commonly used in the industry.
  • Learning the basics of communication protocols, namely AMBA APB/AXI.
  • Learning machine-learning standards that are commonly used in the industry.
  • Practicing design coding from an architecture specification, bringing up an advanced accelerator as an IP.