Clustering for unsupervised learning is a common task in machine learning systems. Several algorithms can be used for this task, for example K-means. The main problem with the K-means algorithm is its large computational cost. Mini-batch K-means proposes an effective technique that drastically reduces the number of computations with an insignificant impact on the quality of the results. The goal of this project is to design and implement a hardware...
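
The mini-batch idea can be sketched in a few lines of Python. This is an illustrative software sketch only, not the hardware design the project targets: each iteration draws a small random batch, assigns only those points to their nearest centroid, and moves each centroid with a per-centroid running-mean step. The function name `minibatch_kmeans` and its parameters (`batch_size`, `iterations`, `seed`) are assumptions introduced for this example.

```python
import random


def minibatch_kmeans(points, k, batch_size=32, iterations=100, seed=0):
    """Mini-batch K-means over a list of equal-length tuples (illustrative sketch)."""
    rng = random.Random(seed)
    # Initialise the centroids from k distinct data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    counts = [0] * k  # number of samples each centroid has absorbed so far

    for _ in range(iterations):
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assignment step: only the batch is compared against the centroids,
        # instead of the whole data set as in classic K-means.
        nearest = [
            min(range(k),
                key=lambda j: sum((x - c) ** 2 for x, c in zip(p, centroids[j])))
            for p in batch
        ]
        # Update step: running mean with a per-centroid learning rate 1/count.
        for p, j in zip(batch, nearest):
            counts[j] += 1
            eta = 1.0 / counts[j]
            centroids[j] = [(1 - eta) * c + eta * x
                            for c, x in zip(centroids[j], p)]
    return centroids
```

With these assumed defaults, each iteration computes `batch_size * k` distances rather than `len(points) * k`, which is where the reduction in computation comes from.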
Machine Learning


The DNA sequencing process involves passing a strand of DNA through a nanopore, which causes drops in the electric current passing between the walls of the pore. The amount of change in the current depends on the type of base passing through the pore. This signal is then sampled. In this project, we will design a standalone accelerator for the 3rd generation DNA sequence basecalling for personalized medicine applications.

In this project, the students will design a controller and architecture for a solid-state drive that uses AI techniques to improve performance. The implementation includes system modeling in MATLAB, specification and architecture definition, logic design in Verilog HDL, verification, and synthesis.

Sparse linear algebra is a frequent bottleneck in machine learning and data mining workloads. The efficient acceleration of sparse matrix calculations becomes even more critical when applied to big data problems. The goal is to implement an accelerator for multiplying a sparse matrix with a sparse vector. Current solutions fetch from memory all nonzero elements of the sparse matrix. The aim of this project is to implement a technique in...
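
To make the memory-traffic issue concrete, here is a minimal Python sketch of a column-driven sparse matrix by sparse vector product. It is a generic formulation, not the technique the project refers to (which is elided above). It assumes the matrix is stored in compressed sparse column (CSC) form and the vector as parallel index and value lists. The name `spmspv_csc` and its argument names are assumptions made for this example.

```python
def spmspv_csc(col_ptr, row_idx, values, x_idx, x_val):
    """Compute y = A @ x for a CSC matrix A and a sparse vector x.

    col_ptr, row_idx, values: standard CSC arrays of A.
    x_idx, x_val: positions and values of the nonzero entries of x.
    Returns y as a dict {row: value} containing only the rows that were touched.
    """
    y = {}
    for j, xj in zip(x_idx, x_val):
        # Read only column j of A; columns where x is zero are never fetched.
        for p in range(col_ptr[j], col_ptr[j + 1]):
            i = row_idx[p]
            y[i] = y.get(i, 0.0) + values[p] * xj
    return y


# A = [[1, 0, 2],
#      [0, 0, 3],
#      [4, 0, 0]]   and   x = [0, 0, 5]
col_ptr = [0, 2, 2, 4]
row_idx = [0, 2, 0, 1]
values = [1.0, 4.0, 2.0, 3.0]
y = spmspv_csc(col_ptr, row_idx, values, x_idx=[2], x_val=[5.0])
```

In this small example only the two nonzeros of column 2 are read. The nonzeros of column 0 are skipped because the matching entry of `x` is zero.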

An advanced, scalable hardware accelerator for deep convolutional autoencoders (CAE) targets deep-learning applications. Integrating a CAE hardware accelerator has advantages in resource utilization, operation speed, and power consumption, indicating great potential for application in digital signal processing. This project proposes building a dedicated acceleration IP that efficiently performs RAM-to-RAM calculations in a pipelined fashion and thereby dramatically offloads machine-learning software applications.

In this project, the theory of cellular nonlinear networks will be studied and the possibility of using memristive devices in these networks will be investigated. A software model of a prototype cellular nonlinear neural network that accounts for the behavior of memristive devices as the synaptic connections will be implemented, and a series of simulations will be performed.

In this project, you are required to design a systolic array that efficiently implements the logic required to support per-channel activation tensor quantization for a convolutional neural network. You will implement the design in SystemVerilog, simulate and synthesize it, and then design the layout. Area, power, and energy will be analyzed and compared to those of a conventional systolic array. Skills you will acquire: SystemVerilog, Synopsys Design Compiler,...
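
As a software reference for what per-channel quantization means, the following Python sketch quantizes each activation channel with its own scale factor. It assumes symmetric signed quantization, with the scale chosen from the channel's peak magnitude. The project description does not specify the scheme, so this choice, and the helper names `quantize_per_channel` and `dequantize_per_channel`, are assumptions for illustration only.

```python
def quantize_per_channel(activations, bits=8):
    """Quantize each channel (a list of floats) with its own scale.

    Returns (q, scales): q[c][i] are signed integers and scales[c] is the
    step size used for channel c (assumed symmetric quantization).
    """
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    q, scales = [], []
    for channel in activations:
        peak = max(abs(v) for v in channel)
        scale = peak / qmax if peak > 0 else 1.0
        scales.append(scale)
        q.append([max(-qmax, min(qmax, round(v / scale))) for v in channel])
    return q, scales


def dequantize_per_channel(q, scales):
    """Map the integers back to real values, one scale per channel."""
    return [[v * s for v in channel] for channel, s in zip(q, scales)]
```

Because every channel carries a different scale, the scale cannot be factored out of a sum that runs across channels. That is one reason a conventional accumulate-only array would need extra logic to support this scheme.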

Ferroelectric Field Effect Transistor (FeFET) memory has shown the potential to meet the growing need for fast, dense, low-power, and nonvolatile memories. Integrating a ferroelectric layer within the gate stack of a regular Field Effect Transistor (FET) enables the transistor to store data in the polarization state of the ferroelectric. In this project, we look for an appropriate application of binary neural networks (BNN) which can benefit...

This project proposes building a dedicated accelerator that efficiently performs RAM-to-RAM calculations in hardware in a pipelined fashion, thereby dramatically reducing CPU load for machine-learning software applications.

A deep neural network (DNN) is an artificial neural network (ANN) with multiple layers between the input and output layers. The DNN finds the correct mathematical manipulation to turn the input into the output, whether it be a linear relationship or a nonlinear relationship. The goal of this project is to build a novel DNN accelerator with simultaneous multithreading.

This project proposes building a dedicated accelerator that efficiently performs RAM-to-RAM calculations in hardware in a pipelined fashion, thereby dramatically reducing CPU load for machine-learning software applications.

In this project, we will design a standalone accelerator for the 3rd generation DNA sequence basecalling for personalized medicine applications.

Neural networks are a rapidly emerging field. The goal of this project is to perform placement of standard cells in VLSI circuits using artificial neural network techniques, as described in the paper "Neural Network Based Approach to cell Placement".

A systolic array is a homogeneous array of identical processors, each performing the same function and each connected to several neighbours. Such a structure is very suitable for fast and efficient implementation of machine learning algorithms. The goal of this project is to design and implement an architecture for the computation of the convolution stage of a neural network for deep learning.
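
The dataflow of such an array can be illustrated with a small cycle-level model in Python. The sketch assumes an output-stationary arrangement computing a matrix product, the form to which a convolution layer is often lowered. The project's actual architecture is not specified here, so the dataflow choice and the name `systolic_matmul` are assumptions. Each processing element (PE) multiplies the operand arriving from its left neighbour by the one arriving from above, accumulates the result locally, and passes both operands on.

```python
def systolic_matmul(A, B):
    """Simulate an M x N output-stationary systolic array computing A @ B.

    A is M x K and B is K x N (lists of lists). Returns (C, cycles).
    """
    M, K, N = len(A), len(B), len(B[0])
    acc = [[0] * N for _ in range(M)]    # one accumulator per PE
    a_reg = [[0] * N for _ in range(M)]  # operand travelling left to right
    b_reg = [[0] * N for _ in range(M)]  # operand travelling top to bottom
    cycles = M + N + K - 2               # last operand pair meets at PE(M-1, N-1)

    for t in range(cycles):
        new_a = [[0] * N for _ in range(M)]
        new_b = [[0] * N for _ in range(M)]
        for i in range(M):
            for j in range(N):
                if j == 0:
                    # Left edge: row i of A enters with a skew of i cycles.
                    k = t - i
                    new_a[i][j] = A[i][k] if 0 <= k < K else 0
                else:
                    new_a[i][j] = a_reg[i][j - 1]
                if i == 0:
                    # Top edge: column j of B enters with a skew of j cycles.
                    k = t - j
                    new_b[i][j] = B[k][j] if 0 <= k < K else 0
                else:
                    new_b[i][j] = b_reg[i - 1][j]
                acc[i][j] += new_a[i][j] * new_b[i][j]
        a_reg, b_reg = new_a, new_b
    return acc, cycles
```

The input skew ensures that `A[i][k]` and `B[k][j]` reach PE(i, j) in the same cycle, so every PE communicates only with its immediate neighbours.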

With the recent advances in wearable devices and the Internet of Things (IoT), it has become attractive to implement Deep Convolutional Neural Networks (DCNNs) in embedded and portable systems. Currently, executing software-based DCNNs requires high-performance, high-power servers. Stochastic Computing (SC), which uses a bitstream to represent a number within [-1, 1] by counting the number of ones in the bitstream, has high potential for implementing CNNs with ultra-low hardware...
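
The bitstream representation can be sketched in Python as follows. The example assumes the common bipolar encoding, in which a value x in [-1, 1] is represented by a stream whose bits are 1 with probability (x + 1) / 2, and it multiplies two independent streams with a bitwise XNOR. The function names and the use of Python's `random` module as the bit source are assumptions for illustration. A hardware design would use its own random number source.

```python
import random


def encode_bipolar(x, length, rng):
    """Encode x in [-1, 1] as a bitstream with P(bit = 1) = (x + 1) / 2."""
    p_one = (x + 1) / 2
    return [1 if rng.random() < p_one else 0 for _ in range(length)]


def decode_bipolar(bits):
    """Recover the value by counting ones: x = 2 * (ones / length) - 1."""
    return 2 * sum(bits) / len(bits) - 1


def sc_multiply(bits_a, bits_b):
    """Multiply two independent bipolar streams with a bitwise XNOR."""
    return [1 - (a ^ b) for a, b in zip(bits_a, bits_b)]


rng = random.Random(0)
a = encode_bipolar(0.5, 20000, rng)
b = encode_bipolar(-0.4, 20000, rng)
product = decode_bipolar(sc_multiply(a, b))  # approximately 0.5 * -0.4
```

The multiplier is a single XNOR gate per bit, which illustrates why the hardware footprint can be very small. The price is that the result is an estimate whose accuracy improves with the stream length.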