Background information: Recently, several memristive technologies (ReRAM, CBRAM, PCM and STT-MRAM) have emerged as promising candidates for digital and analog in-memory computation. Deep neural networks (DNNs) are one of the main applications that benefit from analog in-memory computation. However, the noisy nature of analog computation may lead to performance (“accuracy”) degradation. In this project, you will use the IBM Analog Hardware Acceleration Kit, a toolkit developed by IBM to simulate...
Machine Learning


Clustering is the task of dividing data points into groups such that points in the same group are more similar to each other than to points in other groups. K-means is an effective clustering algorithm that assigns each data point to the cluster whose mean is closest. For some datasets, K-means does not provide...
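The assign-to-nearest-mean loop described above can be sketched in a few lines of NumPy (a minimal reference implementation; function and variable names are illustrative, not part of the project specification):

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain K-means: assign each point to the nearest mean, then update means."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].astype(float)
    for _ in range(iters):
        # distance of every point to every current center: shape (n, k)
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each center to the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels
```

Note the cost per iteration is O(n·k·dim) distance evaluations, which is exactly what hardware or mini-batch variants try to reduce.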

Project description: Deep neural networks can be dramatically accelerated by using memristive devices as synaptic connections. Traditionally, however, deep neural networks rely on the error backpropagation algorithm, which faces several issues when the networks are implemented in hardware based on memristive devices: i) complex peripheral circuits with expensive ADCs and DACs, and memory banks for intermediate layer states; ii) the lack of efficient online training methods. We recently developed an efficient...

Project description: A classifier is a machine learning model used to distinguish between different objects based on their features. The Naive Bayes classifier is very effective in many real-world settings, such as document classification and spam filtering. A Naive Bayes classifier applies Bayes’ theorem with the “naive” assumption of conditional independence between every pair of features. Despite this simplifying assumption, naive Bayes classifiers work very well...
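The independence assumption means the joint likelihood factorizes into per-feature terms, so classification reduces to summing per-feature log-probabilities. A minimal Bernoulli variant (binary features, as in spam filtering) might look like the sketch below; the function names and the Laplace-smoothing choice are illustrative assumptions:

```python
import numpy as np

def train_bernoulli_nb(X, y, alpha=1.0):
    """Estimate log P(class) and log P(feature=1 | class) with Laplace smoothing."""
    classes = np.unique(y)
    priors = np.array([(y == c).mean() for c in classes])
    # per-class probability that each binary feature is present
    cond = np.array([(X[y == c].sum(axis=0) + alpha) /
                     ((y == c).sum() + 2 * alpha) for c in classes])
    return classes, np.log(priors), np.log(cond), np.log(1 - cond)

def predict(x, classes, log_prior, log_p, log_q):
    # "naive" independence: sum per-feature log-likelihoods, pick the best class
    scores = log_prior + (x * log_p + (1 - x) * log_q).sum(axis=1)
    return classes[scores.argmax()]
```

For example, with a feature vector [contains "free", contains "meeting"], a message matching only the spam-correlated feature would score higher under the spam class.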

Project description: Clustering is the task of grouping data points into clusters, where the grouping is commonly based on distance. Clustering has many applications, including data mining, statistical data analysis, pattern recognition, and more. Two common clustering algorithms are K-Means and Density-Based Spatial Clustering of Applications with Noise (DBSCAN). With the increasing need to perform clustering on large datasets as fast as possible, running these on...

Clustering for unsupervised learning is a common task in machine learning systems. Several algorithms can be used for this task, for example K-means. The main problem with the K-means algorithm is the huge amount of computation it requires. Mini-batch K-means is an effective technique that drastically reduces the number of computations with an insignificant impact on the quality of the results. The goal of this project is to design and implement a hardware...
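The computation saving comes from updating centers with a small random batch per step instead of the full dataset, using a per-center learning rate that decays as 1/count. A minimal software sketch (in the style of Sculley's web-scale K-means; names and parameters are illustrative, not the required hardware design):

```python
import numpy as np

def minibatch_kmeans(X, k, batch=32, steps=200, seed=0):
    """Mini-batch K-means: per step, assign one small batch and nudge centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].astype(float)
    counts = np.zeros(k)
    for _ in range(steps):
        B = X[rng.choice(len(X), batch)]
        d = np.linalg.norm(B[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for x, j in zip(B, labels):
            counts[j] += 1
            eta = 1.0 / counts[j]            # decaying per-center learning rate
            centers[j] = (1 - eta) * centers[j] + eta * x
    return centers
```

Each step touches only `batch` points rather than all n, which is what makes the scheme attractive for a streaming hardware pipeline.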

The DNA sequencing process involves passing a strand of DNA through a nanopore, which causes drops in the electric current passing between the walls of the pore. The amount of change in the current depends on the type of base passing through the pore. This signal is then sampled. In this project, we will design a standalone accelerator for third-generation DNA sequencing basecalling for personalized medicine applications.

Project description: Flash memory is a widely used memory technology found in disk-on-key drives, SSDs, set-top boxes, routers, TVs, cellular SIM cards, and more. Flash memory requires a unique memory controller, as Flash is block-addressable and has unique error handling and correction properties, wear-leveling management, and more. Solid-state drive architectures can arrange Flash chips and the controller in several topologies: channels, bus-based, full crossbar, and more. There are several new trends in SSDs that should...

Sparse linear algebra is a frequent bottleneck in machine learning and data mining workloads. Efficient acceleration of sparse matrix calculations becomes even more critical when applied to big data problems. The goal is to implement an accelerator for multiplying a sparse matrix by a sparse vector. Current solutions fetch all non-zero elements of the sparse matrix from memory. The aim of this project is to implement a technique in...
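To see why fetching every matrix non-zero is wasteful, consider sparse-matrix/sparse-vector multiplication with the matrix in the standard CSR layout: only matrix entries whose column index matches a non-zero of the vector contribute to the result. A minimal software sketch (function names are illustrative; the accelerator would exploit this same skipping in hardware):

```python
def spmspv(indptr, indices, data, x_idx, x_val, nrows):
    """Sparse matrix (CSR: indptr/indices/data) times sparse vector (x_idx/x_val).
    Entries whose column is not a non-zero of x are skipped entirely."""
    xmap = dict(zip(x_idx, x_val))       # column -> vector value
    y = [0.0] * nrows
    for r in range(nrows):
        for k in range(indptr[r], indptr[r + 1]):
            c = indices[k]
            if c in xmap:                # only these elements actually matter
                y[r] += data[k] * xmap[c]
    return y
```

In this software model every matrix non-zero is still visited; the point of the project is to avoid even fetching the skipped elements from memory.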

An advanced, scalable hardware accelerator for a deep Convolutional Auto-Encoder (CAE) targets deep-learning applications. Integrating a CAE hardware accelerator has advantages in resource utilization, operation speed, and power consumption, indicating great potential for application in digital signal processing. This project suggests building a dedicated acceleration IP that efficiently performs RAM-to-RAM calculations in a pipelined fashion and thereby dramatically offloads machine-learning software applications.

In this project, the theory of cellular nonlinear networks will be studied and the possibility of using memristive devices in these networks will be investigated. A software model of a prototype cellular nonlinear network, accounting for the behavior of memristive devices as synaptic connections, will be implemented and a series of simulations will be performed.

In this project, you are required to design a systolic array that efficiently implements the logic required to support per-channel activation tensor quantization for a convolutional neural network. You are required to implement the design in SystemVerilog, then simulate and synthesize it, after which the layout will be designed. Area, power, and energy will be analyzed and compared to a conventional systolic array. Skills you will acquire: SystemVerilog, Synopsys Design Compiler,...
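As a functional reference for what the hardware must compute, per-channel quantization keeps one scale factor per channel rather than one for the whole tensor. A minimal NumPy sketch of a symmetric int8 scheme (the symmetric scheme, names, and bit-width are illustrative assumptions; the project may target a different scheme):

```python
import numpy as np

def quantize_per_channel(t, n_bits=8, ch_axis=0):
    """Per-channel symmetric quantization: one scale per channel along ch_axis."""
    qmax = 2 ** (n_bits - 1) - 1
    # max |value| over every dimension except the channel axis
    reduce_dims = tuple(d for d in range(t.ndim) if d != ch_axis)
    amax = np.abs(t).max(axis=reduce_dims, keepdims=True)
    scale = np.maximum(amax, 1e-12) / qmax      # guard all-zero channels
    q = np.clip(np.round(t / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale
```

Per-channel scales keep the rounding error of each channel bounded by half of that channel's own scale, instead of the worst channel's scale dominating the whole tensor.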

Ferroelectric Field Effect Transistor (FeFET) memory has shown the potential to meet the requirements of the growing need for fast, dense, low-power, and non-volatile memories. Integrating a layer of ferroelectric material within the gate stack of a regular Field Effect Transistor (FET) enables the transistor to store data in the polarization state of the ferroelectric. In this project, we look for appropriate applications of binary neural networks (BNNs) which can benefit...

This project proposes building a dedicated accelerator that efficiently performs RAM-to-RAM calculations in hardware in a pipelined fashion, thereby dramatically reducing CPU load for machine-learning software applications.

A deep neural network (DNN) is an artificial neural network (ANN) with multiple layers between the input and output layers. The DNN finds the correct mathematical manipulation to turn the input into the output, whether the relationship is linear or nonlinear. The goal of this project is to build a novel DNN accelerator with simultaneous multithreading.

Neural networks are a rapidly emerging field. The goal of this project is to perform placement of standard cells in VLSI circuits with neural networks, as described in the paper "Neural Network Based Approach to cell Placement", which uses artificial neural network techniques to perform the cell placement.

A systolic array is a homogeneous array of identical processors, each performing the same function and each connected to several neighbours. Such a structure is very suitable for fast and efficient implementation of machine learning algorithms. The goal of this project is to design and implement an architecture for the computation of the convolution stage of a neural network for deep learning.
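Before committing to hardware, the dataflow can be modeled in software. In an output-stationary array, PE (i, j) consumes operand pair k at cycle k + i + j, mimicking how operands are skewed and streamed one hop per cycle through neighbouring PEs. A behavioral sketch of that schedule (one possible dataflow among several; convolution is commonly lowered to such a matrix multiply via im2col):

```python
import numpy as np

def systolic_matmul(A, B):
    """Behavioral model of an output-stationary systolic array computing C = A @ B.
    PE (i, j) accumulates A[i, k] * B[k, j] at cycle t = k + i + j."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m))
    for t in range(k + n + m - 2):       # total cycles until the array drains
        for i in range(n):
            for j in range(m):
                kk = t - i - j           # operand pair reaching PE (i, j) now
                if 0 <= kk < k:
                    C[i, j] += A[i, kk] * B[kk, j]
    return C
```

The model makes the pipeline latency explicit: an n-by-m array finishes after k + n + m - 2 cycles, while a naive sequential loop would need n·m·k multiply-accumulate steps.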

With the recent advances in wearable devices and the Internet of Things (IoT), it has become attractive to implement Deep Convolutional Neural Networks (DCNNs) in embedded and portable systems. Currently, executing software-based DCNNs requires high-performance, high-power servers. Stochastic Computing (SC), which uses a bitstream to represent a number within [-1, 1] by counting the number of ones in the bitstream, has high potential for implementing CNNs with ultra-low hardware...
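In the bipolar SC encoding, a value x in [-1, 1] maps to a bitstream with P(bit = 1) = (x + 1)/2, and multiplication then reduces to a single XNOR gate per bit pair. A minimal software model (function names are illustrative; stream length trades accuracy for latency):

```python
import numpy as np

def to_bipolar_stream(x, n, seed=0):
    """Encode x in [-1, 1] as an n-bit stream with P(bit = 1) = (x + 1) / 2."""
    rng = np.random.default_rng(seed)
    return (rng.random(n) < (x + 1) / 2).astype(np.uint8)

def from_bipolar_stream(bits):
    """Decode by counting ones: x ~= 2 * mean(bits) - 1."""
    return 2 * bits.mean() - 1

def sc_multiply(sa, sb):
    """Bipolar SC multiply of two independent streams: one XNOR per bit pair."""
    return np.logical_not(np.logical_xor(sa, sb)).astype(np.uint8)
```

This is the source of the "ultra-low hardware" appeal: a floating-point multiplier is replaced by a single gate, at the cost of long bitstreams and stochastic approximation error that shrinks as the stream grows.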