236381 Archives - VLSI Lab

The Development of a JTAG Boundary Scan Structure for VLSI Chips

Many techniques have been developed to simplify the testing of devices after production. One technique is called "boundary scan", sometimes referred to as "JTAG" (Joint Test Action Group). Each device that complies with the standard, which was accepted by the IEEE, includes 5 dedicated pins used for testing only (AKA TAP – Test Access Port). In this project the students will take an existing logic project, preferably their own VLSI...

Categories: 236381 | Digital

Design and Implementation of a Superscalar Hack Processor

This project aims to enhance the architecture of the simple "Hack Microprocessor". The goal is not to simply duplicate the data path but to put a lot of thought into the control path and architectural planning to allow the implementation of an optimal architecture. Some challenges include efficient resolution of hazards and branch handling.

Hardware Accelerator for a Machine Learning Naive Bayes Classifier

Project description: A classifier is a machine learning model that is used to distinguish between different objects based on features. The Naive Bayes classifier is very effective in many real-world situations, like document classification and spam filtering. A Naive Bayes classifier is based on applying Bayes’ theorem. It utilizes the “naive” assumption of conditional independence between every pair of features. Despite this simplifying assumption naive Bayes classifiers work very well....

Categories: 236381 | Digital | Machine Learning

Flash Memory Controller Block

Flash memory is a widely used memory technology, found in disk-on-keys, SSDs, set-top boxes (routers, TVs etc.), cellular SIM cards, and more. Flash memory requires a unique memory controller, as Flash is block-addressable, has unique error handling and correction properties, wear leveling management and more. Solid-state drive architectures can arrange Flash chips and controller in several topologies: channels, bus-based, full crossbar and more. In this project, the students will implement a design of controller...

Categories: 236381 | Digital | Memories

Error Correction/RAID Engine for DNA-Based Storage

Project description: DNA digital data storage is defined as the process of encoding and decoding binary data to and from synthesized DNA strands. The global community produces digital data at increasing rates, creating enormous data centers for storage. Recent research proposes replacing traditional data storage devices with biological DNA-based devices, which can store information on the scale of a data center within a few grams of weight. During DNA synthesis...

Categories: 236381 | Digital | Memories

DNA Memory Enhancement using Signal Processing

Project description: DNA digital data storage is defined as the process of encoding and decoding binary data to and from synthesized DNA strands. The global community produces digital data at increasing rates, creating enormous data centers for storage. Recent research proposes replacing traditional data storage devices with biological DNA-based devices, which can store information on the scale of a data center within a few grams of weight. During DNA synthesis...

Categories: 236381 | Digital | Memories

A Hardware Accelerator for Unsupervised Learning Based on a Gaussian Mixture Model

Clustering is the task of dividing data points into a number of groups such that data points in the same group are more similar to each other than to those in other groups. K-means is an effective clustering algorithm that assigns each data point to the cluster whose mean (the average of all the points in that cluster) is nearest. For some datasets, K-means does not provide...

Categories: 236381 | Digital | Machine Learning
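As a rough illustration of the K-means idea mentioned in the entry above, the following is a minimal Python sketch of a single iteration: each point is assigned to its nearest mean, then each mean is recomputed. The function name `kmeans_step` and the choice to keep an empty cluster's old mean are assumptions made for this example; they do not come from the project description, and the project itself targets a Gaussian Mixture Model in hardware.

```python
def kmeans_step(points, centroids):
    """Run one assign-and-update iteration of K-means (illustrative sketch).

    points and centroids are sequences of equal-length numeric tuples.
    Returns the list of updated centroids.
    """
    clusters = [[] for _ in centroids]
    for p in points:
        # Squared Euclidean distance from p to every current centroid.
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        clusters[dists.index(min(dists))].append(p)

    updated = []
    for old, members in zip(centroids, clusters):
        if members:
            # New centroid is the per-coordinate mean of the assigned points.
            updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            # Assumption for this sketch: an empty cluster keeps its old centroid.
            updated.append(tuple(old))
    return updated


if __name__ == "__main__":
    pts = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans_step(pts, [(1, 1), (9, 9)]))
```

Repeating the step until the centroids stop moving gives the usual iterative algorithm; a hardware design would typically parallelize the distance computations.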

Convolutional Deep Belief Nets Based on Memristive Devices

Project description: Deep neural networks can be extraordinarily accelerated by using memristive devices as synaptic connections. However, traditionally, the deep neural networks utilize the error backpropagation algorithms, which face some issues when the networks are implemented in hardware based on memristive devices: i) complex peripheral circuits with expensive ADCs and DACs and memory back for intermediate layer states; ii) lack of efficient online training methods. We recently developed an efficient...

Categories: 236381 | 236503 | Machine Learning | Memristors | Software

Novel Synthesis and Mapping of RRAM-based Stateful Logic on 3D RRAM

Background: Computing-in-memory (CiM) has been a potential solution to break the memory wall and energy wall brought by the conventional computer architecture that separates the computing units and memory units. RRAM-based stateful logic is a kind of CiM that could implement any function in an RRAM crossbar array. There are some efficient synthesis and mapping methods for 2D RRAM crossbar arrays. 3D RRAM crossbar arrays are denser and can support stateful...

Categories: 236381 | 236503 | Memories | Memristors | Microprocessors

Approximate Search CAM for DNA sequencing and Genome Analysis

Project description: How can we tell when a new mutation of COVID virus appears? We sequence DNA samples from many patients. These samples contain the host (patient’s) DNA as well as DNAs of multiple viruses and bacteria that live in our body and make their way to the sample. Then, we need to compare huge amounts of sequenced data with existing COVID strains and decide if there is a new...

Categories: 236381 | Digital | Microprocessors

Hardware Acceleration of Local Sensitivity Hashing for Genome Assembly

Project description: High-throughput sequencing technologies have substantially changed the way biological research is performed since the early 2000s. These sequencing technologies obtain millions of short fragments (sequences) of DNA from a living organism to generate the organism’s DNA blueprint (genome). Thanks to these new DNA sequencing platforms, we can now investigate human genome diversity between populations, find genomic variants that are likely to cause diseases and even investigate the genomes of...

Categories: 236381 | Digital | Microprocessors

Hardware Implementation of a Video Processing Superblock Accelerator

Project description: Background: The goal of the project is to design and implement a video processing accelerator to allow real-time processing of a video stream. The accelerator will be composed of a series of independent video processing units, each of which receives a video stream as input and generates a processed video stream at the output, which is fed into the next unit. Alpha blending is the process of combining...

Categories: 236381 | Computer Vision | Digital | Multimedia and Signal Processors

Hardware Implementation of the Video Polynomial Transformation + LPF

Project Abstract: There is an endless number of platforms that require implementation of video transformations, such as curved TV/computer/smartphone screens, goggles, pilot helmets, etc. All these platforms require transformation of a flat image to a curved image that fits the display, so the user can see the image well without data loss. The main challenges of the core implementation are low latency (“video in => video out”), high video resolutions and frame rate....

Categories: 236381 | Computer Vision | Digital | Multimedia and Signal Processors

Hardware Acceleration of DBSCAN Clustering

Project description: Clustering is the task of unifying data points into groups or clusters, where the grouping of the points is commonly based on distance. Clustering has many applications including data mining, statistical data analysis, pattern recognition, and more. Two common clustering algorithms are K-Means and Density-Based Spatial Clustering of Applications with Noise (DBSCAN). With increasing needs to perform clustering on large datasets as fast as possible, running these on...

Categories: 236381 | Digital | Machine Learning

Accelerator for DNA Sequence Alignment

Project description: One of the most popular operations in personalized medicine is protein or DNA sequence database search based on pair-wise alignment, where a query sequence is compared with a database of sequences to find a highest-similarity sequence. This similarity can provide insights on the functionality of the query protein or the role of a gene. Conventional computer architecture is proven to be inefficient for personalized medicine tasks. For example,...

Categories: 236381 | Digital | Microprocessors

The Design of a Ring Controller to Support a Multi-core RISC-V Implementation in a Ring Configuration

The goal of this project is to design and implement an RTL IP (SystemVerilog) that will enable multiple instances of RISC-V cores to be connected in a ring configuration. The IP will consist of two main interfaces: on one side the “Core” and on the other the “Ring”. The Ring Interface will manage the data transactions on the ring - pushing and pulling RD/WR/RD_RSP transactions to/from the ring....

Categories: 236381 | Computer Architecture | Digital | Microprocessors

HW Implementation of MiniBatch Kmeans – A Clustering Algorithm for Unsupervised Learning

Clustering for unsupervised learning is a common task in machine learning systems. Several algorithms can be used for this task, for example K-means. The main problem with the K-means algorithm is the huge number of computations. Minibatch K-means proposes an effective technique to drastically reduce the number of computations with an insignificant impact on the quality of the results. The goal of this project is to design and implement a hardware...

Categories: 236381 | Digital | Machine Learning

Code Optimizer Using Advanced Matrix Extension

The Advanced Matrix Extension (AMX) is a new x86 extension designed for operating on matrices with the goal of accelerating machine learning computations. Intel’s AMX is a new 64-bit programming paradigm consisting of two components: a set of 2-dimensional registers (tiles) representing sub-arrays from a larger 2-dimensional memory image, and an accelerator that is able to operate on tiles. In the first stage of this project, a preprocessor...

Categories: 236381 | 236503 | Digital | Software

Implementation of a DNA Sequencing Accelerator

The DNA Sequencing process involves passing a strand of DNA through the nanopore which causes drops in the electric current passing between the walls of the pore. The amount of change in the current depends on the type of base passing through the pore. This signal is then sampled. In this project, we will design a stand-alone accelerator for the 3rd generation DNA sequence basecalling for personalized medicine applications.

Categories: 236381 | Digital | Machine Learning

Deep Learning Based Controller for SSD Acceleration

Project description: Flash memory is a widely used memory technology, found in disk-on-keys, SSDs, set-top boxes (routers, TVs etc.), cellular SIM cards, and more. Flash memory requires a unique memory controller, as Flash is block-addressable, has unique error handling and correction properties, wear leveling management and more. Solid-state drive architectures can arrange Flash chips and controller in several topologies: channels, bus-based, full crossbar and more. There are several new trends in SSDs that should...

Categories: 236381 | Digital | Machine Learning

The Design of A Secure (Oblivious) Memory

A standard solution to memory security is encrypting all data written to untrusted storage. A big problem with client-side encryption (and other systems that protect only the data itself) is that it does not protect all aspects of how the client interacts with the server's storage. Where storage is accessed, the access pattern can also reveal secret information. Suppose a patient stores his/her genome on a remote server and wishes to check...

Categories: 236381 | Digital | Memories | Microprocessors

Accelerator for Sparse Machine Learning

Sparse linear algebra is a frequent bottleneck in machine learning and data mining workloads. The efficient acceleration of sparse matrix calculations becomes even more critical when applied to big data problems. The goal is to implement an accelerator for multiplying a sparse matrix with a sparse vector. Current solutions fetch from memory all non-zero elements of the sparse matrix. The aim of this project is to implement a technique in...

Categories: 236381 | Digital | Machine Learning | Microprocessors

Tags: 4256
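To make the sparse matrix by sparse vector product from the entry above concrete, here is a small Python sketch. It stores the matrix by columns so that only the columns matching non-zero vector entries are visited. This is one well-known software formulation and is offered only as an illustration; the technique the project actually intends is elided in the description and is not reproduced here. The function name `spmspv` and the dictionary layout are assumptions for this example.

```python
def spmspv(columns, x):
    """Multiply a sparse matrix by a sparse vector (illustrative sketch).

    columns: {col_index: {row_index: value}} holding only non-zero entries.
    x:       {index: value} holding only non-zero entries.
    Returns the result as {row_index: value}.
    """
    y = {}
    for j, x_j in x.items():
        # Only columns that meet a non-zero vector entry are touched at all.
        for i, a_ij in columns.get(j, {}).items():
            y[i] = y.get(i, 0) + a_ij * x_j
    return y


if __name__ == "__main__":
    # Dense view of the matrix: [[1, 0, 2], [0, 0, 3], [4, 0, 0]]
    a_columns = {0: {0: 1, 2: 4}, 2: {0: 2, 1: 3}}
    print(spmspv(a_columns, {2: 5}))
```

The work done is proportional to the number of matrix non-zeros in the selected columns rather than to all non-zeros in the matrix.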

Sort Algorithm for Memristive Memory Processing Unit

The Memristive Memory Processing Unit (mMPU) is a new process-in-memory computer architecture, which performs the computation without moving the data from the computer’s main memory (RAM). The goal of the project is to develop a sort algorithm to run on an mMPU which is based on emerging memory technology of ReRAM.

Design and Implementation of a Hardware Accelerator for Deep Convolutional Auto-Encoder

An advanced scalable hardware accelerator for a deep Convolutional Auto-Encoder (CAE) targets deep-learning applications. Integrating a CAE hardware accelerator has advantages in resource utilization, operation speed, and power consumption, indicating great potential for application in digital signal processing. This project suggests building a designated acceleration IP, which efficiently performs RAM-to-RAM calculations in a pipeline fashion and thereby dramatically offloads machine-learning software applications.

Categories: 236381 | Digital | Machine Learning

Systolic Array Acceleration of CNN Per-Channel Activations Quantization

In this project, you are required to design a systolic array that efficiently implements the logic required to support per-channel activation tensor quantization for a convolutional neural network. You are required to implement the design using SystemVerilog, then simulate and synthesize it, after which the layout will be designed. Area, power, and energy will be analyzed and compared to a conventional systolic array. Skills you will acquire: SystemVerilog, Synopsys Design Compiler,...

Categories: 236381 | Digital | Machine Learning

Hardware Implementation of the Video Polynomial Transformation

There is an endless number of platforms that require implementation of video transformations, such as curved TV/computer/smartphone screens, goggles, pilot helmets, etc. All these platforms require transformation of a flat image to a curved image that fits the display, so the user can see the image well without data loss. The main challenges of the core implementation are low latency (“video in => video out”), high video resolutions and frame rate. The goal...

Categories: 236381 | Computer Vision | Digital | Multimedia and Signal Processors

Accelerator for an Unsupervised Learning Machine

This project proposes building a designated accelerator, which efficiently performs RAM-to-RAM calculations in hardware in a pipeline fashion and thereby dramatically reduces CPU load for machine-learning software applications.

Categories: 236381 | Digital | Machine Learning

Detection of Hardware Trojan Horses

Hardware Trojan horses have been a real concern for the last 12 years or so, especially for national security. A few examples of what such a Trojan can do when triggered are: 1. Turn off security protections or insert a known key to the encryption engine; 2. Insert errors to cause malfunction of a critical infrastructure; 3. Leak information to an unprotected zone (for example from a privileged CPU...

Categories: 236381 | Digital | Encryption

Dual Issue RISC-V Processor

RISC-V (pronounced "risk-five") is an open-source hardware instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles. The project began in 2010 at the University of California, Berkeley, but many contributors are volunteers not affiliated with the university. The goal of this project is to evaluate the enhanced performance of the double issue capability.

Categories: 236381 | Computer Architecture | Digital | Microprocessors

Implementation of a Generic Fixed Point Divider

The goal is to design and implement the HDL of a high-performance hardware serial divider for high frequencies. Initially, at least two different division algorithms will be investigated and analyzed. The design will be parametrized so that it can be configured according to specified requirements. The divider will support a variety of input / output number representation formats.

Categories: 236381 | Digital | Multimedia and Signal Processors

Accelerator for ZNCC-Based Template-Matching

Project description: Template Matching is a method for searching and finding the location of a template image in a larger image. It relies on calculating at each position of the image under examination a correlation or distortion function that measures the degree of similarity or dissimilarity to a template sub-image. Among the correlation/distortion functions proposed in literature, Normalized Cross-Correlation (NCC) and Zero mean Normalized Cross Correlation (ZNCC) are widely used due to...

Categories: 236381 | Computer Vision | Digital
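For readers unfamiliar with ZNCC as named in the entry above, the sketch below computes the score for one image window against a template, using the standard textbook definition: subtract each patch's mean, then divide the dot product by the product of the two norms. The function name `zncc`, the flat-list input format, and the convention of returning 0.0 for a constant patch are assumptions for this example and are not taken from the project description.

```python
import math


def zncc(window, template):
    """Zero-mean normalized cross-correlation of two equal-size patches.

    Both arguments are flat sequences of pixel intensities.
    The result lies in [-1, 1]; 1 means a perfect match up to gain and offset.
    """
    if len(window) != len(template) or not window:
        raise ValueError("patches must be non-empty and of equal size")
    n = len(window)
    mean_w = sum(window) / n
    mean_t = sum(template) / n
    dw = [w - mean_w for w in window]
    dt = [t - mean_t for t in template]
    denom = math.sqrt(sum(d * d for d in dw) * sum(d * d for d in dt))
    if denom == 0:
        # Assumption for this sketch: a constant patch scores 0 (undefined otherwise).
        return 0.0
    return sum(a * b for a, b in zip(dw, dt)) / denom


if __name__ == "__main__":
    template = [1, 2, 3, 4]
    brighter = [2 * v + 5 for v in template]  # same pattern, different gain/offset
    print(zncc(brighter, template))
```

Template matching evaluates this score at every candidate position in the larger image and keeps the position with the highest value, which is why the computation is a natural target for acceleration.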

NMT – Near Memory Threading Using MTJ Based Multi-state Register

The goal of this project is to use multi-state registers to implement an efficient architecture of a Continuous Flow Multi-Threading microprocessor.

Categories: 236381 | Digital | Memristors | Microprocessors

AES Add-on Processor for RISC-V

RISC-V (pronounced "risk-five") is an open-source hardware instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles. The project began in 2010 at the University of California, Berkeley, but many contributors are volunteers not affiliated with the university. RISC-V is a new architecture that is available under open, free and non-restrictive licences. It has widespread industry support from chip and device makers, and is designed to...

Categories: 236381 | Digital | Encryption | Microprocessors

Design and Implementation of a Head of Line Blocking (HOLB) Solution

Problem Description: Network routers by nature handle thousands of mega packets per second. Each packet might come from one port and be destined to another port. The actual routing decision is made only once the packet is received and inspected. This scheme, by definition, causes head of line blocking, in which one packet destined to a blocked destination completely blocks the input queue or the common processing pipeline. These kinds...

Categories: 236381 | Communication Chips | Digital

Implementation of a RISC-V Processor

RISC-V (pronounced "risk-five") is an open-source hardware instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles. The project began in 2010 at the University of California, Berkeley, but many contributors are volunteers not affiliated with the university. The goal of this project is to study the RISC-V instruction set and then to design and implement a basic RISC-V microprocessor that supports all the instructions. Additional features will...

Categories: 236381 | Digital | Microprocessors

Controller for DNA-Based Data Storage

The global community produces digital data at increasing rates, creating enormous data centers for storage. Recent research proposes replacing traditional data storage devices with biological DNA-based devices, which can store information on the scale of a data center within a few grams of weight. In this project, the student will study the emerging technological approach, and will implement digital controller circuits for managing a DNA storage device. The main goals are understanding of...

Categories: 236381 | Digital | Memories

Backend Implementation of an OFDM Transmitter

The goal of this project is to perform the complete backend design of the OFDM transmitter chip and its integrated memories. This includes: synthesis, gate level simulation, physical (layout) design and verification, timing verification, power and power grid analysis. The chip may then be submitted for fabrication. The implementation will be done in Tower CMOS 0.18u technology.

Categories: 236381 | Backend Design | Digital

Power/ARM to RISC-V Assembly Converter

RISC-V is a classic RISC architecture rebuilt for modern times. At its heart is an array of 32 registers containing the processor's running state, the data being immediately operated on, and housekeeping information. RISC-V comes in 32-bit and 64-bit variants, with register size changing to match. A large amount of code has been developed and written at IBM in assembly for the PowerPC processor for which no C source-code exists....

Categories: 236381 | 236503 | Computer Architecture | Microprocessors | Software

Hardware Reverse Engineering Analyzer Evaluation

Reverse engineering of Integrated Circuits (ICs) is a complex process that involves multiple disciplines and skills. The input to the process is usually a physical device, and the output is a human-readable specification. In the first phase, the IC undergoes teardown to obtain a gate-level netlist description. In the second phase, a specification is extracted. The second phase is non-trivial and involves various learning algorithms and heuristics. The purpose...

Categories: 236381 | Digital | Encryption

Implementation of a Novel DNN Accelerator with Simultaneous Multi-threading

A deep neural network (DNN) is an artificial neural network (ANN) with multiple layers between the input and output layers. The DNN finds the correct mathematical manipulation to turn the input into the output, whether it be a linear relationship or a non-linear relationship. The goal of this project is to build a novel DNN accelerator with simultaneous multi-threading.

Categories: 236381 | Computer Architecture | Digital | Machine Learning

Accelerator for DNA Sequence Alignment

One of the most popular operations in personalized medicine is protein or DNA sequence database search based on pair-wise alignment, where a query sequence is compared with a database of sequences to find a highest-similarity sequence. This similarity can provide insights on the functionality of the query protein or the role of a gene. Conventional computer architecture is proven to be inefficient for personalized medicine tasks. For example, aligning even...

Categories: 236381 | Digital | Microprocessors

DNA Sequencing Accelerator For Long Read

One of the most popular operations in personalized medicine is protein or DNA sequence database search based on pair-wise alignment, where a query sequence is compared with a database of sequences to find a highest-similarity sequence. OLC-based assembly algorithms focus on finding the read-to-read overlaps, defined to be a common sequence between two reads. A read-to-read overlap is a sequence match between two reads, and occurs when local regions on...

Categories: 236381 | Digital | Microprocessors

Design of Microprocessor using Fast Path-Based Neural Branch Prediction

Modern computer architectures increasingly rely on speculation to boost instruction-level parallelism. One of the common methods is branch prediction. There are several ways to predict whether a branch is taken or not taken, which significantly reduce the penalty of the branch. In this project we will develop a branch predictor that is based on a neural network. The Fast Path-Based Neural Branch Prediction can reach 5% to 7% misprediction depending on...

Categories: 236381 | Digital | Microprocessors

Design of Microprocessor with a Decoded Instruction Cache

A CISC decoder is typically set up as a state machine. The machine reads the opcode field to determine what type of instruction it is, and where the other data values are. The instruction word is read in piece by piece, and decisions are made at each stage as to how the remainder of the instruction word will be read. One method to alleviate this is to use a decoded...

Categories: 236381 | Digital | Microprocessors

Implementation of a DNA Sequencing Accelerator

In this project, we will design a stand-alone accelerator for the 3rd generation DNA sequence basecalling for personalized medicine applications.

Categories: 236381 | Digital | Machine Learning | Microprocessors

Design and Implementation of Posit : A Novel Floating Point Format

Design and implementation of a Posit Arithmetic Unit supporting the new posit format, focusing on the stages of the regular VLSI design process, namely architecture, HDL implementation, simulation, synthesis and layout.

Categories: 236381 | Digital | Microprocessors

Cyber Protection Chip

The goal of this project is the development of an autonomous cyber protection chip for computer systems and communication channels linked to the cloud. Background: Current technology drives the accelerated development of computer components with increasing processing capabilities, bandwidth and high level of connectivity between components that maintain a constant link to the cloud. Such systems present a significant challenge in protecting the proper operation of the components. The purpose...

Categories: 236381 | Digital | Microprocessors

Systolic Array For Deep Learning

A systolic array is a homogeneous array of identical processors, each performing the same function and each connected to several neighbours. Such a structure is very suitable for fast and efficient implementation of machine learning algorithms. The goal of this project is to design and implement an architecture for the computation of the convolution stage of a neural network for deep learning.

Categories: 236381 | Digital | Machine Learning | Microprocessors

Architectural Simulator for Object Oriented Processor

The purpose of this project is to simulate the effect of changing several architecture components on the overall performance of the OOPc processor. Such parameters include: the number of cores, the number of simultaneous threads that can run on a core, the sizes of the internal memories and caches, and the network-on-chip topology.

Deep Convolutional Neural Networks (DCNNs) For Embedded and Portable Systems

Stochastic Computing (SC), which uses a bit-stream to represent a number within [-1, 1] by counting the number of ones in the bit-stream, has high potential for implementing CNNs with ultra-low hardware footprint. Since multiplications and additions can be calculated using AND gates and multiplexers in SC, significant reductions in power (energy) and hardware footprint can be achieved compared to the conventional binary arithmetic implementations. In this project we will...

Categories: 236381 | Digital | Microprocessors
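The claim in the entry above that stochastic computing performs multiplication with AND gates and addition with multiplexers can be tried out in software. The Python sketch below makes a simplifying assumption that differs from the description: it uses the unipolar encoding, where a stream represents a value in [0, 1] as its fraction of ones, rather than the [-1, 1] range mentioned in the entry. All function names are hypothetical and chosen for this example only.

```python
import random


def to_stream(p, length, rng):
    """Encode a value p in [0, 1] as a random bit-stream with P(bit = 1) = p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def from_stream(bits):
    """Decode a unipolar stream: the fraction of ones."""
    return sum(bits) / len(bits)


def sc_multiply(a_bits, b_bits):
    """Bitwise AND of two independent streams approximates the product."""
    return [a & b for a, b in zip(a_bits, b_bits)]


def sc_scaled_add(a_bits, b_bits, select_bits):
    """A 2-to-1 multiplexer driven by a p = 0.5 select stream gives (a + b) / 2."""
    return [a if s else b for a, b, s in zip(a_bits, b_bits, select_bits)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    n = 20000
    a = to_stream(0.5, n, rng)
    b = to_stream(0.3, n, rng)
    sel = to_stream(0.5, n, rng)
    print("product estimate:", from_stream(sc_multiply(a, b)))
    print("scaled sum estimate:", from_stream(sc_scaled_add(a, b, sel)))
```

The results are estimates whose accuracy improves with stream length, which is the precision versus latency trade-off that comes with the small hardware footprint.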

Microprocessor Test Scenario Generation using Machine Learning Techniques

A group at Intel is working on x86 test content optimization and creation using ML techniques. A working solution already exists for test content optimization in production mode. The next stage of the project is to create new content automatically by learning from legacy content (since x86 is backward compatible, a huge legacy is available to learn from). Test optimization refers to the compilation of a test suite that achieves the...

Categories: 236381 | 236503 | Digital | Software

Architectural Exploration for Head of Line Blocking (HOLB) Solutions

The goal is to implement a generic system that allows adding input queues and output queues, with different parameters for enqueueing/dequeueing elements to/from the queues.

Categories: 236381 | Digital

Tags: 4220

An Advanced Hardware Accelerator for Gradient Descent for Deep-learning

An advanced scalable hardware accelerator for mini-batch gradient descent targets deep-learning applications. Deep neural networks are widely used in a large number of applications for analyzing and extracting useful information from the large amounts of data generated every day. Inference and training are the two modes of operation of a neural network. Training is the most computationally challenging task as it involves solving a large-scale optimization...

Categories: 236381 | Digital | Microprocessors

A Real-Time Night Vision Camera Control System

A novel low-resolution night vision camera is being developed at the Technion. It is based on a thermally isolated floating MOS transistor used to sense temperature changes resulting from external infrared radiation. When a constant voltage is applied to the transistor, its current signal follows the temperature variations. This current signal is read out and amplified before further processing. This is done by an integrated readout circuit (ROIC).

Categories: 236381 | Digital | General | Software

An All-Digital Demodulator for Implantable Medical Device

Devices implanted in the body communicate wirelessly with their external controllers. The modulation scheme of the transmitted signal affects the required bandwidth and, consequently, the noise and the error rate.

Categories: 236381 | Digital | General

Mixed Signal Readout System for IR Camera Based on Verilog/VHDL

The Technion's innovative TMOS sensors utilize widely available and affordable CMOS-SOI technology together with MEMS micromachining to achieve a breakthrough in passive IR imaging. The CMOS-SOI technology allows the integration of the 2D sensor focal plane array matrix with the analog readout, which is the subject of this project. In this project, you will design, implement and simulate a top-level architecture for an IR camera system that includes a 10x10 matrix of...

Categories: 236381 | Digital | General

Improvement of Power/Performance in RSA Encryption Using CNFET Technology

Although advances with silicon-based electronics continue to be made, alternative technologies are being explored. Digital circuits based on transistors fabricated from carbon nanotubes (CNTs) have the potential to outperform silicon by improving the energy–delay product, a metric of energy efficiency, by more than an order of magnitude. Hence, CNTs are an exciting complement to existing semiconductor technologies. In order to evaluate the potential of CNFETs to replace silicon CMOS technology,...

Categories: 236381 | Digital | General

Reverse Engineering Overcoming Scan Compression Structures

A vast majority of the modern digital VLSI devices utilize a technique called 'full scan' for production testing. This technique concatenates all the device registers (flip-flops or latches) in a few shift registers called 'scan chains'. In this configuration, a production tester may use the scan chains to drive logic values to the inputs of combinatorial circuits, sample the results from their outputs, output the results via the same scan...

Categories: 236381 | Digital | Encryption

Design For Testability (DFT) for Logic System by Scan

A vast majority of the modern digital VLSI devices utilize a technique called 'full scan' for production testing. This technique concatenates all the device registers (flip-flops or latches) in a few shift registers called 'scan chains'. In this configuration, a production tester may use the scan chains to drive logic values to the inputs of combinatorial circuits, sample the results from their outputs, output the results via the same scan...

Categories: 236381 | Digital | Encryption

Design of an RSA Public-Key Encryption Processor

The RSA algorithm stood out among asymmetric encryption systems as a conceptually simple and practical encryption and authentication method which provides a near perfect level of security. Public-key cryptographic systems, such as RSA, often involve modular exponentiation (Z = Y^e mod n). This widely used and computationally complex operation is performed using successive modular multiplications (C = AB mod n). The performance of such cryptosystems is primarily determined by...

Categories: 236381 | Digital | Encryption
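The RSA entry above notes that modular exponentiation is carried out as a sequence of modular multiplications. A minimal Python sketch of that idea, using the classic left-to-right square-and-multiply method, is shown below. The function name `mod_exp` is an assumption for this example, and the method is a textbook one; the description does not say which multiplication scheme the hardware design actually uses.

```python
def mod_exp(y, e, n):
    """Compute y**e mod n using only modular multiplications (sketch).

    Scans the exponent bits from most to least significant:
    square on every bit, multiply by the base when the bit is 1.
    """
    if n <= 0 or e < 0:
        raise ValueError("n must be positive and e non-negative")
    result = 1 % n
    base = y % n
    for bit in bin(e)[2:]:
        result = (result * result) % n      # modular squaring, C = A*A mod n
        if bit == "1":
            result = (result * base) % n    # modular multiply, C = A*B mod n
    return result


if __name__ == "__main__":
    # Toy parameters for illustration only; real RSA moduli are far larger.
    n, e, d = 3233, 17, 2753
    cipher = mod_exp(65, e, n)
    print(cipher, mod_exp(cipher, d, n))
```

The number of modular multiplications grows with the bit length of the exponent rather than with its value, so the speed of the single modular multiplier dominates overall performance.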

Design of an RSA Public-Key Encryption Processor

The RSA algorithm stood out among asymmetric encryption systems as a conceptually simple and practical encryption and authentication method which provides a near perfect level of security. Public-key cryptographic systems, such as RSA, often involve modular exponentiation (Z = Y^e mod n). This widely used and computationally complex operation is performed using successive modular multiplications (C = AB mod n). The performance of such cryptosystems is primarily determined by...

Categories: 236381 | Backend Design | Digital

Multi-Channel IO Scheduler For Flash Memory

Modern flash memories use floating-gate transistors aggressively scaled down to 19 nm. While a flash chip performs a read, write or erase command, it is occupied and cannot perform other commands in parallel. It is sometimes possible to suspend a command in the middle of its execution (to perform another one), but the penalty for resuming is a significant slowdown of command execution. The SSD architecture consists of multiple channels. Each has multiple...

Categories: 236381 | Digital | Memories

Implementation of an Advanced Error Correction Algorithm For Flash Memory

Modern flash memories use floating-gate transistors aggressively scaled down to 19 nm. As a result, data is often stored with errors due to inter-cell interference, coupling, random telegraph noise and more. The signal-to-noise ratio becomes even worse as density increases. To provide reliable data storage, the system controller employs error-correcting algorithms. In this project, the students will implement a design of an advanced error-correction encoder and decoder. The goal is to study and...

Categories: 236381 | Digital | Memories

NVMe Command Encoder/Decoder

The NVM Express (NVMe) specification was introduced in 2011 and is today the standard storage interface for Solid-State Drives (SSDs). The NVM Express specification defines a controller interface for PCIe SSDs used in enterprise and client applications. It is based on a queue mechanism with an advanced register interface, command set and feature set, including error logging, status, system monitoring (SMART, health) and firmware management. The southbridge is one...

Categories: 236381 | Digital | Memories

Write-Once Memory (WOM)

Write-Once Memory (WOM) codes transform information so that consecutive writes to the memory cause only uni-directional bit transitions. This property is useful for SSD memory since it reduces the number of program-erase cycles, which increases memory endurance and may also affect performance. In this project, the students will analyze the efficiency trade-offs of new WOM codes and compare their power, area and throughput. The goal is to implement a...

Categories: 236381 | Digital | Memories
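
To make the uni-directional write property concrete, the sketch below models the classic Rivest-Shamir WOM code, which stores a 2-bit value twice in three cells while only ever changing cells from 0 to 1. It is an assumed textbook example, not one of the new codes the project studies; the names `FIRST`, `decode` and `write` are hypothetical.

```python
# First-generation codewords: each 2-bit value maps to a pattern of weight <= 1.
FIRST = {0b00: (0, 0, 0), 0b01: (1, 0, 0), 0b10: (0, 1, 0), 0b11: (0, 0, 1)}


def decode(cells):
    """Recover the stored 2-bit value from three cells (works for both generations)."""
    a, b, c = cells
    return ((b ^ c) << 1) | (a ^ c)


def write(cells, value):
    """Return the new cell state for `value`, changing cells only from 0 to 1."""
    if decode(cells) == value:
        return cells                      # nothing to change
    if cells == (0, 0, 0):
        return FIRST[value]               # first write after an erase
    # Second write: use the bitwise complement of the first-generation codeword.
    new = tuple(1 - bit for bit in FIRST[value])
    if any(old > nxt for old, nxt in zip(cells, new)):
        raise ValueError("a 1 -> 0 transition is required: erase first")
    return new


cells = write((0, 0, 0), 0b01)   # first write
cells = write(cells, 0b10)       # second write, no erase in between
print(cells, decode(cells))
```

Two writes of two bits each fit into three cells before an erase is needed, whereas storing each write in fresh cells would take four.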

Architectural Exploration for Head of Line Blocking (HOLB) Solutions

Problem Definition: Network routers by nature handle thousands of megapackets per second. Each packet may arrive at one port and be destined for another. The actual routing decision is made only once the packet is received and inspected. This scheme, by definition, causes head-of-line blocking, in which one packet destined for a blocked destination completely blocks the input queue or the common processing pipeline. These kinds...

Categories: 236381 | Communication Chips | Digital
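
The blocking effect described above can be reproduced with a toy cycle-by-cycle model. The sketch compares a single FIFO input queue with per-destination queues (virtual output queues, a well-known mitigation that is assumed here for contrast; the excerpt does not say which solutions the project explores). The function names, port labels and timing are hypothetical.

```python
from collections import deque


def run_fifo(packets, blocked_until, cycles):
    """One shared input FIFO: only the head packet may leave in each cycle."""
    queue = deque(packets)
    delivered = []
    for cycle in range(cycles):
        if queue and cycle >= blocked_until.get(queue[0], 0):
            delivered.append((cycle, queue.popleft()))
    return delivered


def run_voq(packets, blocked_until, cycles):
    """One queue per destination: a blocked destination stalls only its own queue."""
    voqs = {}
    for dest in packets:
        voqs.setdefault(dest, deque()).append(dest)
    delivered = []
    for cycle in range(cycles):
        for dest, queue in voqs.items():
            if queue and cycle >= blocked_until.get(dest, 0):
                delivered.append((cycle, queue.popleft()))
                break  # the input still sends at most one packet per cycle
    return delivered


packets = ["A", "B", "B"]      # destinations, in arrival order
blocked_until = {"A": 3}       # port A accepts nothing before cycle 3
print("FIFO:", run_fifo(packets, blocked_until, cycles=5))
print("VOQ: ", run_voq(packets, blocked_until, cycles=5))
```

With the single FIFO, the packets for the free port B wait behind the packet for blocked port A; with per-destination queues they leave immediately.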

Implementation of a Digital OFDM Transceiver

Orthogonal Frequency Division Multiplexing (OFDM) is a Frequency Division Multiplexing (FDM) technique used as a digital multi-carrier modulation method. Instead of using one high-speed channel, the data is split into a large number of lower-speed channels. Orthogonal subcarriers carry the data on several parallel data streams, which allows more efficient use of the spectrum compared to regular FDM. Orthogonality of the carriers prevents interference between...

Categories: 236381 | Communication Chips | Digital
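
The idea of splitting data across orthogonal subcarriers can be sketched with an inverse discrete Fourier transform at the transmitter and a forward transform at the receiver. This is a minimal baseband model under stated assumptions: QPSK mapping, 32 subcarriers, an ideal channel, and no cyclic prefix or synchronization. The function names are hypothetical, and a direct O(N^2) transform is used instead of an FFT for brevity.

```python
import cmath


def idft(symbols):
    """Transmitter: place one symbol on each orthogonal subcarrier."""
    n = len(symbols)
    return [sum(symbols[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def dft(samples):
    """Receiver: correlate with each subcarrier to recover its symbol."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def qpsk(bits):
    """Map bit pairs to QPSK symbols (assumed mapping: 0 -> +1, 1 -> -1 per axis)."""
    return [complex(1 - 2 * bits[i], 1 - 2 * bits[i + 1])
            for i in range(0, len(bits), 2)]


bits = [(i * 7 + 3) % 2 for i in range(64)]   # arbitrary test bits, 2 per subcarrier
tx_symbols = qpsk(bits)                        # 32 symbols, one per subcarrier
time_signal = idft(tx_symbols)                 # one OFDM symbol in the time domain
rx_symbols = dft(time_signal)                  # ideal channel: nothing added

assert all(abs(r - s) < 1e-9 for r, s in zip(rx_symbols, tx_symbols))
```

Because the subcarriers are orthogonal over the symbol period, each receiver correlation picks out exactly one transmitted symbol even though all subcarriers overlap in time.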

Implementation of a Smallest Univalue Segment Assimilating Nucleus (SUSAN) Block

Edge and feature extraction is one of the most important first steps in computer vision. Its main objective is to find as many useful features in a scene as possible while keeping the output noise level to a minimum. Edge, corner and vertex detection processes serve to simplify the analysis of images by drastically reducing the amount of data to be processed. The SUSAN principle is the basis for algorithms to perform...

Categories: 236381 | Digital | Multimedia and Signal Processors

Design of an Arithmetic Logic Unit using Complementary Memristor Ratioed Logic

Memristors are resistive devices whose resistance varies with the voltage applied to the device. The most natural memristor application is memory. However, memristors can also be used for other applications, for example logic circuits. One such approach is MRL (Memristor Ratioed Logic), a hybrid CMOS-memristive logic family. In MRL, OR and AND logic gates are designed using memristors. The limitation of MRL is that every memristor-based logic...

Categories: 236381 | Digital | Microprocessors

Accelerator for Sparse Machine Learning

Sparse linear algebra is a frequent bottleneck in machine learning and data mining workloads. The efficient acceleration of sparse matrix calculations becomes even more critical when applied to big data problems.

Categories: 236381 | Digital | Microprocessors
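
A typical sparse kernel that such an accelerator might target is sparse matrix-vector multiplication. The sketch below uses the compressed sparse row (CSR) format, assumed here as a common representative; the excerpt does not name a storage format or kernel, and the function names are hypothetical.

```python
def to_csr(dense):
    """Convert a dense row-major matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))   # where the next row starts
    return values, col_idx, row_ptr


def spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x, touching only the stored non-zero entries."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for p in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[p] * x[col_idx[p]]   # indirect, data-dependent access to x
        y.append(acc)
    return y


dense = [[5, 0, 0],
         [0, 0, 2],
         [1, 0, 3]]
values, col_idx, row_ptr = to_csr(dense)
print(spmv(values, col_idx, row_ptr, [1, 2, 3]))
```

The indirect access `x[col_idx[p]]` is what makes this kernel memory-bound and irregular on general-purpose processors.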

Accelerator for GNSS Acquisition and Tracking

This project targets an advanced Global Navigation Satellite System (GNSS) accelerator that provides the end user with improved position, velocity and time solutions. High-performance conventional GPS/GNSS receivers rely on ASIC technology to implement massive correlators, as the performance of SDR solutions is still limited. With a reasonable distribution of tasks between the host hardware and reconfigurable peripherals, higher performance is achieved. The figure illustrates a schematic structure of a GNSS receiver, where the proposed...

Categories: 236381 | Digital | Microprocessors

Implementation of a Dedicated Data Processor for Real-Time Regulation of Machine-Based Trading in the Capital Market

Systems that trade securities automatically have fundamentally changed capital-market activity in recent years. Most trading on American exchanges is now conducted without any human involvement. Trading machines can be designed to trade stocks, options, futures and foreign-exchange products based on a predefined set of rules that determine when to buy, when to sell and how much money to invest in each traded product. Automated trading systems keep growing more sophisticated, processing data in ever-increasing volume and at an ever-increasing rate, together with a shortening of...

Categories: 236381 | Digital | Microprocessors

Design of Quad-Core Microprocessor

As VLSI manufacturing technologies progress, HW architects are constantly looking for ways to improve the overall performance of the CPU. In the past, many small-scale architectural improvements, as well as pipelining and other methods, were used to improve performance. Other methods included increasing the clock frequency and the width of the data bus, from 16 bits to 32, 64 and higher. As the manufacturing processes become more and more dense, and...

Categories: 236381 | Digital | Microprocessors

Design of an RSA Public-Key Encryption Processor

The RSA algorithm stands out among asymmetric encryption systems as a conceptually simple and practical encryption and authentication method that provides a near-perfect level of security. Public-key cryptographic systems such as RSA often involve modular exponentiation (Z = Y^e mod n). This widely used and computationally complex operation is performed using successive modular multiplications (C = A·B mod n). The performance of such cryptosystems is primarily determined by...

Categories: 236381 | Digital | Microprocessors

Implementation of a Smallest Univalue Segment Assimilating Nucleus (SUSAN) Block

The goal of this project is to develop a variation of the SUSAN feature detection algorithm that can be implemented with digital processing on special-purpose hardware. The algorithm will focus on corner detection.

Categories: 236381 | Digital | Multimedia and Signal Processors

Tags: 3184

3D Flash Memory Management

The goal of this project is to design an algorithm to detect and correct such errors. The scheme relies on a coding technique that incorporates the side information of fast detrapping during the encoding stage. The implementation includes MATLAB modeling, spec and architecture definition, logic design using the Verilog HDL, verification and synthesis.

Categories: 236381 | Digital

Tags: 4117

Emerging Memory Technology Controller

The goal of this project is to develop algorithms for performance enhancement and cost reduction and to implement them in HDL for the related memory controller. The implementation includes MATLAB modeling, spec and architecture definition, logic design using the Verilog HDL, verification and synthesis. The emphasis of this project will be on low design latency.

Categories: 236381 | Digital

Tags: 4116

Hack Proof RSA Public Key Encryption System

Description: In the field of cryptanalysis, the tools used to recover the secret information are very different from the ones used to build the cipher. For the most part, cryptanalysis is based on probabilistic Bayesian techniques. In this method, some information leaked from the system is exploited in order to derive a slight probability advantage of one code over another. Accordingly, after a sufficient number of ciphertext messages are analyzed,...

Categories: 236381 | Digital

Tags: 3891

Implementation of Matrix Inversion on GP-SIMD Processor

The goal of this project is to familiarize a future VLSI designer with a variety of cutting-edge parallel programming techniques while working with an advanced massively parallel in-memory computer.

Categories: 236381 | Digital

Implementation of PCIe Storage Protocols

In this project, the students will implement a hardware design for PC-to-storage communication using PCIe hardware standards and the emerging NVMe protocol.

Categories: 236381 | Digital | Memories

Implementation of a Digital OFDM Transceiver

The goal of this project is to design and implement an ASIC that includes a 32-channel OFDM transmitter and receiver.

Categories: 236381 | Digital

Backend Implementation of an OpenSPARC T1-Based SoC

Description: The project is an OpenSPARC T1-based SoC which includes:
– Full or reduced OpenSPARC T1 CPU core
– OpenSPARC FPU
– Bridge to connect the CPU and FPU to the Wishbone bus
– NOR flash controller
– UART
– OpenCores Ethernet controller
– Bridges from Wishbone to Altera and Xilinx DRAM controllers
The goal of this project is to perform the complete backend design of an OpenSPARC T1 microprocessor...

Categories: 236381 | Digital
