Microprocessors Archives

AXI Bus Scaler

The idea is to design an AXI bus down-sizer (for read and write transactions) from any power-of-two data-width source to any power-of-two data-width target, while maintaining (1) the integrity of the transactions, (2) the full data rate (bounded by the narrower side, of course) for a predefined level of outstanding transactions, and (3) the best optimization when accessing any endpoint IP. The project will address AXI3, AXI4, AXI5 flavors (same...
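
As a rough software illustration of the data-path side of down-sizing, the sketch below splits one wide data beat into several narrow beats and merges them back. It assumes the lowest byte lanes are sent first and ignores strobes, burst-length limits, and outstanding-transaction tracking. The function names `split_beat` and `merge_beats` are hypothetical and do not come from the project.

```python
def split_beat(data: int, src_bits: int, dst_bits: int) -> list[int]:
    """Split one wide data beat into narrow beats, lowest lanes first (assumed order)."""
    for width in (src_bits, dst_bits):
        # A power of two has exactly one bit set.
        if width <= 0 or width & (width - 1):
            raise ValueError("widths must be powers of two")
    if dst_bits > src_bits:
        raise ValueError("a down-sizer needs dst_bits <= src_bits")
    mask = (1 << dst_bits) - 1
    ratio = src_bits // dst_bits  # narrow beats produced per wide beat
    return [(data >> (i * dst_bits)) & mask for i in range(ratio)]


def merge_beats(beats: list[int], dst_bits: int) -> int:
    """Reassemble narrow beats (for example returned read data) into one wide beat."""
    word = 0
    for i, beat in enumerate(beats):
        word |= beat << (i * dst_bits)
    return word
```

In this model a 64-bit to 16-bit down-sizer emits four narrow beats per wide beat, which is why the narrower side bounds the achievable data rate.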

Design and Implementation of a Superscalar Hack Processor

This project aims to enhance the architecture of the simple "Hack Microprocessor". The goal is not simply to duplicate the data path but to put a lot of thought into the control path and architectural planning, to allow the implementation of an optimal architecture. Some challenges include efficient resolution of hazards and branch handling.

Categories: 236381 | Computer Architecture | Digital | Microprocessors

Implementation of an LOTR RISC-V Based System on Chip (SOC) on an FPGA

Project description: This project will focus on the implementation of an “Embedded System” which includes a System Verilog SOC design with cores, memory, accelerators, NOC (network on chip) etc. The students will work on FPGA Altera devices on which they will implement the LOTR-RISC-V fabric. Using the MMIO (Memory Mapped IO) UART/TAP interface, the students will enable the FPGA to communicate with the computer via terminal and Python scripts. This project...

Categories: Digital | Microprocessors

Novel Synthesis and Mapping of RRAM-based Stateful Logic on 3D RRAM

Background: Computing-in-memory (CiM) has been a potential solution to break the memory wall and energy wall brought by the conventional computer architecture that separates the computing units and memory units. RRAM-based stateful logic is a kind of CiM that can implement any function in an RRAM crossbar array. There are some efficient synthesis and mapping methods for 2D RRAM crossbar arrays. 3D RRAM crossbar arrays are denser and can support stateful...

Categories: 236381 | 236503 | Memories | Memristors | Microprocessors

Approximate Search CAM for DNA sequencing and Genome Analysis

Project description: How can we tell when a new mutation of COVID virus appears? We sequence DNA samples from many patients. These samples contain the host (patient’s) DNA as well as DNAs of multiple viruses and bacteria that live in our body and make their way to the sample. Then, we need to compare huge amounts of sequenced data with existing COVID strains and decide if there is a new...

Categories: 236381 | Digital | Microprocessors

Hardware Acceleration of Local Sensitivity Hashing for Genome Assembly

Project description: High-throughput sequencing has substantially changed the way biological research is performed since the early 2000s. These sequencing technologies obtain millions of short fragments (sequences) of DNA from a living organism to generate the organism’s DNA blueprint (genome). Thanks to these new DNA sequencing platforms, we can now investigate human genome diversity between populations, find genomic variants that are likely to cause diseases and even investigate the genomes of...

Categories: 236381 | Digital | Microprocessors

Accelerator for DNA Sequence Alignment

Project description: One of the most popular operations in personalized medicine is protein or DNA sequence database search based on pair-wise alignment, where a query sequence is compared with a database of sequences to find a highest-similarity sequence. This similarity can provide insights on the functionality of the query protein or the role of a gene. Conventional computer architecture is proven to be inefficient for personalized medicine tasks. For example,...

Categories: 236381 | Digital | Microprocessors
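
The pair-wise alignment search described in the entry above can be illustrated in software. The sketch below assumes a Smith-Waterman style local alignment with a linear gap penalty, which is one common formulation and is not named in the project description. The function names and the scoring values are hypothetical.

```python
def local_alignment_score(query: str, target: str,
                          match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between two sequences (illustrative scoring values)."""
    prev = [0] * (len(target) + 1)  # previous row of the dynamic-programming table
    best = 0
    for q in query:
        curr = [0]
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            # Local alignment never lets a cell drop below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def best_match(query: str, database: list[str]) -> str:
    """Return the database sequence with the highest similarity to the query."""
    return max(database, key=lambda seq: local_alignment_score(query, seq))
```

The table has one cell per pair of query and target characters, so the work grows with the product of the sequence lengths and is repeated for every database entry. That cost is what makes this search a candidate for hardware acceleration.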

The Design of a Ring Controller to Support a Multi-core RISC-V Implementation in a Ring Configuration

The goal of this project is to design and implement an RTL IP (System Verilog) that will enable multiple instances of RISC-V cores to be connected in a ring configuration. The IP will consist of two main interfaces – on the one side the “Core” and the other the “Ring”. The Ring Interface will manage the data transactions on the ring - pushing and pulling RD/WR/RD_RSP transactions to/from the ring....

Categories: 236381 | Computer Architecture | Digital | Microprocessors

The Design of the SW Stack for a Multi-Core RING Architecture + Proof of Concept “Distributed Computing”

The project is to develop the SW stack for the Multi-Core RING Architecture (in C, without any external libraries). In the project the students will design a SW library for “Distributed Computing” using the embedded RING architecture.

Categories: 236503 | Microprocessors | Software

The Design of A Secure (Oblivious) Memory

A standard solution to memory security is encrypting all data written to untrusted storage. A big problem with client-side encryption (and other systems that protect only the data itself) is that it does not protect all aspects of how the client interacts with the server's storage. Where storage is accessed, the access pattern can also reveal secret information. Suppose a patient stores his/her genome on a remote server and wishes to check...

Categories: 236381 | Digital | Memories | Microprocessors

Accelerator for Sparse Machine Learning

Sparse linear algebra is a frequent bottleneck in machine learning and data mining workloads. The efficient acceleration of sparse matrix calculations becomes even more critical when applied to big data problems. The goal is to implement an accelerator for multiplying a sparse matrix with a sparse vector. Current solutions fetch from memory all non-zero elements of the sparse matrix. The aim of this project is to implement a technique in...

Categories: 236381 | Digital | Machine Learning | Microprocessors

Tags: 4256
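
To make the baseline in the entry above concrete, the sketch below multiplies a sparse matrix by a sparse vector and counts how many stored matrix elements are read. It assumes a CSR layout for the matrix and a dictionary for the vector, and it models only the conventional approach that fetches every non-zero. The function name `spmspv_csr` is hypothetical, and the technique the project proposes is not shown.

```python
def spmspv_csr(indptr, indices, data, x):
    """Compute y = A @ x for a CSR matrix A and a sparse vector x (dict: index -> value).

    Returns the sparse result and the number of matrix non-zeros that were read.
    """
    y = {}
    fetched = 0
    for row in range(len(indptr) - 1):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            fetched += 1  # baseline: every stored non-zero of A is read
            xv = x.get(indices[k])
            if xv is not None:
                acc += data[k] * xv
        if acc != 0.0:
            y[row] = acc
    return y, fetched
```

With a vector that has a single non-zero, most of the fetched matrix elements contribute nothing to the result, which is the inefficiency the entry points at.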

Sort Algorithm for Memristive Memory Processing Unit

The Memristive Memory Processing Unit (mMPU) is a new process-in-memory computer architecture, which performs the computation without moving the data from the computer’s main memory (RAM). The goal of the project is to develop a sort algorithm to run on an mMPU which is based on emerging memory technology of ReRAM.

Electromigration-Aware Architecture for Modern Microprocessors

In this project, we propose a new architecture that significantly improves reliability by reducing EM impact while relaxing the physical design efforts and significantly extending microprocessor lifetime. It is based on the observation that in many cases EM reliability issues result from excessive write activities or signals toggling in a non-uniform manner. We will examine EM improvement to three main components of microprocessors: ALU execution unit, register file and...

Categories: Computer Architecture | Digital | Microprocessors

Dual Issue RISC-V Processor

RISC-V (pronounced "risk-five") is an open-source hardware instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles. The project began in 2010 at the University of California, Berkeley, but many contributors are volunteers not affiliated with the university. The goal of this project is to evaluate the performance gain provided by the dual-issue capability.

Categories: 236381 | Computer Architecture | Digital | Microprocessors

Risc-V Based Emerging Memory SW Emulator – System Design

Recently, many novel memory technologies have been emerging, for example ReRAM, STT-MRAM, and DRAM. All these technologies are very suitable for in-memory processing. Unfortunately, at present there are no actual devices, so simulating in-memory processing with these technologies is very difficult. The goal of this project is to provide a model which can be used to perform these simulations.

Categories: Digital | Memories | Microprocessors

Risc-V Based Emerging Memory SW Emulator

Recently, many novel memory technologies have been emerging, for example ReRAM, STT-MRAM, and DRAM. All these technologies are very suitable for in-memory processing. Unfortunately, at present there are no actual devices, so simulating in-memory processing with these technologies is very difficult. The goal of this project is to provide a model which can be used to perform these simulations.

Categories: Digital | Memories | Microprocessors

NMT – Near Memory Threading Using MTJ Based Multi-state Register

The goal of this project is to use multi-state registers to implement an efficient architecture of Continuous Flow Multi-Threading microprocessor.

Categories: 236381 | Digital | Memristors | Microprocessors

AES Add-on Processor for RISC-V

RISC-V (pronounced "risk-five") is an open-source hardware instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles. The project began in 2010 at the University of California, Berkeley, but many contributors are volunteers not affiliated with the university. RISC-V is a new architecture that is available under open, free and non-restrictive licences. It has widespread industry support from chip and device makers, and is designed to...

Categories: 236381 | Digital | Encryption | Microprocessors

Implementation of a RISC-V Processor

RISC-V (pronounced "risk-five") is an open-source hardware instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles. The project began in 2010 at the University of California, Berkeley, but many contributors are volunteers not affiliated with the university. The goal of this project is to study the RISC-V instruction set and then to design and implement a basic RISC-V microprocessor that supports all the instructions. Additional features will...

Categories: 236381 | Digital | Microprocessors

Power/ARM to RISC-V Assembly Converter

RISC-V is a classic RISC architecture rebuilt for modern times. At its heart is an array of 32 registers containing the processor's running state, the data being immediately operated on, and housekeeping information. RISC-V comes in 32-bit and 64-bit variants, with register size changing to match. A large amount of code has been developed and written at IBM in assembly for the PowerPC processor for which no C source-code exists....

Categories: 236381 | 236503 | Computer Architecture | Microprocessors | Software

Accelerator for DNA Sequence Alignment

One of the most popular operations in personalized medicine is protein or DNA sequence database search based on pair-wise alignment, where a query sequence is compared with a database of sequences to find a highest-similarity sequence. This similarity can provide insights on the functionality of the query protein or the role of a gene. Conventional computer architecture is proven to be inefficient for personalized medicine tasks. For example, aligning even...

Categories: 236381 | Digital | Microprocessors

Accelerator for Machine Learning System

This project proposes building a dedicated accelerator that efficiently performs RAM-to-RAM calculations in hardware in a pipelined fashion, thereby dramatically reducing CPU load for machine-learning software applications.

Categories: Digital | Machine Learning | Microprocessors

DNA Sequencing Accelerator For Long Read

One of the most popular operations in personalized medicine is protein or DNA sequence database search based on pair-wise alignment, where a query sequence is compared with a database of sequences to find a highest-similarity sequence. OLC-based assembly algorithms focus on finding the read-to-read overlaps, defined to be a common sequence between two reads. A read-to-read overlap is a sequence match between two reads, and occurs when local regions on...

Categories: 236381 | Digital | Microprocessors

Design of Microprocessor using Fast Path-Based Neural Branch Prediction

Modern computer architectures increasingly rely on speculation to boost instruction-level parallelism. One of the common methods is branch prediction. There are several ways to predict whether a branch is taken or not taken, which significantly reduce the branch penalty. In this project we will develop a branch predictor that is based on a neural network. The Fast Path-Based Neural Branch Prediction can reach a 5% to 7% misprediction rate depending on...

Categories: 236381 | Digital | Microprocessors
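
For intuition about neural branch prediction, the sketch below models a plain perceptron predictor driven by global branch history. This is a simpler relative of the fast path-based scheme named in the entry above and is not that scheme. The class name, table size, history length, and training threshold are all illustrative assumptions.

```python
class PerceptronPredictor:
    """Toy perceptron branch predictor: one weight vector per table entry."""

    def __init__(self, history_len: int = 8, table_size: int = 64):
        self.history = [1] * history_len  # +1 = taken, -1 = not taken
        self.weights = [[0] * (history_len + 1) for _ in range(table_size)]
        self.threshold = 2 * history_len + 14  # illustrative training threshold

    def _output(self, pc: int) -> int:
        w = self.weights[pc % len(self.weights)]
        # w[0] is the bias weight; the rest are paired with history bits.
        return w[0] + sum(wi * hi for wi, hi in zip(w[1:], self.history))

    def predict(self, pc: int) -> bool:
        return self._output(pc) >= 0

    def update(self, pc: int, taken: bool) -> None:
        y = self._output(pc)
        t = 1 if taken else -1
        # Train on a misprediction or when the output is not yet confident.
        if (y >= 0) != taken or abs(y) <= self.threshold:
            w = self.weights[pc % len(self.weights)]
            w[0] += t
            for i, hi in enumerate(self.history):
                w[i + 1] += t * hi
        self.history = self.history[1:] + [t]
```

A typical use is to call `predict(pc)` when the branch is fetched and `update(pc, taken)` once its outcome is known.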

Design of Microprocessor with a Decoded Instruction Cache

A CISC decoder is typically set up as a state machine. The machine reads the opcode field to determine what type of instruction it is, and where the other data values are. The instruction word is read in piece by piece, and decisions are made at each stage as to how the remainder of the instruction word will be read. One method to alleviate this is to use a decoded...

Categories: 236381 | Digital | Microprocessors

Implementation of a DNA Sequencing Accelerator

In this project, we will design a stand-alone accelerator for the 3rd generation DNA sequence basecalling for personalized medicine applications.

Categories: 236381 | Digital | Machine Learning | Microprocessors

Design and Implementation of Posit: A Novel Floating Point Format

Design and implementation of a Posit Arithmetic Unit supporting the new posit format, focusing on the stages of the regular VLSI design process, namely architecture, HDL implementation, simulation, synthesis and layout.

Categories: 236381 | Digital | Microprocessors

Cyber Protection Chip

The goal of this project is the development of an autonomous cyber protection chip for computer systems and communication channels linked to the cloud. Background: Current technology drives the accelerated development of computer components with increasing processing capabilities, bandwidth and high level of connectivity between components that maintain a constant link to the cloud. Such systems present a significant challenge in protecting the proper operation of the components. The purpose...

Categories: 236381 | Digital | Microprocessors

Systolic Array For Deep Learning

A systolic array is a homogeneous array of identical processors, each performing the same function and each connected to several neighbours. Such a structure is very suitable for fast and efficient implementation of machine learning algorithms. The goal of this project is to design and implement an architecture for the computation of the convolution stage of a neural network for deep learning.

Categories: 236381 | Digital | Machine Learning | Microprocessors

Architectural Simulator for Object Oriented Processor

The purpose of this project is to simulate the effect of changing several architecture components on the overall performance of the OOPc processor. Such parameters include the number of cores, the number of simultaneous threads which can run on a core, the sizes of the internal memories and caches, and the network-on-chip topology.

Deep Convolutional Neural Networks (DCNNs) For Embedded and Portable Systems

Stochastic Computing (SC), which uses a bit-stream to represent a number within [-1, 1] by counting the number of ones in the bit-stream, has high potential for implementing CNNs with ultra-low hardware footprint. Since multiplications and additions can be calculated using AND gates and multiplexers in SC, significant reductions in power (energy) and hardware footprint can be achieved compared to the conventional binary arithmetic implementations. In this project we will...

Categories: 236381 | Digital | Microprocessors
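
The gate-level arithmetic mentioned in the stochastic computing entry above can be tried out in software. The sketch below assumes the unipolar encoding, where a value in [0, 1] is the fraction of ones in the stream; with the bipolar [-1, 1] encoding, multiplication is usually done with an XNOR gate instead of AND. All function names are hypothetical.

```python
import random


def to_stream(p: float, n: int, rng: random.Random) -> list[int]:
    """Encode a value p in [0, 1] as n random bits with P(bit = 1) = p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def from_stream(bits: list[int]) -> float:
    """Decode a stream by counting the fraction of ones."""
    return sum(bits) / len(bits)


def sc_multiply(a: list[int], b: list[int]) -> list[int]:
    """Bitwise AND of two independent streams approximates the product."""
    return [x & y for x, y in zip(a, b)]


def sc_scaled_add(a: list[int], b: list[int], select: list[int]) -> list[int]:
    """A 2-to-1 multiplexer approximates (a + b) / 2 when select encodes 0.5."""
    return [y if s else x for x, y, s in zip(a, b, select)]
```

The results are only approximate and depend on stream length and on the input streams being independent, which is the accuracy trade-off behind the small hardware footprint.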

An Advanced Hardware Accelerator for Gradient Descent for Deep-learning

An advanced, scalable hardware accelerator for mini-batch gradient descent that targets deep-learning applications. Deep neural networks are widely used in a large number of applications for analyzing and extracting useful information from the large amounts of data generated every day. Inference and training are the two modes of operation of a neural network. Training is the most computationally challenging task as it involves solving a large-scale optimization...

Categories: 236381 | Digital | Microprocessors

Design of an Arithmetic Logic Unit using Complementary Memristor Ratioed Logic

Memristors are resistive devices whose resistance varies with the voltage applied to the device. The most natural memristor application is memory. However, memristors can also be used for other applications, for example logic circuits. One such approach is MRL (Memristor Ratioed Logic), a hybrid CMOS-memristive logic family. In MRL, OR and AND logic gates are designed using memristors. The limitation of MRL is that every memristor-based logic...

Categories: 236381 | Digital | Microprocessors

Accelerator for Sparse Machine Learning

Sparse linear algebra is a frequent bottleneck in machine learning and data mining workloads. The efficient acceleration of sparse matrix calculations becomes even more critical when applied to big data problems.

Categories: 236381 | Digital | Microprocessors

Accelerator for GNSS Acquisition and Tracking

An advanced Global-Navigation-Satellite-System (GNSS) accelerator, which provides the end user with improved position, velocity and time solutions. High performance conventional GPS/GNSS receivers rely on ASIC technology to implement massive correlators, as the performance of SDR solutions is still limited. With a reasonable distribution of tasks between the host hardware and reconfigurable peripherals, a higher performance is achieved. The figure illustrates a schematic structure of a GNSS receiver, where the proposed...

Categories: 236381 | Digital | Microprocessors

Implementation of a Dedicated Data Processor for Real-Time Regulation of Machine-Based Trading in the Capital Market

Systems that trade securities automatically have fundamentally changed capital-market activity in recent years. Most trading on the American exchanges is now conducted without any human involvement. Trading machines can be designed to trade stocks, options, futures, and foreign-exchange products based on a predefined set of rules that determine when to buy, when to sell, and how much money to invest in each traded product. Automated trading systems are becoming increasingly sophisticated, processing data in ever-growing volume and at an ever-growing rate, along with shortening...

Categories: 236381 | Digital | Microprocessors

Design of Quad-Core Microprocessor

As the manufacturing technologies of VLSI progress, HW architects are constantly looking for ways to improve the overall performance of the CPU. In the past, many small-scale architecture improvements, as well as pipelines and other methods, were used to improve performance. Other methods were increasing the clock frequency and the width of the data bus, from 16 bits to 32, 64 and higher. As the manufacturing processes become more and more dense, and...

Categories: 236381 | Digital | Microprocessors

Design of an RSA Public-Key Encryption Processor

The RSA algorithm stood out among asymmetric encryption systems as a conceptually simple and practical encryption and authentication method which provides a near-perfect level of security. Public-key cryptographic systems such as RSA often involve modular exponentiation (Z = Y^e mod n). This widely used and computationally complex operation is performed using successive modular multiplications (C = AB mod n). The performance of such cryptosystems is primarily determined by...

Categories: 236381 | Digital | Microprocessors
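
The RSA entry above notes that modular exponentiation is built from successive modular multiplications. The sketch below shows this with the binary square-and-multiply method, which is one standard way to do it and is not necessarily the method used in the project. The function names `mod_mul` and `mod_exp` are hypothetical.

```python
def mod_mul(a: int, b: int, n: int) -> int:
    """One modular multiplication, C = A * B mod n."""
    return (a * b) % n


def mod_exp(y: int, e: int, n: int) -> int:
    """Compute Z = Y^e mod n by scanning the exponent bits, least significant first."""
    result = 1 % n
    base = y % n
    while e > 0:
        if e & 1:
            result = mod_mul(result, base, n)  # multiply step for a set bit
        base = mod_mul(base, base, n)          # square step for every bit
        e >>= 1
    return result
```

The loop performs at most two modular multiplications per exponent bit, so the speed of the modular multiplier dominates the overall throughput.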