• The idea is designing an AXI bus down-sizer (for Read and Write transactions) from any (power-of-two bits) data width source to any (power-of-two bits) data width target, while maintaining (1) the integrity of the transactions, (2) the full data rate (enforced by the narrower side of course) for a predefined outstanding transactions level, and (3) the best optimization accessing any endpoint IP.The project will address AXI3, AXI4, AXI5 flavors (same...
  • Project description: This project will focus on the implementation of an “Embedded System” which includes a System Verilog SOC design with cores, memory, accelerators, NOC (network on chip) etc. The students will work on FPGA Altera devices on which they will implement the LOTR-RISC-V fabric. Using the MMIO(Memory Mapped IO) UART/TAP interface the student will enable the FPGA to communicate with the computer via terminal and Python scripts. This project...
    Categories: |
  • Background: Computing-in-memory (CiM) has been a potential solution to break the memory wall and energy wall brought by the conventional computer architecture that separates the computing units and memory units. RRAM-based stateful logic is a kind of CiM that could implement any function in RRAM crossbar array. There are some efficient synthesis and mapping methods for 2D RRAM crossbar array. 3D RRAM crossbar arrays are denser and can support stateful...
  • Project description: How can we tell when a new mutation of COVID virus appears? We sequence DNA samples from many patients. These samples contain the host (patient’s) DNA as well as DNAs of multiple viruses and bacteria that live in our body and make their way to the sample. Then, we need to compare huge amounts of sequenced data with existing COVID strains and decide if there is a new...
    Categories: | |
  • Project description: High-throughput sequencing have substantially changed the way biological research is performed since the early 2000s.  These sequencing technologies obtain millions of short fragments (sequences) of DNA from a living organism to generate the organism’s DNA blueprint (genome). Thanks to these new DNA sequencing platforms, we can now investigate human genome diversity between populations, find genomic variants that are likely to cause diseases and even investigate the genomes of...
    Categories: | |
  • Project description: One of the most popular operations in personalized medicine is protein or DNA sequence database search based on pair-wise alignment, where a query sequence is compared with a database of sequences to find a highest-similarity sequence. This similarity can provide insights on the functionality of the query protein or the role of a gene. Conventional computer architecture is proven to be inefficient for personalized medicine tasks. For example,...
    Categories: | |
  • The goal of this project is to design and implement an RTL IP (System Verilog) that will enable multiple instances of RISC-V cores to be connected in a ring configuration. The IP will consist of two main interfaces – on the one side the “Core” and the other the “Ring”. The Ring Interface will manage the data transactions on the ring - pushing and pulling RD/WR/RD_RSP transactions to/from the ring....
  • A standard solution to memory security is encrypting all data written to untrusted storage. A big problem with client-side encryption (and other systems that protect only the data itself) is that it does not protect all aspects of how the client interacts with the server's storage. Where storage is accessed, the access pattern can also reveal secret information. Suppose a patient stores his/her genome on a remote server and wishes to check...
  • Sparse linear algebra is a frequent bottleneck in machine learning and data mining workloads. The efficient acceleration of sparse matrix calculations becomes even more critical when applied to big data problems. The goal is to implement an accelerator for multiplying a sparse matrix with a sparse vector. Current solutions fetch from memory all non-zero elements of the sparse matrix. The aim of this project is to implement a technique in...
    Tags:
  • In this project, we propose a new architecture that significantly improves reliability by reducing EM impact while relaxing the physical design efforts and significantly extending microprocessor lifetime. It is based on the observation that in many cases EM reliability issues result from excessive write activities or signals toggling in a non uniform manner.  We will examine EM improvement to 3 main components of microprocessors: ALU execution unit, register file and...
  • RISC-V (pronounced "risk-five") is an open-source hardware instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles.The project began in 2010 at the University of California, Berkeley, but many contributors are volunteers not affiliated with the university. The goal of this project is to evaluate the enhanced performance of the double issue capability.
  • Recently many novel memory technologies are emerging. For example ReRAM, STT-MRAM, DRAM. All these technologies are very suitable for  in memory processing. Unfortunately, at the present time, there  are no actual devices and so simulating in-memory processing with these technologies is very difficult. The goal of this project is to provide a model which can be used to perform these simulations.
    Categories: | |
  • Recently many novel memory technologies are emerging. For example ReRAM, STT-MRAM, DRAM. All these technologies are very suitable for  in memory processing. Unfortunately, at the present time, there  are no actual devices and so simulating in-memory processing with these technologies is very difficult. The goal of this project is to provide a model which can be used to perform these simulations.
    Categories: | |
  • RISC-V (pronounced "risk-five") is an open-source hardware instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles.The project began in 2010 at the University of California, Berkeley, but many contributors are volunteers not affiliated with the university. RISC-V, pronounced 'Risk-Five', is a new architecture that is available under open, free and non-restrictive licences. It has widespread industry support from chip and device makers, and is designed to...
  • RISC-V (pronounced "risk-five") is an open-source hardware instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles.The project began in 2010 at the University of California, Berkeley, but many contributors are volunteers not affiliated with the university. The goal of this project is to study the RISC-V instruction set and then to design and implement a basic RISC-V microprocessor that supports all the instructions. Additional features will...
    Categories: | |
  • RISC-V is a classic RISC architecture rebuilt for modern times. At its heart is an array of 32 registers containing the processor's running state, the data being immediately operated on, and housekeeping information.  RISC-V comes in 32-bit and 64-bit variants, with register size changing to match. A large amount of code has  been developed and written at IBM in assembly for the PowerPC processor for which no C source-code exists....
  • One of the most popular operations in personalized medicine is protein or DNA sequence database search based on pair-wise alignment, where a query sequence is compared with a database of sequences to find a highest-similarity sequence. This similarity can provide insights on the functionality of the query protein or the role of a gene. Conventional computer architecture is proven to be inefficient for personalized medicine tasks. For example, aligning even...
    Categories: | |
  • One of the most popular operations in personalized medicine is protein or DNA sequence database search based on pair-wise alignment, where a query sequence is compared with a database of sequences to find a highest-similarity sequence. OLC-based assembly algorithms focus on finding the read-to-read overlaps, defined to be a common sequence between two reads. A read-to-read overlap is a sequence match between two reads, and occurs when local regions on...
    Categories: | |
  • Modern computer architectures increasingly rely on speculation to boost instruction-level parallelism. One of the common methods is the branch prediction. There are several ways to predict whether a branch is taken or not-taken, which significantly reduce the penalty of the branch. In this project we will develop a branch prediction that is bases on neural-network. The Fast Path-Based Neural Branch Prediction can reach 5% to 7% percent misprediction depending on...
    Categories: | |
  • A CISC decoder is typically set up as a state machine. The machine reads the opcode field to determine what type of instruction it is, and where the other data values are. The instruction word is read in piece by piece, and decisions are made at each stage as to how the remainder of the instruction word will be read. One method to alleviate this is to use a decoded...
    Categories: | |
  • The goal of this project is the development of an autonomous cyber protection chip for computer systems and communication channels linked to the cloud. Background: Current technology drives the accelerated development of computer components with increasing processing capabilities, bandwidth and high level of connectivity between components that maintain a constant link to the cloud. Such systems present a significant challenge in protecting the proper operation of the components. The purpose...
    Categories: | |
  • A systolyic array is an homogenous array of identical processors each performing the same function and each connected to several neighbours. Such a structure is very suitable for fast and efficient implementation of machine learning algorithms. The goal of this project is to design and implement an architecture for the computation of the convolution stage of a neural network for deep learning.
  • Stochastic Computing (SC), which uses a bit-stream to represent a number within [-1, 1] by counting the number of ones in the bit-stream, has high potential for implementing CNNs with ultra-low hardware footprint. Since multiplications and additions can be calculated using AND gates and multiplexers in SC, significant reductions in power (energy) and hardware footprint can be achieved compared to the conventional binary arithmetic implementations. In this project we will...
    Categories: | |
  • An advanced scalable hardware accelerator for mini batch gradient descent, targets deep-learning applications. Deep neural networks are being widely used in a large number of applications for analyzing and extracting useful information from large amount of data that is being generated every day. Inference and training are the two modes of operation of a neural network. Training is the most computationally challenging task as it involves solving a large-scale optimization...
    Categories: | |
  • Memristors are resistive devices with varying resistance which depends on the voltage applied to the device. The most natural memristor application is memory. However memristors can also be used for other applications, for example logic circuits. Once such approach is MRL (Memristor Ratioed Logic) - a hybrid CMOS-memristive logic family. In MRL, OR and AND logic gates are designed using memristors. The limitation of MRL is that every memristor-based logic...
    Categories: | |
  • An advanced Global-Navigation-Satellite-System (GNSS) accelerator, which provides the end user with improved position, velocity and time solutions. High performance conventional GPS/GNSS receivers rely on ASIC technology to implement massive correlators, as the performance of SDR solutions is still limited. With a reasonable distribution of tasks between the host hardware and reconfigurable peripherals, a higher performance is achieved. The figure illustrates a schematic structure of a GNSS receiver, where the proposed...
    Categories: | |
  • מערכות הסוחרות באופן אוטומטי בניירות ערך שינו מן היסוד את פעילות שוק ההון בשנים האחרונות. רוב המסחר בבורסות האמריקאיות מתנהל כיום ללא כל מעורבות אנושית. מכונות המסחר יכולות להיות מתוכננות לסחור במניות, אופציות, חוזים עתידיים ומוצרי מט"ח המבוססים על אוסף של כללים מוגדר מראש  הקובעים מתי לקנות, מתי למכור וכמה כסף להשקיע בכל מוצר מסחר. מערכות המסחר האוטומטיות הולכות ומשתכללות תוך עיבוד נתונים בכמות ובקצב הולכים וגדלים יחד עם קיצור...
    Categories: | |
  • As the manufacturing technologies of VLSI progresses, HW architects are constantly looking for ways to improve overall performance of the CPU. In the past, many small scale architecture improvements, as well as pipelines, and other methods were used to improve performance. Other methods were increasing clock frequency and the width of data-bus, from 16 bit to 32, 64 and higher. As the manufacturing processes become more and more dense,  and...
    Categories: | |
  • The RSA algorithm stood out among asymmetric encryption systems as a conceptually simple and practical encryption and authentication method which provides a near perfect level of security. Public-key cryptographic systems, such as the RSA often involve modular exponentiation (Z = Ye mod n). This widely used and computational complex operation is  performed using successive modular multiplications (C = AB mod n). The performance of such cryptosystems is primarily determined by...
    Categories: | |