The goal of this project is to familiarize a future VLSI designer with a variety of cutting-edge parallel programming techniques while working with an advanced, massively parallel in-memory computer.

Machine learning, data mining, network routing, search engines, and other big-data applications can be significantly sped up by massively parallel SIMD machines. However, data transfer between processing units (PUs) and memory severely limits the performance of conventional SIMD architectures; off-chip memory bandwidth is thus one limit on conventional SIMD scalability.
Under optimal operating conditions (when most of the time is spent on computation rather than on data transfer), the arrays of computing elements in SIMD processors are highly active, resulting in high power density and hotspots. These in turn create additional design constraints, such as heat dissipation, power delivery, and high leakage power. Hotspots and irregular thermal density are the other limit on conventional SIMD scalability.
A GP-SIMD processor can be a viable alternative to conventional SIMD processors, capable of performing a wide range of parallel data-processing tasks. GP-SIMD provides both massively parallel in-memory computing and data storage at the same time: it comprises a modified RAM that supports processing in addition to random access. As such, a GP-SIMD processor offers regular thermal density and lower peak temperatures, enabling multilayer processor stacking and 3D DRAM integration.
A GP-SIMD processor makes it possible to:
• Improve thermal density, reduce peak temperature, and eliminate or considerably reduce hotspots;
• Combine data storage and data processing;
• Reduce the performance degradation caused by massive data transfers between SIMD processing units and memory;
• Eliminate the energy wasted on those data transfers.
Requirements:
In this project, you will implement an algorithm for inverting very large matrices. The project involves interesting research and includes:
– Researching existing matrix inversion methods and algorithms;
– Developing a suitable matrix inversion algorithm and demonstrating that it is superior in performance and power;
– Implementing the new matrix inversion algorithm on the highly parallel computer described above;
– Demonstrating the algorithm on a hardware model.
You will become familiar with a state-of-the-art design flow, including:
• Massively parallel programming
• Implementation of machine learning algorithms (Matlab, C or C++)
• Performance and speedup analysis
Prerequisites: Logic Design
For more information, please contact Goel Samuel, Room 711, Mayer Building, tel. 4668, goel@ee.technion.ac.il
To view the VLSI projects classified by VLSI area, see the VLSI lab site:
http://webee.technion.ac.il/vlsi/Info/Projects/Projects_Projects_List_Main.html