A compiler-driven supercomputer
Abstract
The overall performance of supercomputers is low relative to the speed of their underlying logic technology. This discrepancy stems from several bottlenecks: memories are slower than the CPU, conditional jumps limit the usefulness of pipelining and pre-fetching mechanisms, and functional-unit parallelism is limited by the speed of hardware scheduling. This paper describes a supercomputer architecture called Ring of Pre-fetch Elements (ROPE) that attempts to solve the problems of memory latency and conditional jumps without hardware scheduling. ROPE consists of a heavily pipelined CPU data path with a new instruction pre-fetching mechanism that supports general multi-way conditional jumps. An optimizing compiler based on a global code transformation technique, Percolation Scheduling (PS), gives high performance without scheduling hardware.
Review history: Available online 1 April 2002.
Official paper URL: https://doi.org/10.1016/0096-3003(86)90128-1