Pointer-Based Instruction Queue for Out-of-Order Processors

Background: Out-of-order processors issue instructions as soon as their source operands are available, rather than in strict program order. The component central to out-of-order processing is the Instruction Queue (IQ), and its performance is critical to overall processor speed and power consumption. For instance, the issue logic has been reported to consume nearly 25% of CPU power.

The IQ operates as follows. First, instructions are allocated into the IQ, where they wait for their operands; allocation is the process of writing the necessary information into the IQ memory. The wakeup logic detects when each operand of an instruction becomes ready, and an instruction is marked ready when all of its operands are ready. The select logic then chooses a subset of the ready instructions for execution. Finally, a part of the IQ memory, the payload RAM, is read to issue each selected instruction to its execution unit. For instructions with one-cycle execution latency, wakeup and select together must complete in one cycle to avoid IPC loss.

Two types of wakeup logic have been used in modern processors: CAM-based logic and dependency-matrix-based logic. The former was used in the MIPS R10K processor, while the latter was described in an Intel patent. Neither approach scales well with instruction queue size and issue width. A third approach, pointer-based wakeup, has been proposed but never implemented. It stores one or more pointers to dependent instructions in each IQ entry. The wakeup logic uses no CAMs; instead, each pointer is used to access the IQ and set the source operand Ready bits of the dependent instruction.
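The generic pointer-based wakeup idea described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design or the licensed mechanism: the `Entry` class, the `select`, `wakeup`, and `run` functions, the oldest-first selection policy, and the one-cycle execution latency are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    name: str
    src_ready: list                                  # one Ready bit per source operand
    dependents: list = field(default_factory=list)   # (entry index, operand index) pointers
    issued: bool = False

def select(queue, issue_width):
    """Pick up to issue_width un-issued entries whose operands are all ready (oldest first)."""
    ready = [i for i, e in enumerate(queue) if not e.issued and all(e.src_ready)]
    return ready[:issue_width]

def wakeup(queue, issued_indices):
    """Follow each issued entry's dependent pointers and set the consumers' Ready bits."""
    for i in issued_indices:
        queue[i].issued = True
        for entry_idx, operand_idx in queue[i].dependents:
            queue[entry_idx].src_ready[operand_idx] = True

def run(queue, issue_width=2):
    """Return the per-cycle issue schedule as lists of instruction names.

    Assumes every instruction has one-cycle latency, so consumers woken in
    one cycle can be selected in the next.
    """
    schedule = []
    while not all(e.issued for e in queue):
        chosen = select(queue, issue_width)
        if not chosen:
            raise RuntimeError("no ready instruction: dependency pointers are inconsistent")
        wakeup(queue, chosen)
        schedule.append([queue[i].name for i in chosen])
    return schedule

# i0: r1 = load; i1: r2 = r1 + 4; i2: r3 = load; i3: r4 = r2 * r3
queue = [
    Entry("i0", [True], dependents=[(1, 0)]),
    Entry("i1", [False], dependents=[(3, 0)]),
    Entry("i2", [True], dependents=[(3, 1)]),
    Entry("i3", [False, False]),
]
print(run(queue))  # [['i0', 'i2'], ['i1'], ['i3']]
```

Note that no tag comparison against every entry takes place: each issued instruction touches only the entries its pointers name, which is the property that lets this style of wakeup avoid CAMs.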
Earlier pointer-based IQ proposals had difficulty handling instructions with multiple successors and recovering from branch misprediction.

Technology: University researchers have developed a different approach to pointer-based wakeup, called "direct wakeup", which overcomes these problems and allows large IQ implementations while maintaining a one-cycle wakeup-select loop. It handles multiple dependents by using a small number of full dependency vectors. This almost completely eliminates stalls for instructions with multiple successors while requiring fewer resources than a full dependency matrix. CAMs are not used at all, which saves power and improves scalability. The second major problem, correct recovery of dependent pointers on branch misprediction, is also solved: the approach checkpoints a small amount of additional information on each conditional branch, using the destination physical register tag as a unique identifier of an instruction.

Application: This invention can be used to design out-of-order processors with lower power consumption, better instruction queue scalability, and faster clock speeds.
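One way to picture how a small pool of full dependency vectors could absorb instructions with multiple successors is the software model below. It is a sketch under assumptions, not the researchers' design: the listing does not say how vectors are allocated or freed, so the `DependentTracker` class, the single inline pointer per entry, the allocate-on-second-dependent policy, and the stall-when-exhausted behaviour are all hypothetical.

```python
class DependentTracker:
    """Tracks consumers of each producer: one inline pointer, then a shared bit vector."""

    def __init__(self, iq_size, num_vectors):
        self.iq_size = iq_size
        self.pointer = [None] * iq_size       # single inline dependent pointer per entry
        self.vector_of = [None] * iq_size     # index of the vector owned by an entry, if any
        self.vectors = [0] * num_vectors      # each vector holds one bit per IQ entry
        self.free_vectors = list(range(num_vectors))

    def add_dependent(self, producer, consumer):
        """Record consumer as a dependent of producer; return False if allocation must stall."""
        if self.vector_of[producer] is not None:
            self.vectors[self.vector_of[producer]] |= 1 << consumer
            return True
        if self.pointer[producer] is None:
            self.pointer[producer] = consumer
            return True
        if not self.free_vectors:
            return False                      # assumed policy: stall until a vector is freed
        v = self.free_vectors.pop()
        self.vector_of[producer] = v
        self.vectors[v] = 1 << consumer
        return True

    def dependents(self, producer):
        """List every IQ entry that must be woken when producer issues."""
        out = []
        if self.pointer[producer] is not None:
            out.append(self.pointer[producer])
        v = self.vector_of[producer]
        if v is not None:
            out += [i for i in range(self.iq_size) if (self.vectors[v] >> i) & 1]
        return out

    def release(self, producer):
        """Free the producer's pointer and vector once its dependents have been woken."""
        self.pointer[producer] = None
        v = self.vector_of[producer]
        if v is not None:
            self.vectors[v] = 0
            self.free_vectors.append(v)
            self.vector_of[producer] = None

tracker = DependentTracker(iq_size=8, num_vectors=1)
for consumer in (1, 2, 3):
    tracker.add_dependent(0, consumer)
print(tracker.dependents(0))         # [1, 2, 3]
tracker.add_dependent(4, 5)
print(tracker.add_dependent(4, 6))   # False: the only vector is owned by entry 0
tracker.release(0)
print(tracker.add_dependent(4, 6))   # True: the freed vector is reused
```

In this model the storage cost is one pointer per entry plus `num_vectors` bit vectors of `iq_size` bits, compared with `iq_size` such vectors for a full dependency matrix, which is the kind of saving the description refers to.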

Type of Offer: Licensing


