Instruction pipeline – Knowledge and References

I

Published in Philip A. Laplante, Comprehensive Dictionary of Electrical Engineering, 2018

instruction of electrical engineers and computer scientists. The world's largest professional organization for engineers.
instruction: specification of a collection of operations that may be treated as an atomic entity with a guarantee of no dependencies between these operations.
instruction access fault: a fault, signaled in the processor, related to abnormal instruction fetches.
instruction cache: See code cache.
instruction format: the specification of the number and size of all possible instruction fields in an instruction-set architecture.
instruction issue: the sending of an instruction to functional units for execution.
instruction pipeline: a structure that breaks the execution of an instruction into multiple phases and executes separate instructions in each phase simultaneously.
instruction pointer: another name for program counter, the processor register holding the address of the next instruction to be executed.
instruction pool: in modern CPU implementations, a holding area in which instructions that have been fetched by an instruction fetch unit await access to an execution unit.
instruction prefix: a field within a program instruction word used for some special purpose. Found only rarely. The Intel x86 architecture occasionally uses instruction prefixes to override certain CPU addressing conventions.
instruction reordering: a technique in which the CPU executes instructions in an order different from that specified by the program, with the purpose of increasing the overall execution speed of the CPU.
instruction repertoire: See instruction set.

Application-Specific Instruction Set Processors for Video Processing

Published in Ling Guan, Yifeng He, Sun-Yuan Kung, Multimedia Image and Video Processing, 2012

Sung Dae Kim, Myung Hoon Sunwoo

The PCU consists of prefetch logic, a program counter, an instruction register, an FSM, a stack, and an interrupt controller. The DPU consists of two multiply-and-accumulate (MAC) units for two 16-bit-by-16-bit multiplications and accumulations, two arithmetic logic units (ALUs), a barrel shifter, and a register file. The AU has two address generation units (AGUs) for load and store. Each internal word length is 32 bits. The instruction pipeline consists of six stages: prefetch, fetch, decode, execute1, execute2, and execute3. VSIP has 35 arithmetic instructions, 11 logical and shift instructions, 6 program control instructions, 4 move instructions, and 16 special instructions, including instructions for H.264/AVC, which are described next.
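The overlap of the six stages can be sketched as a cycle-by-cycle occupancy table. This is only an illustration, assuming an ideal pipeline that accepts one instruction per cycle and never stalls. The stage names come from the text, while the function `schedule` and the instruction labels `I0`, `I1`, ... are hypothetical and not part of VSIP.

```python
# Stage names are taken from the text; everything else is illustrative.
STAGES = ["prefetch", "fetch", "decode", "execute1", "execute2", "execute3"]


def schedule(num_instructions, stages=STAGES):
    """Return {cycle: {stage: instruction_index}} for an ideal pipeline.

    Assumption: one instruction enters the pipeline per cycle and there are
    no stalls, so instruction i occupies stage s during cycle i + s
    (all indices are zero-based).
    """
    table = {}
    for i in range(num_instructions):
        for s, stage in enumerate(stages):
            table.setdefault(i + s, {})[stage] = i
    return table


if __name__ == "__main__":
    table = schedule(4)
    for cycle in sorted(table):
        row = ", ".join(f"{stage}=I{instr}" for stage, instr in table[cycle].items())
        print(f"cycle {cycle}: {row}")
```

Under these assumptions, four instructions occupy 4 + 6 - 1 = 9 cycles in total, and from cycle 3 onward several stages are busy with different instructions at once.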

Pipeline Architecture

Published in Pranabananda Chakraborty, Computer Organisation and Architecture, 2020

Pranabananda Chakraborty

An instruction pipeline operates on a stream of instructions by decomposing the instruction cycle into its fetch, decode, and execute phases and overlapping those phases across successive instructions. It was first used in some form in the IBM 7030 and later in a few other computers of the 1960s. It re-emerged in the 1980s and was used extensively in RISC machines, as one of the major contributors to RISC’s high performance. It has also been widely used in many high-end mainframe CISC machines and in contemporary microprocessor-based small systems such as the Intel 80X86 series and the Motorola 68000 series.
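The benefit of overlapping the phases can be estimated with a simple cycle count. The sketch below assumes that every phase takes exactly one cycle and that no hazards or stalls occur; real pipelines fall short of this ideal. The function names are hypothetical and chosen for illustration only.

```python
def unpipelined_cycles(n, k):
    """Cycles for n instructions when each runs all k phases before the next starts."""
    return n * k


def pipelined_cycles(n, k):
    """Cycles for n instructions in an ideal k-phase pipeline.

    Assumption: the first instruction needs k cycles to pass through all
    phases, and each later instruction completes one cycle after its
    predecessor.
    """
    return k + (n - 1)


def speedup(n, k):
    """Ideal speedup of the pipelined over the unpipelined execution."""
    return unpipelined_cycles(n, k) / pipelined_cycles(n, k)


if __name__ == "__main__":
    n, k = 100, 3  # 100 instructions; fetch, decode, execute
    print(unpipelined_cycles(n, k), pipelined_cycles(n, k), round(speedup(n, k), 2))
```

For 100 instructions and three phases, the ideal pipeline needs 102 cycles instead of 300. As the number of instructions grows, the ideal speedup approaches the number of phases.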

Arbogast: Higher order automatic differentiation for special functions with Modular C

Published in Optimization Methods and Software, 2018

Isabelle Charpentier, Jens Gustedt

Much care has been taken to implement these primitives such that they perform very efficiently on modern architectures. In fact, modern CPUs are complex devices that provide a lot of low-level parallelism. For instance, most architectures nowadays allow indirect addressing of assembler operands. Thus one assembler addition can perform an address operation, a load or a store, and the addition itself. That does not mean that all of this is done in one CPU cycle, but that each CPU cycle can start an instruction pipeline for such a complex operation. Even more parallelism can be achieved with vector units, which are found in most commodity hardware. In particular, modern Intel and AMD CPUs provide SIMD units (called SSE or AVX) that are able to work on 4 float or 2 double values simultaneously in one instruction. These vector units also have special fast fused multiply-add (fma) instructions that, in the case of float, can perform 4 such operations, that is, 8 FLOPs, in a single CPU instruction.
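The lane and FLOP counts quoted above follow from simple arithmetic. The sketch below assumes a 128-bit vector register, which is the width that matches the 4 float or 2 double figure in the text, and counts each fused multiply-add as two floating-point operations per lane. The function names are hypothetical.

```python
def lanes(register_bits, element_bits):
    """Number of elements that fit in one vector register."""
    return register_bits // element_bits


def fma_flops(register_bits, element_bits):
    """FLOPs performed by one vector fma instruction.

    Assumption: each lane does one multiply and one add, that is 2 FLOPs.
    """
    return 2 * lanes(register_bits, element_bits)


if __name__ == "__main__":
    REGISTER_BITS = 128  # assumed width, consistent with 4 float / 2 double
    print(lanes(REGISTER_BITS, 32))      # float lanes
    print(lanes(REGISTER_BITS, 64))      # double lanes
    print(fma_flops(REGISTER_BITS, 32))  # FLOPs per float fma instruction
```

With these assumptions the sketch gives 4 float lanes, 2 double lanes, and 8 FLOPs for one float fma instruction, in line with the numbers in the text.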