PMD
Published in Heqing Zhu, Data Plane Development Kit (DPDK), 2020
As its name suggests, SIMD stands for single instruction, multiple data. The Intel® SSE instructions are based on 128-bit registers, whereas AVX (advanced vector extensions) provides 256-bit registers and AVX512 provides 512-bit registers. From the register-width perspective, SSE registers can make full use of the bandwidth of SNB (Sandy Bridge), while HSW (Haswell) is designed around AVX registers, which is the optimal choice from a data-bandwidth-only perspective. Similar reasoning applies to AVX512 registers on Skylake and Cascade Lake processors today.
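The register widths above translate directly into how many data elements one instruction can touch. The following is a minimal sketch of that arithmetic, not code from the book: the helper names `lanes` and `vector_ops_needed` are hypothetical, and the count ignores memory bandwidth, alignment, and clock-frequency effects.

```python
# Register widths in bits, as stated in the text (SSE, AVX, AVX512).
REGISTER_BITS = {"SSE": 128, "AVX": 256, "AVX512": 512}


def lanes(register_bits: int, element_bits: int) -> int:
    """Return how many elements of the given size fit in one register."""
    if element_bits <= 0 or register_bits % element_bits != 0:
        raise ValueError("element size must evenly divide the register width")
    return register_bits // element_bits


def vector_ops_needed(n_elements: int, register_bits: int, element_bits: int) -> int:
    """Return the number of vector operations needed to cover n elements.

    This is a ceiling division: a final, partially filled register still
    costs one operation. It is an idealized count (assumption), not a
    performance prediction.
    """
    per_op = lanes(register_bits, element_bits)
    return -(-n_elements // per_op)


if __name__ == "__main__":
    for name, bits in REGISTER_BITS.items():
        print(f"{name}: {lanes(bits, 32)} x 32-bit lanes, "
              f"{lanes(bits, 64)} x 64-bit lanes")
```

Under these assumptions, processing 1000 32-bit values takes 250 operations at SSE width, 125 at AVX width, and 63 at AVX512 width, which is the sense in which a wider register can use more of the available data bandwidth.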
Parallel and high-performance systems
Published in Joseph D. Dumas, Computer Architecture, 2016
All the SIMD systems just mentioned were high-performance machines directed at the scientific processing market. Like vector processors, these large-scale array processors did not have a place in the general computing arena, so relatively few copies of each system were built and sold. However, SIMD technology has more recently found its way into personal computers and other less expensive systems, particularly those used for multimedia applications, signal and image processing, and high-speed graphics. The most ubiquitous example of (very coarse-grained) SIMD in today’s computers is in the multimedia extensions that have been added to the original Intel x86 architecture. Intel’s MMX enhancements, introduced in 1997, were executed by a simple array coprocessor integrated onto the chip with the CPU and floating-point unit. These new MMX instructions were capable of performing arithmetic operations on up to eight integer values at once. Advanced Micro Devices (AMD), manufacturer of x86-compatible processors, responded the following year by introducing its 3DNow! instruction set architecture, which was a superset of MMX, including floating-point operations. Further SIMD operations were later added by Intel with streaming SIMD extensions (SSE), SSE2, SSE3, SSSE3, and SSE4. AMD countered with Enhanced 3DNow! and later included full SSE support in processors supporting Professional 3DNow! before announcing in 2010 that it was discontinuing 3DNow! support in favor of Intel compatibility. Newer processors from both Intel and AMD support advanced vector extensions (AVX).
These architectural enhancements, along with others, such as SPARC’s Visual Instruction Set (VIS); the MIPS MDMX, MIPS-3D, and MXU multimedia instructions; the AltiVec extensions to the PowerPC architecture; and the NEON advanced SIMD extensions to the ARM Cortex-A8 processors, all make use of parallel hardware to perform the same operation on multiple data values simultaneously and thus embody the SIMD concept as they enhance performance on the multimedia and graphics applications being run more and more on home and business systems.
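To make "the same operation on multiple data values simultaneously" concrete, the sketch below emulates a packed add of eight 8-bit integers held in one 64-bit word, the lane layout that matches the "up to eight integer values at once" figure quoted for MMX. It is a software illustration only: the function names are hypothetical, wraparound (modulo 256) arithmetic is an assumption, and real hardware performs the lane-wise add in a single instruction rather than with these masks.

```python
LANES = 8                       # eight values per word
LANE_BITS = 8                   # each value is an 8-bit unsigned integer
LANE_MASK = (1 << LANE_BITS) - 1

HIGH_BITS = 0x8080808080808080  # top bit of every 8-bit lane
LOW_BITS = 0x7F7F7F7F7F7F7F7F   # lower seven bits of every lane
WORD_MASK = 0xFFFFFFFFFFFFFFFF  # keep results within 64 bits


def pack(values):
    """Pack eight 8-bit values into one 64-bit integer, lane 0 lowest."""
    if len(values) != LANES:
        raise ValueError("expected exactly eight values")
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit integer back into its eight 8-bit lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def packed_add_u8(a, b):
    """Add two packed words lane by lane, wrapping each lane modulo 256.

    The lower seven bits of each lane are added first, so no carry can
    cross into the neighbouring lane; the top bit of each lane is then
    fixed up with XOR.
    """
    low_sum = (a & LOW_BITS) + (b & LOW_BITS)
    return (low_sum ^ ((a ^ b) & HIGH_BITS)) & WORD_MASK


if __name__ == "__main__":
    x = pack([1, 2, 3, 4, 5, 6, 7, 8])
    y = pack([10, 20, 30, 40, 50, 60, 70, 250])
    print(unpack(packed_add_u8(x, y)))
```

One call to `packed_add_u8` produces all eight sums, with the last lane wrapping from 258 to 2 without disturbing its neighbours, which is the behaviour a packed SIMD add provides in hardware.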
High-Performance Computing for Nuclear Reactor Design and Safety Applications
Published in Nuclear Technology, 2020
Afaque Shams, Dante De Santis, Adam Padee, Piotr Wasiuk, Tobiasz Jarosiewicz, Tomasz Kwiatkowski, Sławomir Potempski
Another interesting topic regarding the scaling of NEK5000 is its efficiency in the utilization of single instruction, multiple data (SIMD) instructions from the advanced vector extensions (AVX) set. These instructions are designed to perform an arithmetic operation on several operands within one clock cycle. There are many limitations, though, such as slowdowns on memory access, lack of some operations (e.g., div/sqrt for full-length AVX), and problems with branch predication and with the power envelope of the CPU. The AVX unit is one of the biggest power consumers in the CPU, and upon activation it may slow down other CPU modules. It is therefore really difficult to predict the efficiency of a code when using shorter or longer operand vectors.