AVX-512

PMD

Published in Heqing Zhu, Data Plane Development Kit (DPDK), 2020

As its name suggests, SIMD (single instruction, multiple data) applies one instruction to multiple data elements. Intel® SSE instructions operate on 128-bit registers, whereas AVX (Advanced Vector Extensions) provides 256-bit registers and AVX-512 provides 512-bit registers. In terms of register width alone, SSE registers can make full use of the data bandwidth of SNB (Sandy Bridge), while HSW (Haswell) is designed around AVX registers, which make the optimal use of its bandwidth. The same reasoning applies to the AVX-512 registers on today's Skylake and Cascade Lake processors.
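The relationship between register width and the number of elements handled per instruction can be illustrated with a short sketch. It is only arithmetic on the register widths quoted above; the names `REGISTER_BITS` and `lanes` are hypothetical and chosen for illustration, and the element sizes assume IEEE 754 single precision (32 bits) and double precision (64 bits).

```python
# Register widths in bits for the instruction-set extensions named in the text.
REGISTER_BITS = {"SSE": 128, "AVX": 256, "AVX-512": 512}


def lanes(register_bits: int, element_bits: int) -> int:
    """Return how many elements of the given size fit in one vector register."""
    if register_bits <= 0 or element_bits <= 0:
        raise ValueError("bit widths must be positive")
    return register_bits // element_bits


if __name__ == "__main__":
    for isa, bits in REGISTER_BITS.items():
        # 32-bit floats and 64-bit doubles are the usual element types.
        print(f"{isa}: {lanes(bits, 32)} floats or {lanes(bits, 64)} doubles per register")
```

Doubling the register width doubles the number of elements one instruction can process, which is the upper bound on the speedup from width alone; actual gains depend on memory bandwidth and the workload.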

Design of a modern fast Fourier transform and cache effective bit-reversal algorithm

Published in International Journal of Parallel, Emergent and Distributed Systems, 2023

Adam Simek, Ivan Šimeček

Benchmarks were designed following the FFTW benchmark methodology [16], and all FFT implementations compared were run under very similar conditions: the planning stage was not counted, only the running times of the FFT stage were compared, and the measured times were converted to MFLOPS. Tests were performed on a machine with an Intel Xeon Gold 6130 processor, which provides both AVX2 and AVX-512. All libraries tested had at least AVX2 enabled, and everything was compiled with GCC with -O3 optimizations and vectorization enabled. All tests were measured in double precision, which means AVX uses a vector size of 4 and AVX-512 a vector size of 8. OpenMP tests were run on a dual-socket Intel Xeon Scalable Gold 6146 system providing 24 cores, with AVX2 code and -O3 optimizations.
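The excerpt does not spell out how times were converted to MFLOPS. As a minimal sketch, the following assumes the convention used by the FFTW benchmark suite, where a complex transform of size N is credited with a nominal 5 N log2(N) floating-point operations (2.5 N log2(N) for real input) regardless of the operations an implementation actually performs. The function name `fft_mflops` and its parameters are hypothetical.

```python
import math


def fft_mflops(n: int, seconds: float, real_input: bool = False) -> float:
    """Nominal MFLOPS for one FFT of size n that took `seconds` to run.

    Assumes the FFTW benchmark convention: 5 * n * log2(n) nominal
    operations for complex input, 2.5 * n * log2(n) for real input.
    Planning time is expected to be excluded from `seconds`.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    nominal_flops = (2.5 if real_input else 5.0) * n * math.log2(n)
    # Dividing by the time in microseconds yields millions of operations per second.
    return nominal_flops / (seconds * 1e6)


if __name__ == "__main__":
    # Hypothetical timing, for illustration only: a 1024-point complex FFT in 10 microseconds.
    print(fft_mflops(1024, 10e-6))
```

Because the operation count is nominal, this figure is a rescaled running time that allows comparison across transform sizes, not a measurement of the arithmetic actually executed.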
