Explore chapters and articles related to this topic
On-Chip Regulators for Low-Voltage and Portable Systems-on-Chip
Published in Fei Yuan, Krzysztof Iniewski, Low-Power Circuits for Emerging Applications in Communications, Computing, and Sensing, 2018
Efficient voltage regulation and conversion are essential in modern integrated circuit (IC) design because of the demands of power management and heterogeneous computing [1]. Specifically, fully monolithic on-chip voltage regulation has emerged as a critical enabler for a variety of low-power design methodologies, such as voltage islands (ranging from ultralow voltages of 0.4–0.5 V to higher voltages of 1.2–1.4 V), dynamic voltage (and frequency) scaling, low-voltage clocking, and near-threshold computing [2–4]. Furthermore, on-chip voltage regulators play a critical role in ensuring sufficient power integrity, since it is highly challenging to keep power supply noise within a tolerable range when the supply voltage is low and the load current is high [5–7]. Power supply noise not only affects the timing characteristics of synchronous digital circuits but also degrades overall signal integrity in analog and mixed-signal circuits [8]. For example, a fully integrated voltage regulator (FIVR) was developed for the Intel Haswell microarchitecture, allowing multiple power domains to be managed dynamically [9].
PMD
Published in Heqing Zhu, Data Plane Development Kit (DPDK), 2020
Batch processing is a big step toward performance optimization. Is there more room for improvement? From the data access perspective, Table 7.1 describes the speed and cost of data access in different Intel® processors. Nehalem, SNB (Sandy Bridge), and HSW (Haswell) are the code names of successive Intel® Xeon processor generations. Data access throughput more than doubled from Nehalem to HSW. Packet processing is a heavy user of memory access, so the effective use of cache/memory bandwidth and throughput is the most important area to study.
High-Performance Computing for Nuclear Reactor Design and Safety Applications
Published in Nuclear Technology, 2020
Afaque Shams, Dante De Santis, Adam Padee, Piotr Wasiuk, Tobiasz Jarosiewicz, Tomasz Kwiatkowski, Sławomir Potempski
Apart from longer registers, AVX2 in Haswell also introduces several new instructions, of which FMA3 in particular may be beneficial in CFD. Nevertheless, the problems mentioned above make it very hard to extract the theoretical performance boost from the CPU when moving from Ivy Bridge to Haswell in most applications. Even in simple matrix multiplication (e.g., Linpack [18]) the gain rarely exceeds 80% to 90%, and in more complex workloads it is usually much smaller. For example, in 3-D image rendering (measured with the Cinebench R15 benchmark), machines equipped with E5-2680v2 CPUs scored 2635 points and machines with E5-2680v3 CPUs scored 3260 points, only 24% more. In the SPEC FP2006 Rate synthetic benchmark, machines with dual E5-2680v2 CPUs usually score around 630 points, while those with E5-2680v3 CPUs score around 770 points, which is about 22% more. In CFD the gain should be bigger, probably closer to that of simple linear algebra problems, but the exact performance gain should be measured.
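The percentage gains quoted above follow directly from the benchmark scores. As a minimal sketch of that arithmetic (the function name `relative_gain_percent` is a hypothetical helper introduced here for illustration, not something from the article), the gains can be recomputed as follows:

```python
def relative_gain_percent(old_score: float, new_score: float) -> float:
    """Return how much higher new_score is than old_score, in percent."""
    return (new_score - old_score) / old_score * 100.0


# Scores quoted in the text: E5-2680v2 (Ivy Bridge) vs. E5-2680v3 (Haswell).
cinebench_gain = relative_gain_percent(2635, 3260)
specfp_gain = relative_gain_percent(630, 770)

print(f"Cinebench R15:    {cinebench_gain:.1f}%")  # prints 23.7%
print(f"SPEC FP2006 Rate: {specfp_gain:.1f}%")     # prints 22.2%
```

Both results round to the figures given in the text (24% and 22%), well below the 80% to 90% ceiling mentioned for simple matrix multiplication.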