Dynamic Random Access Memory (DRAM)
Published in Shimeng Yu, Semiconductor Memory Devices and Circuits, 2022
What makes HBM attractive is not only its higher integration density (multiple dies stacked over the same 2D footprint), but also the wide I/O interface it offers. As mentioned earlier, DDR/LPDDR typically has a 64-bit-wide I/O interface and GDDR a 32-bit-wide I/O interface, whereas HBM offers a 1024-bit-wide I/O interface. Even when running at a slower I/O clock frequency, which means a lower interface speed (Gbps) per pin, HBM can therefore deliver significantly higher bandwidth (GB/s) at the system level. Table 3.1 summarizes the evolution of the HBM interface protocol standards and compares them with their LPDDR and GDDR counterparts. As of 2020, HBM has gone through three generations (HBM, HBM2, and HBM2E). The capacity per DRAM die has increased from 2 Gb to 16 Gb, and the number of DRAM dies in the stack has increased from 4 to 8; thus, the total capacity has increased from 1 GB to 16 GB. The system bandwidth has increased from 128 GB/s to 410 GB/s. By comparison, LPDDR5 and GDDR6 offer system bandwidths of 37.5 GB/s and 56 GB/s, respectively.
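To make the arithmetic behind these figures concrete, the minimal sketch below computes peak bandwidth as (I/O width × per-pin data rate) / 8. The per-pin rates used here (1.0 Gbps for first-generation HBM, 3.2 Gbps for HBM2E, 14 Gbps for GDDR6) are illustrative assumptions chosen to be consistent with the bandwidth numbers quoted above, not values taken from Table 3.1.

```python
def peak_bandwidth_gbs(io_width_bits: int, gbps_per_pin: float) -> float:
    """Peak bandwidth in GB/s = (I/O width in bits x per-pin rate in Gbps) / 8."""
    return io_width_bits * gbps_per_pin / 8

# Illustrative per-pin rates (assumptions, not figures from Table 3.1).
print(peak_bandwidth_gbs(1024, 1.0))   # first-gen HBM:   128.0 GB/s
print(peak_bandwidth_gbs(1024, 3.2))   # HBM2E:           409.6 GB/s (~410 GB/s)
print(peak_bandwidth_gbs(32, 14.0))    # GDDR6 (per die):  56.0 GB/s
```

The same arithmetic shows why a wide, slow interface can outrun a narrow, fast one: the 1024-bit HBM bus at 1 Gbps/pin already exceeds a 32-bit GDDR6 bus at 14 Gbps/pin.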
Three-Dimensional Integration: Technology and Design
Published in Katsuyuki Sakuma, Krzysztof Iniewski, 3D Integration in VLSI Circuits, 2018
3D memories do exactly this and are positioned to be the next large-volume application of 3DIC technologies. To date, dynamic random-access memory (DRAM) has relied on one-signal-per-pin signaling with low-cost, low-pin-count, single-chip plastic packaging. As a result, DRAM has continued to lag behind logic in terms of bandwidth potential and power efficiency. Furthermore, the I/O speed of one-signal-per-pin signaling schemes is unlikely to scale much beyond what can be achieved today with double data rate (DDR4) memory (up to 3.2 Gbps per pin) and graphics double data rate (GDDR6) memory (8 Gbps per pin). Beyond these data rates, two-pins-per-signal differential signaling is needed. In addition, the I/O power consumption, measured in pJ/bit, is relatively high, even for the low-power DDR (LPDDR) standards intended for mobile applications.
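To illustrate why the pJ/bit metric matters at high bandwidth, the sketch below converts energy per bit into interface power. The energy values (roughly 20 pJ/bit for a conventional off-package interface versus a few pJ/bit for an in-package wide-I/O interface) and the 25.6 GB/s transfer rate are illustrative assumptions for the calculation, not figures from this chapter.

```python
def io_power_watts(energy_pj_per_bit: float, bandwidth_gb_s: float) -> float:
    """I/O power in W = energy per bit (pJ) x bit rate (bits/s) x 1e-12."""
    bits_per_second = bandwidth_gb_s * 8e9   # GB/s -> bits/s
    return energy_pj_per_bit * 1e-12 * bits_per_second

# Illustrative energies (assumptions): sustaining 25.6 GB/s across the interface.
print(io_power_watts(20.0, 25.6))  # ~4.1 W at 20 pJ/bit (off-package, DDR-style)
print(io_power_watts(3.0, 25.6))   # ~0.6 W at  3 pJ/bit (in-package, wide I/O)
```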
Study and evaluation of automatic GPU offloading method from various language applications
Published in International Journal of Parallel, Emergent and Distributed Systems, 2022
For GPUs, we use an NVIDIA GeForce RTX 2080 Ti (4352 CUDA cores, 11 GB GDDR6 memory) and an NVIDIA Quadro K5200 (2304 CUDA cores, 8 GB GDDR5 memory). We use CUDA Toolkit 10.1 for GPU control. We use the PGI compiler 19.10 for C, PyCUDA 2019.1.2 and CuPy 7.8 for Python, and JCuda 10.1 for Java. Figure 3 shows the evaluation environment and specifications. The application code is specified by the user from the client notebook PC, tuned on the verification machine, and then deployed to the running environment for actual use.
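As a minimal illustration of the kind of GPU offloading these tools enable, the sketch below runs the same matrix multiplication on the CPU with NumPy and on the GPU with CuPy. It assumes a working CUDA installation with a CuPy version comparable to the 7.8 release used here, and is only an example of the offloading style, not the paper's automatic conversion method.

```python
import numpy as np
import cupy as cp

# Host-side (CPU) computation with NumPy.
a = np.random.rand(2048, 2048).astype(np.float32)
b = np.random.rand(2048, 2048).astype(np.float32)
c_cpu = a @ b

# Offloaded (GPU) computation with CuPy: copy inputs to the device,
# compute on the GPU, then copy the result back to the host.
a_gpu = cp.asarray(a)
b_gpu = cp.asarray(b)
c_gpu = cp.asnumpy(a_gpu @ b_gpu)

# Results should agree within float32 accumulation tolerance.
print(np.allclose(c_cpu, c_gpu, rtol=1e-3, atol=1e-3))
```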