Memory Organisation
Published in Pranabananda Chakraborty, Computer Organisation and Architecture, 2020
Technology-wise, the three key characteristics of memory, namely capacity, speed (access time), and cost, form a trade-off. From an application point of view, large-capacity memory with a low cost per bit is an essential requirement. But to meet performance requirements, expensive, relatively lower-capacity memories with fast access times are also essential. Thus smaller, faster, and more expensive memories are supplemented by larger, slower, and cheaper memories. To accommodate these diverse requirements at design time, a compromise is made: no single memory component or technology is used; rather, a memory hierarchy is employed for cost-effective performance. The ultimate target of creating a memory hierarchy, with different types of memories at different levels, is to allow the CPU to work nearly at its optimal speed during the execution of programs while, of course, obeying the trade-off parameters already mentioned.
Introduction to Algorithms and Data Structures
Published in Sriraman Sridharan, R. Balakrishnan, Foundations of Discrete Mathematics with Algorithms and Programming, 2019
Sriraman Sridharan, R. Balakrishnan
Hierarchy of memories: In fact, there are four levels of memory hierarchy:
1. Registers
2. Cache
3. Random-access main memory (RAM)
4. Secondary or disk memory
Each level has a larger storage capacity than the preceding level, but its access time is also greater. As we have already seen, registers are the processor's own storage elements; access to a register is extremely fast, considerably faster than access to main memory.
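The cost of moving down this hierarchy on a miss can be quantified with the standard average memory access time (AMAT) formula, AMAT = hit time + miss rate × miss penalty. A minimal Python sketch, using hypothetical latency and hit-rate figures chosen purely for illustration (they are not taken from the text):

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time: the cost of a hit plus the
    expected cost of falling through to the next level on a miss."""
    return hit_time + miss_rate * miss_penalty

# Hypothetical figures: a cache hit costs 2 ns, 5% of accesses miss
# and pay a 100 ns main-memory penalty.
cache_amat = amat(hit_time=2.0, miss_rate=0.05, miss_penalty=100.0)
print(cache_amat)  # 7.0 — on average far closer to cache speed than to RAM speed
```

This is why a small, fast level in front of a large, slow one pays off: with a high hit rate, the effective access time stays near the fast level's latency.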
Performance Metrics and Software Architecture
Published in David R. Martinez, Robert A. Bond, Vai M. Michael, High Performance Embedded Computing Handbook, 2018
Jeremy Kepner, Theresa Meuse, Glenn E. Schrader
The canonical parallel computer architecture consists of a number of nodes connected by a network. Each node consists of a processor and memory. The memory on the node may be further decomposed into various levels of cache, main memory, and disk storage (see Figure 15-9). There are many variations on this theme. A node may have multiple processors sharing various levels of cache, memory, and disk storage. In addition, there may be external global memory or disk storage that is visible to all nodes. An important implication of this canonical architecture is the “memory hierarchy” (Figure 15-9). The memory hierarchy concept refers to the fact that the memory “closer” to the processor is much faster (the bandwidth is higher and the latency is lower). However, the price for this performance is capacity. The canonical ranking of the hierarchy is typically as follows:
1. Processor registers
2. Cache L1, L2, …
3. Local main memory
4. Remote main memory
5. Local disk
6. Remote disk
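The canonical ranking can be captured as a small table. The latency and capacity figures below are rough, hypothetical orders of magnitude chosen only to illustrate the defining property of the hierarchy; they are not values from the handbook:

```python
# Each tuple: (level, approx. latency in ns, approx. capacity in bytes).
# All figures are hypothetical orders of magnitude, for illustration only.
hierarchy = [
    ("processor registers", 0.3,    1e2),
    ("L1 cache",            1.0,    1e4),
    ("L2 cache",            4.0,    1e6),
    ("local main memory",   1e2,    1e10),
    ("remote main memory",  1e3,    1e11),
    ("local disk",          1e5,    1e12),
    ("remote disk",         1e6,    1e13),
]

# The defining property: moving away from the processor, latency and
# capacity both increase monotonically -- speed is traded for capacity.
latencies  = [lat for _, lat, _ in hierarchy]
capacities = [cap for _, _, cap in hierarchy]
assert latencies  == sorted(latencies)
assert capacities == sorted(capacities)
```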
Exploration for Software Mitigation to Spectre Attacks of Poisoning Indirect Branches
Published in IETE Technical Review, 2018
Baozi Chen, Qingbo Wu, Yusong Tan, Liu Yang, Peng Zou
Modern processors use caches to bridge the speed gap in the memory hierarchy. At the same time, caching introduces uncertainty into the system: the time a memory access takes varies depending on whether the data are in the cache or not. Cache timing attacks are a specific type of side-channel attack that exploits the effect of the cache on the execution time of algorithms. The attacker can determine which addresses are allocated into the cache by measuring the time taken to access entries, and thereby leak information. Several techniques for exploiting the cache have already been demonstrated. In Prime+Probe [15–17], the attacker fills one or more cache lines with its own contents, waits for the victim to execute, and then probes by timing accesses to the preloaded cache lines. If the attacker observes markedly increased memory access latency, the cache lines have been evicted by the victim, who has touched an address that maps to the same set. Flush+Reload [18] works in the opposite direction: the attacker first flushes the targeted cache lines, waits for the victim to execute, and then reloads the flushed cache lines by touching them, measuring the time taken. If the attacker observes a fast memory access, the cache lines have been reloaded by the victim. Evict+Time [19] compares the overall execution time of the victim, after evicting some cache lines of interest, against a baseline; the variation in overall execution time is then used to deduce whether the lines of interest have been accessed by the victim.
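The Prime+Probe logic can be illustrated with a toy model. The sketch below is not a real attack (a real one needs actual cache eviction and cycle-accurate timers such as x86's RDTSC); it simulates a tiny direct-mapped cache with hypothetical hit/miss costs to show how timing the probe step reveals which cache set the victim touched:

```python
class ToyCache:
    """Toy direct-mapped cache: SETS sets, one line per set.
    Timing model (hypothetical units): hit = 1, miss = 100."""
    SETS, HIT, MISS = 8, 1, 100

    def __init__(self):
        self.lines = [None] * self.SETS  # tag currently held in each set

    def access(self, addr):
        s, tag = addr % self.SETS, addr // self.SETS
        if self.lines[s] == tag:
            return self.HIT
        self.lines[s] = tag          # miss: evict whatever was there
        return self.MISS

cache = ToyCache()

# Prime: the attacker fills every set with its own addresses 0..7.
for addr in range(ToyCache.SETS):
    cache.access(addr)

# Victim runs and touches one secret-dependent address (here it maps to set 3).
secret_addr = 8 + 3                  # same set 3, different tag than the attacker's
cache.access(secret_addr)

# Probe: the attacker re-times its own addresses; only the evicted set is slow.
timings = [cache.access(addr) for addr in range(ToyCache.SETS)]
leaked_set = timings.index(ToyCache.MISS)
print(leaked_set)  # 3 -- the attacker recovers which set the victim used
```

The same skeleton inverted (flush instead of prime, a fast access meaning the victim touched the line) gives the Flush+Reload pattern described above.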
Synthesis of programmable biological central processing system
Published in Journal of the Chinese Institute of Engineers, 2021
Wei-Xian Li, Jiangfeng Cheng, Chun-Liang Lin, Chia-Feng Juang
In a typical CPU, registers are circuits composed of flip-flops, with characteristics similar to memory. Typical registers temporarily store address data, instruction data, and calculated data, accelerating execution of the CPU by avoiding fetches from external memory. Registers are the top level of the memory hierarchy and can rapidly supply data to the system. We constructed several types of biological registers to expedite data processing for the Bio-CPU. Registers in electronic circuits are commonly realized with D flip-flops (Lin, Kuo, and Chen 2015). In the proposed Bio-CPU, three types of biological D flip-flops are considered, namely the biological D flip-flop, the clock-enable (CE) D flip-flop, and the master-slave D flip-flop.
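The intended behavior of two of these variants can be sketched as a behavioral simulation. This is only a logic-level model of the standard D flip-flop and its clock-enable variant, written in Python; it says nothing about the biological implementation itself:

```python
class DFlipFlop:
    """Positive-edge-triggered D flip-flop: Q takes the value of D
    only on a rising clock edge (clk going 0 -> 1)."""
    def __init__(self):
        self.q = 0
        self._prev_clk = 0

    def tick(self, d, clk):
        if clk == 1 and self._prev_clk == 0:   # rising edge detected
            self.q = d
        self._prev_clk = clk
        return self.q

class CEDFlipFlop(DFlipFlop):
    """Clock-enable (CE) variant: a rising edge captures D only
    when the enable input is high."""
    def tick(self, d, clk, ce):
        if ce and clk == 1 and self._prev_clk == 0:
            self.q = d
        self._prev_clk = clk
        return self.q

ff = DFlipFlop()
ff.tick(d=1, clk=0)          # no edge: Q stays 0
print(ff.tick(d=1, clk=1))   # rising edge: Q becomes 1
print(ff.tick(d=0, clk=1))   # clk held high, no edge: Q holds 1

ceff = CEDFlipFlop()
ceff.tick(d=1, clk=1, ce=0)        # rising edge but enable low: Q stays 0
ceff.tick(d=1, clk=0, ce=1)        # clock returns low
print(ceff.tick(d=1, clk=1, ce=1)) # enabled rising edge: Q becomes 1
```

A master-slave variant chains two such stages on opposite clock phases; it is omitted here to keep the sketch short.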