Memory Organisation
Published in Pranabananda Chakraborty, Computer Organisation and Architecture, 2020
Magnetic tape was historically the first and most popular kind of secondary memory in regular use for data processing. Magnetic-tape memory, the tape itself, and the tape drive units are very similar to domestic audio/video tape recorders but store binary digital information. The storage medium is a flexible mylar (plastic) tape coated with magnetic oxide. Information is generally stored in 9 parallel longitudinal tracks, and a read–write head is used that can access all 9 tracks simultaneously. Data are stored on tape one basic character (9 bits) at a time across the width of the head: 8 bits form one byte of data, and the remaining bit is used as a parity-check bit. As with the disk, data are read from and written to the tape in contiguous blocks called physical records. Adjacent blocks are separated by relatively large gaps, called inter-block gaps; similarly, adjacent records within a block are separated by gaps called inter-record gaps (IRGs). A tape drive is essentially a sequential-access device. This means that if the tape head is currently positioned at record 1, then to read or write physical record k, all the physical records 1 through k − 1 must be passed through one at a time. If the head is presently positioned beyond the target record, the tape must first be rewound to the beginning of the desired record before operating forward again. Magnetic tapes are stored on reels of about 2,400 ft in length, and even more so today, as tape provides a compact, inexpensive, and portable medium for storing large files of information. Tapes are also packaged in cartridges or cassettes, which are analogous to audio-tape cassettes. Tape is still in wide use: although it is the slowest medium, it is also the cheapest, and it is generally used for backing up large files, apart from its frequent use for storing regularly operated active files.
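The sequential-access behaviour described above can be sketched in a small model (illustrative only; the `TapeDrive` class and its counting of passed records are assumptions, not from the chapter): reaching record k from an earlier position means passing every intervening record, and reaching it from a later position means rewinding first.

```python
# Illustrative sketch of a sequential-access tape: reading record k from
# position p requires passing records p..k-1; if p > k the tape must
# first be rewound to the beginning.

class TapeDrive:
    def __init__(self, records):
        self.records = records   # list of physical records (blocks)
        self.position = 0        # index of the record under the head

    def read(self, k):
        """Read physical record k (0-based); also return records passed."""
        passed = 0
        if self.position > k:        # head is beyond the target:
            passed += self.position  # rewind to the beginning first
            self.position = 0
        while self.position < k:     # then move forward one record at a time
            passed += 1
            self.position += 1
        data = self.records[self.position]
        self.position += 1           # head advances past the record just read
        return data, passed

tape = TapeDrive([f"rec{i}" for i in range(10)])
print(tape.read(5))   # forward pass over records 0..4 -> ('rec5', 5)
print(tape.read(2))   # head at 6: rewind (6 records) then forward 2 -> ('rec2', 8)
```

The cost asymmetry in the two calls is exactly why tape suits bulk sequential workloads such as backups, and not random access.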
Storage and access optimization scheme based on correlation probabilities in the internet of vehicles
Published in Journal of Intelligent Transportation Systems, 2020
Zhou Bin, Yuhao Yao, Xiao Liu, Rongbo Zhu, Arun Kumar Sangaiah, Maode Ma
In the traditional method, only the other files in the same data block are pre-fetched, and the correlation between files is not considered. This approach is valid only for sequential access, has a very low prefetch hit rate, and cannot handle random-access requests. Correlations exist between files, especially in distributed file systems, since file data have temporal and regional locality; there are also many correlations between file contents, which lead to a good deal of shared access when the files are accessed. A cache layer between the client and the HDFS system is set up to store these pre-fetched files. The correlation between files, computed from their feature vectors, is used to improve the prefetch hit ratio. Based on the correlation of the files and their local accessibility, the files are merged, and the FCP table is then generated.
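A minimal sketch of this idea follows, under stated assumptions: the paper does not give its correlation measure or threshold, so cosine similarity over feature vectors and the cutoff `FCP_THRESHOLD` are illustrative stand-ins, as are the function names. On a cache miss, the requested file is fetched and the files its FCP-table entry lists are pre-fetched into the cache layer.

```python
# Illustrative sketch (not the paper's implementation): prefetch files
# whose feature-vector correlation with the requested file exceeds a
# threshold, keeping them in a cache layer in front of storage.

import math

FCP_THRESHOLD = 0.8   # assumed cutoff for "correlated" files

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def build_fcp_table(features):
    """Map each file to the files correlated with it above the threshold."""
    return {f: [g for g, vg in features.items()
                if g != f and cosine(vf, vg) >= FCP_THRESHOLD]
            for f, vf in features.items()}

def access(name, fcp_table, cache, storage):
    """Serve a request; on a miss, fetch the file and prefetch correlates."""
    if name in cache:
        return cache[name]               # prefetch hit
    cache[name] = storage[name]          # fetch the requested file
    for g in fcp_table.get(name, []):    # prefetch correlated files
        cache.setdefault(g, storage[g])
    return cache[name]

features = {"a": [1, 0, 1], "b": [1, 0, 0.9], "c": [0, 1, 0]}
storage = {"a": "A-data", "b": "B-data", "c": "C-data"}
table = build_fcp_table(features)
cache = {}
access("a", table, cache, storage)   # miss: fetches "a", prefetches "b"
print("b" in cache)                  # True: a later request for "b" hits the cache
```

Unlike block-local prefetching, this serves random-access patterns too: whichever file is requested, its correlates are pulled in regardless of which data block they live in.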
Integrating memory-mapping and N-dimensional hash function for fast and efficient grid-based climate data query
Published in Annals of GIS, 2021
Mengchao Xu, Liang Zhao, Ruixin Yang, Jingchao Yang, Dexuan Sha, Chaowei Yang
At the entity level, the array data structure provides O(1) complexity for single data retrieval; this is the fastest access a computer can achieve when retrieving an element. Traditional approaches involve tiling and chunking big arrays (Baumann et al. 1999; Boncz, Zukowski, and Nes 2005; Cudré-Mauroux et al. 2009). Typically, tiling and sub-setting are necessary for an array database: in Rasdaman, for example, the size of the smallest element ranges from 32 KB to 640 KB, a multiple of the default page file size (4 KB). However, arrays are not well suited to fragmenting, because it increases data access and search complexity. Since disks are designed for sequential reading, tiles that are too small and too numerous are bad for I/O performance. In 2002, Reiner developed a tree for MDD tile-node searching (Reiner et al. 2002); however, since the storage rectangles are duplicated, the tree increases the data volume and makes construction and maintenance more expensive. In terms of computational complexity, retrieval then costs more than constant time. Ideally, a multidimensional array provides O(1) complexity when retrieving data from it; however, because of tiling and sub-setting, the search complexity increases considerably when the tiling unit is small and the whole dataset is large. Meanwhile, when a multidimensional array is stored on a secondary storage device, the data files are often not stored sequentially on the device, and the cost of accessing different byte addresses differs. Although the computational complexity is the same for each cell in an array, the cost of finding the index and moving the disk head (in an HDD) to retrieve it differs. Even for SSDs, which have no physical head, random access and sequential access differ greatly in speed. Since the abilities to do random read/write and sequential read/write differ, storing data bytes closer to each other gives better read performance when they are retrieved together.
Therefore, instead of chunking and tiling, storing a multidimensional array in a holonomic (whole, contiguous) array form should yield a performance gain for sequential data retrieval.
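The contiguous layout above can be illustrated with a small sketch (an assumption-laden toy, not the paper's system: the grid shape, the `flat_offset` helper, and the file layout are all hypothetical). A cell (t, y, x) in a row-major flat file maps to one offset by arithmetic alone, so there is no tile tree to search, and cells adjacent along the last dimension are adjacent on disk.

```python
# Illustrative sketch: a multidimensional grid stored as one contiguous
# row-major byte sequence. Locating cell (t, y, x) is a single O(1)
# offset computation, and neighbouring cells along the last dimension
# sit next to each other on disk (sequential reads).

import os, struct, tempfile

DIMS = (4, 3, 5)             # e.g. (time, lat, lon) -- illustrative sizes
ITEM = struct.calcsize("d")  # 8 bytes per cell (double-precision float)

def flat_offset(t, y, x, dims=DIMS):
    """Row-major byte offset of cell (t, y, x): pure arithmetic, no search."""
    _, ny, nx = dims
    return ((t * ny + y) * nx + x) * ITEM

# Write the grid sequentially, then read one cell back with a direct seek.
path = os.path.join(tempfile.mkdtemp(), "grid.bin")
with open(path, "wb") as f:
    for t in range(DIMS[0]):
        for y in range(DIMS[1]):
            for x in range(DIMS[2]):
                f.write(struct.pack("d", t * 100 + y * 10 + x))

with open(path, "rb") as f:
    f.seek(flat_offset(2, 1, 3))
    (value,) = struct.unpack("d", f.read(ITEM))
print(value)   # 213.0
```

In practice such a file would be memory-mapped rather than read with explicit seeks, but the offset arithmetic, and the locality argument behind it, is the same.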