Big Data Stream Processing
Published in Vivek Kale, Parallel Computing Architectures and APIs, 2019
Spark extends its predecessors (such as MapReduce) with in-memory processing. An RDD lets developers materialize the dataset at any point in a processing pipeline in memory across the cluster, so that later steps operating on the same dataset need not recompute it or reload it from disk. This capability opens up use cases that earlier distributed processing engines could not approach. Spark is well suited to highly iterative algorithms that require multiple passes over a dataset, as well as to reactive applications that respond quickly to user queries by scanning large in-memory datasets.
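A minimal Scala sketch may make the caching behavior concrete. The details here (local master, the file path data/values.txt, the statistics computed) are illustrative assumptions, not from the chapter; the point is that cache() pins the parsed RDD in memory, so the second and third actions skip the disk read and re-parsing:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CacheSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("cache-sketch").setMaster("local[*]"))

    // Hypothetical input file: one numeric value per line.
    val values = sc.textFile("data/values.txt")
      .map(_.toDouble)
      .cache() // materialize this point of the pipeline in memory

    // Multiple passes over the same dataset: the first action computes
    // and caches the RDD; the later actions read it back from memory.
    val n        = values.count()
    val mean     = values.sum() / n
    val variance = values.map(v => math.pow(v - mean, 2)).sum() / n

    println(s"n=$n mean=$mean variance=$variance")
    sc.stop()
  }
}
```

When the dataset may exceed cluster memory, persist(StorageLevel.MEMORY_AND_DISK) trades the same programming model for graceful spill-to-disk behavior.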
Experiences with big data: Accounts from a data scientist’s perspective
Published in Quality Engineering, 2020
Murat Kulahci, Flavia Dalia Frumosu, Abdul Rauf Khan, Georg Ørnskov Rønsch, Max Peter Spooner
With the increased accumulation of production data, one of the biggest challenges has become allocating enough computational resources to process it. Although new technologies, such as parallel computing and quantum computing, have revolutionized the field, memory capacity is still limited. Most well-known data analytics methods work on the principle of in-memory processing. Computing frameworks such as Hadoop and Spark (Zaharia et al. 2010) enable distributed computation over large data streams (in Spark’s case, largely in memory) and provide solutions to the problems posed by continuous streams of data (Agneeswaran 2014). In terms of data storage, there is currently a transition towards NoSQL (“non-SQL” or “non-relational”) databases (Leavitt 2010) as opposed to traditional structured relational databases. One of the key advantages of NoSQL databases is that they can handle large volumes of unstructured data efficiently.
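To illustrate the last point, the following Scala sketch contrasts the two storage models. The field names and records are hypothetical production data, and the Document type is a stand-in for the schema-less documents a NoSQL store would hold; the point is that heterogeneous records coexist without a schema migration:

```scala
object SchemaSketch {
  // Relational-style record: a fixed schema; every row has the same columns.
  case class SensorRow(machineId: Int, timestamp: Long, temperature: Double)

  // NoSQL-style document: a schema-less key-value structure, so records
  // from different sensors may carry entirely different fields.
  type Document = Map[String, Any]

  def main(args: Array[String]): Unit = {
    val docs: Seq[Document] = Seq(
      Map("machineId" -> 1, "timestamp" -> 1577836800L, "temperature" -> 71.3),
      Map("machineId" -> 2, "timestamp" -> 1577836801L,
          "vibration" -> Seq(0.12, 0.09, 0.15), "operator" -> "shift-A")
    )

    // Queries tolerate missing fields instead of failing on a schema mismatch.
    val hot = docs.filter(d =>
      d.get("temperature").exists(_.asInstanceOf[Double] > 70.0))
    println(hot)
  }
}
```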