Explore chapters and articles related to this topic
Data Lakes: A Panacea for Big Data Problems, Cyber Safety Issues, and Enterprise Security
Published in Mohiuddin Ahmed, Nour Moustafa, Abu Barkat, Paul Haskell-Dowland, Next-Generation Enterprise Security and Governance, 2022
A. N. M. Bazlur Rashid, Mohiuddin Ahmed, Abu Barkat Ullah
For data processing in a data lake, MapReduce, a ready-to-use parallel data processing framework provided by Apache Hadoop, is usually used. Because it works at the disk level, MapReduce is suitable for Big Data. However, it is less effective with fast data. Apache Spark is the alternative for fast data, at the cost of keeping intermediate results in memory instead of in a file system. Apache Spark is, therefore, most appropriate for real-time data processing. Apache Flink and Apache Storm can also perform real-time data processing similar to Apache Spark. A combination of MapReduce and a real-time processing framework can be better suited to Big Data with stream processing.
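The MapReduce model mentioned in this excerpt can be illustrated with a minimal, single-process sketch in plain Python. It does not use Hadoop or Spark; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the word-count task are assumptions chosen for illustration only. A real framework would distribute these phases across machines and, in Hadoop's case, write intermediate results to disk.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group all emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence over an in-memory dataset."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical input: two lines of text standing in for a large dataset.
counts = run_job(["big data fast data", "fast data streams"])
print(counts)
```

Here the whole dataset must be available before `run_job` returns anything, which is the batch-style behaviour that the excerpt contrasts with real-time processing.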
Internet of Things and Remote Sensing
Published in Lavanya Sharma, Pradeep K Garg, From Visual Surveillance to Internet of Things, 2019
Spatial analytics in combination with the IoT can be more than just features on a map, as it deals with the data relating to the position, size, or shape of objects in 2D or 3D space. Geospatial data offers great potential for better understanding, modeling, monitoring, and visualizing the objects, using IoT as an important tool [9]. The applications of spatial analytics in the IoT are challenging because (i) large amounts of data must be extracted, stored, and processed; (ii) many sources provide heterogeneous data that are to be integrated; (iii) many applications require real-time data processing; and (iv) there is a greater need for creating visualization models.
Assessing the Performance of Human–machine Interaction in eDrilling Operations
Published in Eirik Albrechtsen, Denis Besnard, Oil and Gas, Technology and Humans, 2018
A core feature of such systems, including eDrilling, is the use of real-time data processing, which enables real-time communication between onshore and offshore drilling actors and the shared supervision of drilling processes. Real-time data processing also enables simulations to be carried out. One application of this is the ability to make a diagnosis of the drilling state and conditions (for example, the temperature profile and friction). Simulations also enable the generation of a) early warnings of upcoming unwanted conditions and events; b) test drilling plans; and c) drilling scenarios.
A review on big data real-time stream processing and its scheduling techniques
Published in International Journal of Parallel, Emergent and Distributed Systems, 2020
Nicoleta Tantalaki, Stavros Souravlas, Manos Roumeliotis
A stream processing system, or data stream management system (DSMS), is designed to handle data streams and manage continuous queries. Its continuous queries are not performed only once; they are executed continuously until they are explicitly uninstalled. The system produces results as long as new data arrives, and data is processed on the fly without the need to store it first. Data is usually stored after processing. Stream processing systems differ from batch processing systems because of the requirement for real-time data processing. The term ‘real-time processing system’ refers to a system that responds within ‘real-world’ time deadlines. It guarantees that a certain process will be executed within a given period, maybe a few seconds, depending on the quality of service constraints. The term ‘real-time’ is a bit redundant, but many systems use it to describe themselves as low-latency systems. Elaborate and agile systems have been proposed for these new demands.
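The idea of a continuous query described in this excerpt can be sketched with a Python generator. This is an illustrative toy, not the design of any particular DSMS; the name `continuous_average`, the sliding-window average, and the sample readings are all assumptions made for the example.

```python
from collections import deque


def continuous_average(stream, window_size):
    """Yield the average of the most recent `window_size` values for each new event.

    Only the current window is kept in memory; the rest of the stream is
    never stored, mirroring on-the-fly processing.
    """
    window = deque(maxlen=window_size)  # old values are dropped automatically
    for value in stream:
        window.append(value)
        yield sum(window) / len(window)


# Hypothetical sensor readings arriving one at a time.
readings = iter([10, 20, 30, 40])
results = list(continuous_average(readings, window_size=2))
print(results)
```

The query keeps producing a result for every arriving value and stops only when the stream ends or the consumer stops iterating, which loosely corresponds to a continuous query running until it is uninstalled. The sketch makes no attempt to enforce the time deadlines that the excerpt associates with real-time systems.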
Schema on read modeling approach as a basis of big data analytics integration in EIS
Published in Enterprise Information Systems, 2018
Slađana Janković, Snežana Mladenović, Dušan Mladenović, Slavko Vesković, Draženko Glavić
For decades, there have been two main approaches to data integration, namely batch data integration and real-time data integration. Both approaches have secured a place for themselves in Big Data integration processes as well. From the data analytics perspective, Big Data systems support the following classes of applications: batch-oriented processing, stream processing, OLTP (Online Transaction Processing), and interactive ad-hoc queries and analysis (Ribeiro, Silva, and da Silva 2015). The batch data integration approach is used in batch-oriented processing applications, whereas the real-time data integration approach is used in stream processing, OLTP, and interactive ad-hoc queries and analysis applications. An overview of the most important approaches and solutions in the field of Big Data integration with EISs, in both batch and real-time modes, will be given in the text below.
Applying big data and stream processing to the real estate domain
Published in Behaviour & Information Technology, 2019
Herminio García-González, Daniel Fernández-Álvarez, José Emilio Labra-Gayo, Patricia Ordóñez de Pablos
But new challenges, linked to the large volume of data, have emerged. One of them is real-time data. Real-time data changes over short periods of time, i.e. it is data that quantifies something as it happens. This kind of data is also growing due to advances in the Internet of Things (IoT) (Wingerath et al. 2016), where many sensors capture environmental data and publish it through the Internet.