Big Data Computing and Graph Databases
Published in Vivek Kale, Agile Network Businesses, 2017
A variety of system architectures have been implemented for big data and large-scale data analysis applications, including parallel and distributed relational database management systems, which have been available to run on shared-nothing clusters of processing nodes for more than two decades. These include database systems from Teradata, Netezza, Vertica, Exadata/Oracle, and others, which provide high-performance parallel database platforms. Although these systems have the ability to run parallel applications and queries expressed in the SQL language, they are typically not general-purpose processing platforms and usually run as a back end to a separate front-end application processing system.
Big Data Computing
Published in Vivek Kale, Digital Transformation of Enterprise Architecture, 2019
[Excerpt is identical to the passage from Agile Network Businesses above.]
Big Data Computing
Published in Vivek Kale, Parallel Computing Architectures and APIs, 2019
[Excerpt is identical to the passage from Agile Network Businesses above.]
EUROCORR 2020: ‘Closing the gap between industry and academia in corrosion science and prediction’
Published in Corrosion Engineering, Science and Technology, 2021
D. J. Mills, D. Nuttall, L. Atkin
T. Marshall (University of Surrey, UK) presented 'Development of a computational model for cast iron pipes'. Buried cast iron pipework is susceptible to graphitic corrosion, which leaves a weakened iron oxide matrix with degraded graphite flakes. In situ NDE methods are difficult to implement widely for buried assets, so predictive models are needed to estimate current and future pipework condition. An ideal solution would combine in situ probes with a corrosion model that predicts condition after x years and yields a probability of failure; a parallel database might predict the current condition of unknown pipes. A novel three-dimensional model was used, comprising a set of cells, each representing an area or volume of space and assigned a set of variables; at each step, the cells interact with their neighbours. Results show this to be an improvement on the methods used over many decades.
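The cell-based model described above resembles a cellular automaton. A minimal illustrative sketch follows; the grid size, update rule, rates, and all variable names are hypothetical assumptions for illustration, not details of Marshall's actual model:

```python
import random

def step(grid, rate=0.1, coupling=0.05):
    """One update: each cell's corrosion depth grows by a base rate plus
    a contribution from its four neighbours (hypothetical rule)."""
    n, m = len(grid), len(grid[0])
    new = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            neighbours = [grid[x][y]
                          for x, y in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                          if 0 <= x < n and 0 <= y < m]
            new[i][j] = grid[i][j] + rate + coupling * sum(neighbours) / len(neighbours)
    return new

random.seed(0)
# 5x5 patch of pipe wall; initial corrosion depths in mm (synthetic data)
grid = [[random.uniform(0.0, 0.2) for _ in range(5)] for _ in range(5)]
for _ in range(10):  # simulate 10 time steps
    grid = step(grid)
max_depth = max(max(row) for row in grid)
print(f"max corrosion depth after 10 steps: {max_depth:.3f} mm")
```

Coupling each cell to its neighbours is what lets localised damage (e.g. a graphitised patch) spread spatially, which a purely per-cell rate model cannot capture.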
Distributed outlier detection in hierarchically structured datasets with mixed attributes
Published in Quality Technology & Quantitative Management, 2020
We implement the outlier detection algorithm in a distributed fashion using the MapReduce programming model and the Hadoop infrastructure. MapReduce (Dean & Ghemawat, 2008) is a programming model and an associated implementation for processing and generating large datasets. Users design a MapReduce program through two functions: map and reduce. As shown in Figure 4, the users specify a map function that processes a key-value pair to generate a set of intermediate key-value pairs, and a reduce function that merges all of the intermediate values associated with the same intermediate key. Hadoop is an open-source distributed infrastructure implementing MapReduce. It consists of two layers: a data storage layer, the Hadoop Distributed File System (HDFS), and a data processing layer (the MapReduce framework). In hybrid systems built on Hadoop, such as HadoopDB (Abouzeid, Bajda-Pawlikowski, Abadi, Silberschatz, & Rasin, 2009), the scalability of MapReduce can be combined with the performance of parallel database systems.
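The map/reduce contract described above can be sketched in plain Python as a single-process word-count simulation (not an actual Hadoop job; the function names and driver are illustrative):

```python
from collections import defaultdict

def map_fn(doc_id, text):
    """Map: take one input key-value pair (doc id, text) and emit
    intermediate key-value pairs (word, 1)."""
    for word in text.split():
        yield word, 1

def reduce_fn(word, counts):
    """Reduce: merge all intermediate values sharing one intermediate key."""
    return word, sum(counts)

def run_mapreduce(inputs):
    # Shuffle phase: group intermediate values by intermediate key,
    # as the framework does between the map and reduce stages.
    groups = defaultdict(list)
    for key, value in inputs.items():
        for k, v in map_fn(key, value):
            groups[k].append(v)
    return dict(reduce_fn(k, vs) for k, vs in groups.items())

docs = {"d1": "big data big ideas", "d2": "big clusters"}
print(run_mapreduce(docs))  # {'big': 3, 'data': 1, 'ideas': 1, 'clusters': 1}
```

In a real Hadoop deployment the shuffle, partitioning across nodes, and fault tolerance are handled by the framework; the user supplies only the two functions.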
Machine translation model for effective translation of Hindi poetries into English
Published in Journal of Experimental & Theoretical Artificial Intelligence, 2022
Rajesh Kumar Chakrawarti, Jayshri Bansal, Pratosh Bansal
SMT is a data-driven approach built on statistical models extracted from a parallel database, i.e. a bilingual or multilingual corpus of aligned texts. In SMT, a source string f is translated according to a probability distribution P(e|f) over target-language strings e. The best translation is the one with the highest probability, calculated as follows:

e* = argmax_e P(e|f) = argmax_e P(f|e) P(e)
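The argmax selection can be illustrated with a toy phrase table; all words and probabilities below are invented for illustration and do not come from the paper's model:

```python
# Hypothetical translation-model scores P(f | e) for one Hindi source word
translation_model = {
    "flower": 0.6, "bloom": 0.3, "blossom": 0.1,
}
# Hypothetical language-model scores P(e) for each candidate in context
language_model = {
    "flower": 0.5, "bloom": 0.2, "blossom": 0.3,
}

def best_translation(tm, lm):
    """Noisy-channel choice: argmax over e of P(f|e) * P(e)."""
    return max(tm, key=lambda e: tm[e] * lm[e])

print(best_translation(translation_model, language_model))  # flower
```

Decomposing P(e|f) into a translation model and a language model lets each component be estimated separately: P(f|e) from the parallel database, P(e) from monolingual target-language text.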