The Art of In-Memory Computing for Big Data Processing
Published in Big Data Management and Processing (Kuan-Ching Li, Hai Jiang, Albert Y. Zomaya, eds.), 2017
Mihaela-Andreea Vasile, Florin Pop
The concept of many-task computing (MTC) was introduced in [9] and has since been the subject of extensive research. It is defined as a combination of high-performance computing (HPC) and high-throughput computing (HTC), and it refers to applications that interact with large data sets and generate a very large number of heterogeneous tasks: independent tasks or workflows, small or large, compute-intensive or data-intensive. MTC workloads can be classified by the number of tasks and the size of the data sets involved: big data (very large data sets and a very high number of tasks), MapReduce (very large data sets and a relatively small number of tasks), or HTC (smaller data sets and a very high number of tasks). Big data can therefore be considered a data-intensive subset of MTC.
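The following minimal Python sketch makes the two-dimensional classification above concrete. The numeric thresholds and the fallback label are assumptions introduced purely for illustration; they are not taken from the chapter.

```python
# Hypothetical illustration of the MTC taxonomy by task count and data set size.
# TASK_THRESHOLD and DATA_THRESHOLD_GB are assumed cutoffs, not values from the source.

TASK_THRESHOLD = 1_000_000      # "very high number of tasks" (assumed cutoff)
DATA_THRESHOLD_GB = 10_000      # "very large data sets" (assumed cutoff, in GB)

def classify_workload(num_tasks: int, data_size_gb: float) -> str:
    """Place a workload in the MTC taxonomy described in the text."""
    many_tasks = num_tasks >= TASK_THRESHOLD
    large_data = data_size_gb >= DATA_THRESHOLD_GB
    if large_data and many_tasks:
        return "big data"      # very large data sets, very high number of tasks
    if large_data:
        return "MapReduce"     # very large data sets, relatively few tasks
    if many_tasks:
        return "HTC"           # smaller data sets, very high number of tasks
    return "other"             # outside the three categories described above

print(classify_workload(5_000_000, 50_000))  # -> "big data"
print(classify_workload(500, 50_000))        # -> "MapReduce"
```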
Nature-inspired cost optimisation for enterprise cloud systems using joint allocation of resources
Published in Enterprise Information Systems, 2021
Suchintan Mishra, Manmath Narayan Sahoo, Arun Kumar Sangaiah, Sambit Bakshi
Csorba, Meling, and Heegaard (2010) propose a VM replica mapping algorithm based on ACO that optimises cost in cloud datacentres. Cross-Entropy Ant Systems (CEAS) are used to find a replica allocation that minimises the cost incurred by end-users. Both intra-datacentre and inter-datacentre load balancing are considered, with particular attention to keeping as many VMs as possible available at all times. Pandey et al. (2010) present a PSO-based resource allocation algorithm for dependent cloud tasks. They account not only for the compute cost of each task but also for the inter-host transfer costs that arise when files are moved between dependent tasks. However, certain assumptions, such as knowing output file sizes and task dependencies in advance and relying on averaged resource statistics, can be inefficient in an ever-changing environment such as the cloud. Kessaci, Melab, and Talbi (2013) describe a GA-based VM placement algorithm that optimises response time and incurred cost. They model satisfaction and profit in distributed cloud systems and solve the resulting multiobjective problem with MOGA-CB, a variant of GA. Pacini, Mateos, and Garino (2015) model an ACO-based cloud scheduler that places workload on physical machines according to the load on computing resources so as to optimise response time and throughput. The algorithm maintains two persistent load structures: a global load table that records the load of all hosts, and a local load table per VM that tracks the load on that VM. By adjusting weights, the preference is toggled between high-performance computing and high-throughput computing.
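As an illustration of the cost model attributed to the PSO-based approach of Pandey et al. (2010), the sketch below combines per-task compute costs with inter-host transfer costs for dependent tasks. All names, data structures, and figures are assumptions made for the example, and an exhaustive search over a tiny instance stands in for the PSO search itself; this is not the authors' actual formulation.

```python
# Hedged sketch: cost of a task->host mapping as compute cost plus
# inter-host transfer cost between dependent tasks (illustrative values only).
from itertools import product

compute_cost = {            # cost of running task t on host h (assumed $)
    ("t1", "h1"): 1.0, ("t1", "h2"): 1.4,
    ("t2", "h1"): 2.0, ("t2", "h2"): 1.1,
}
transfer_cost_per_gb = {    # cost of moving data between hosts (assumed $/GB)
    ("h1", "h2"): 0.05, ("h2", "h1"): 0.05,
    ("h1", "h1"): 0.0,  ("h2", "h2"): 0.0,
}
dependencies = [("t1", "t2", 20.0)]   # (producer, consumer, output size in GB)

def total_cost(mapping: dict) -> float:
    """Cost of a candidate mapping (what a PSO particle would encode)."""
    cost = sum(compute_cost[(t, h)] for t, h in mapping.items())
    for producer, consumer, size_gb in dependencies:
        src, dst = mapping[producer], mapping[consumer]
        cost += transfer_cost_per_gb[(src, dst)] * size_gb
    return cost

# Enumerate all mappings of this tiny instance instead of running PSO.
best = min(
    (dict(zip(("t1", "t2"), hosts)) for hosts in product(("h1", "h2"), repeat=2)),
    key=total_cost,
)
print(best, total_cost(best))
```

In a real instance the transfer term dominates whenever dependent tasks exchange large files across hosts, which is why co-locating them can beat the placement with the cheapest compute cost.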