The Art of In-Memory Computing for Big Data Processing
Published in Big Data Management and Processing (Kuan-Ching Li, Hai Jiang, Albert Y. Zomaya, eds.), 2017
Mihaela-Andreea Vasile, Florin Pop
The concept of many-task computing (MTC) was introduced in [9] and has since been the subject of extensive research. It is defined as a combination of high-performance computing (HPC) and high-throughput computing (HTC), and it refers to applications that interact with large data sets and generate a very large number of heterogeneous tasks: independent tasks or workflows, small or large, compute-intensive or data-intensive. MTC workloads can be classified by the number of tasks and the size of the data sets involved: big data (very large data sets and a very high number of tasks), MapReduce (very large data sets and a relatively small number of tasks), or HTC (smaller data sets and a very high number of tasks). Big data can therefore be considered a data-intensive subset of MTC.
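The following minimal Python sketch makes the two-dimensional classification above concrete. The numeric thresholds and the fallback label are assumptions introduced purely for illustration; they are not taken from the chapter.

```python
# Hypothetical illustration of the MTC taxonomy by task count and data set size.
# TASK_THRESHOLD and DATA_THRESHOLD_GB are assumed cutoffs, not values from the source.

TASK_THRESHOLD = 1_000_000      # "very high number of tasks" (assumed cutoff)
DATA_THRESHOLD_GB = 10_000      # "very large data sets" (assumed cutoff, in GB)

def classify_workload(num_tasks: int, data_size_gb: float) -> str:
    """Place a workload in the MTC taxonomy described in the text."""
    many_tasks = num_tasks >= TASK_THRESHOLD
    large_data = data_size_gb >= DATA_THRESHOLD_GB
    if large_data and many_tasks:
        return "big data"      # very large data sets, very high number of tasks
    if large_data:
        return "MapReduce"     # very large data sets, relatively few tasks
    if many_tasks:
        return "HTC"           # smaller data sets, very high number of tasks
    return "other"             # outside the three categories described above

print(classify_workload(5_000_000, 50_000))  # -> "big data"
print(classify_workload(500, 50_000))        # -> "MapReduce"
```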
Nature-inspired cost optimisation for enterprise cloud systems using joint allocation of resources
Published in Enterprise Information Systems, 2021
Suchintan Mishra, Manmath Narayan Sahoo, Arun Kumar Sangaiah, Sambit Bakshi
Csorba, Meling, and Heegaard (2010) propose a VM replica mapping algorithm based on ACO that optimises cost in cloud datacentres. Cross-Entropy Ant Systems (CEAS) are used to find a replica allocation that minimises the cost incurred by end-users. Both intra-datacentre and inter-datacentre load balancing are considered, with particular attention to keeping as many VMs as possible available at all times. Pandey et al. (2010) present a PSO-based resource allocation algorithm for dependent cloud tasks. They account not only for the compute cost of each task but also for the inter-host transfer costs that arise when files are moved between dependent tasks. However, certain assumptions, such as knowing output file sizes and task dependencies in advance and relying on averaged resource statistics, can be inefficient in an ever-changing environment such as the cloud. Kessaci, Melab, and Talbi (2013) describe a GA-based VM placement algorithm that optimises response time and incurred cost. They model satisfaction and profit in distributed cloud systems and solve the resulting multiobjective problem with MOGA-CB, a variant of GA. Pacini, Mateos, and Garino (2015) model an ACO-based cloud scheduler that places workload on physical machines according to the load on computing resources so as to optimise response time and throughput. The algorithm maintains two persistent load structures: a global load table that records the load of all hosts, and a local load table per VM that tracks the load on that VM. By adjusting weights, the preference is toggled between high-performance computing and high-throughput computing.
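As an illustration of the cost model attributed to the PSO-based approach of Pandey et al. (2010), the sketch below combines per-task compute costs with inter-host transfer costs for dependent tasks. All names, data structures, and figures are assumptions made for the example, and an exhaustive search over a tiny instance stands in for the PSO search itself; this is not the authors' actual formulation.

```python
# Hedged sketch: cost of a task->host mapping as compute cost plus
# inter-host transfer cost between dependent tasks (illustrative values only).
from itertools import product

compute_cost = {            # cost of running task t on host h (assumed $)
    ("t1", "h1"): 1.0, ("t1", "h2"): 1.4,
    ("t2", "h1"): 2.0, ("t2", "h2"): 1.1,
}
transfer_cost_per_gb = {    # cost of moving data between hosts (assumed $/GB)
    ("h1", "h2"): 0.05, ("h2", "h1"): 0.05,
    ("h1", "h1"): 0.0,  ("h2", "h2"): 0.0,
}
dependencies = [("t1", "t2", 20.0)]   # (producer, consumer, output size in GB)

def total_cost(mapping: dict) -> float:
    """Cost of a candidate mapping (what a PSO particle would encode)."""
    cost = sum(compute_cost[(t, h)] for t, h in mapping.items())
    for producer, consumer, size_gb in dependencies:
        src, dst = mapping[producer], mapping[consumer]
        cost += transfer_cost_per_gb[(src, dst)] * size_gb
    return cost

# Enumerate all mappings of this tiny instance instead of running PSO.
best = min(
    (dict(zip(("t1", "t2"), hosts)) for hosts in product(("h1", "h2"), repeat=2)),
    key=total_cost,
)
print(best, total_cost(best))
```

In a real instance the transfer term dominates whenever dependent tasks exchange large files across hosts, which is why co-locating them can beat the placement with the cheapest compute cost.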