High-Performance Computing and Its Requirements in Deep Learning
Published in Sanjay Saxena, Sudip Paul, High-Performance Medical Image Processing, 2022
Biswajit Jena, Gopal Krishna Nayak, Sanjay Saxena
High-performance computing relies on supercomputers rather than general-purpose computers, exploiting the features, properties, and characteristics that set supercomputers apart. The performance of general-purpose computers is measured in MIPS, which stands for millions of instructions per second. With supercomputers, performance is instead measured in FLOPS, which stands for floating-point operations per second. A supercomputer is defined as one that far outperforms general-purpose computers in terms of speed, reliability, efficiency, and problem-solving capacity. Some supercomputers can perform on the order of a hundred quadrillion FLOPS, and the majority of the fastest run Linux as their operating system. A high-performance computing system does not necessarily contain any component that you would not find in a general-purpose computer. The difference is mainly one of quantity, as an HPC system is composed of computing clusters configured to work together. Whereas a general-purpose computer typically contains a single processor, a supercomputer contains many processors, each comprising between two and four cores. Each individual computer within an HPC cluster is referred to as a node, so a supercomputer with 64 nodes may have up to 256 cores, all working in tandem. When large numbers of individual nodes work efficiently together, they can solve problems that would be too complex for a single computer to solve by itself.
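The relationship between nodes, cores, and aggregate throughput can be sketched with a back-of-the-envelope peak-FLOPS estimate. This is a minimal illustration only: the 64 nodes and 4 cores per node come from the text's example, but the 3 GHz clock and 16 FLOPs per cycle are assumed figures, not values from the excerpt.

```python
# Back-of-the-envelope theoretical peak for a small cluster.
# Clock rate and FLOPs-per-cycle below are illustrative assumptions.

def peak_flops(nodes, cores_per_node, clock_hz, flops_per_cycle):
    """Theoretical peak = nodes x cores x clock x FLOPs issued per cycle."""
    return nodes * cores_per_node * clock_hz * flops_per_cycle

# 64 nodes with 4 cores per node, as in the text's example, gives 256 cores.
total_cores = 64 * 4
assert total_cores == 256

# Assuming each core runs at 3 GHz and can issue 16 FLOPs per cycle:
peak = peak_flops(nodes=64, cores_per_node=4, clock_hz=3.0e9, flops_per_cycle=16)
print(f"{peak / 1e12:.1f} TFLOPS theoretical peak")
```

Real applications achieve only a fraction of this theoretical figure, which is why measured benchmarks rather than peak numbers are used to rank machines.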
High-Performance Computing for Fluid Flow and Heat Transfer
Published in W.J. Minkowycz, E.M. Sparrow, Advances in Numerical Heat Transfer, 2018
Flops is the common performance unit used to define the speed of a computer in terms of the number of floating-point operations it can perform in one second. However, not all floating-point operations require the same number of clock cycles – a floating-point division may take 4–20 times as many cycles as a floating-point addition. The flop is an especially popular measure of machine performance on scientific and mathematical programs, but it is not a reasonable measure for programs or benchmarks that use few or no floating-point operations. Peak (theoretical) performance is the maximum number of flops a machine could deliver, but it is almost never achieved in practice. Table 2 lists several supercomputers and their theoretical peak performances in Mflops, including their maximum number of processors [1, 8].
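The gap between peak and achieved performance can be seen by counting the flops in a simple kernel and timing it. The sketch below (an illustrative pure-Python example, not a serious benchmark) counts the 2n − 1 flops of a dot product and reports the measured Mflops rate, which will be far below any hardware peak:

```python
import time

def dot(a, b):
    """Dot product: n multiplications + (n - 1) additions = 2n - 1 flops."""
    s = 0.0
    for x, y in zip(a, b):
        s += x * y
    return s

n = 100_000
a = [1.0] * n
b = [2.0] * n

flop_count = 2 * n - 1          # flops performed by one dot product
t0 = time.perf_counter()
result = dot(a, b)
elapsed = time.perf_counter() - t0

print(f"{flop_count} flops in {elapsed:.4f} s "
      f"-> {flop_count / elapsed / 1e6:.1f} Mflops achieved")
```

Interpreted Python adds heavy per-operation overhead, so the achieved rate here is orders of magnitude below peak; compiled scientific codes close some, but never all, of that gap.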
Computers in communications
Published in Geoff Lewis, Communications Technology Handbook, 2013
Benchmark programs. These are test programs designed to evaluate and compare the speed performance of various computer systems and architectures. Common, but not entirely satisfactory, measures are the rates at which certain operations can be carried out for a given clock frequency. These parameters are stated in millions of instructions per second (MIPS), the rate at which instructions are carried out, or millions of floating-point operations per second (MFLOPS), the rate at which mathematical floating-point operations can be processed. The major problem with both MIPS and MFLOPS is that they depend on the number of clock cycles required to process an instruction, and this can vary between different processors. Typical early benchmarks included programs to load an ASCII file of specified length, search a file of fixed length to find a specified record, sort an unindexed file into some prearranged order, or extract all the prime numbers below some upper value. Over the years the benchmarks have been considerably refined, and a number of pseudo-standard programs are now available. These include:

Ackermann. A short recursive program that is suitable for initial evaluation of architectures that use the cache technique.

Dhrystones. Devised by Weicker to exercise the system utility programs. It provides a standard mix of load, store and branch instructions that tests the ability to manipulate strings of characters. The parameter is stated in Dhrystones/second.

Puzzle. A lengthy benchmark that evaluates the system by solving a three-dimensional matrix problem. It provides a run on a good mix of instructions and procedures.

Sieve. A small program devised to find all the prime numbers below some stated value n. It is based on the sieve of Eratosthenes and repeatedly eliminates the non-prime numbers from the list of numbers less than n.

Whetstones. A program devised to evaluate the operational speed of the processor. It consists of a mix of floating-point, integer and data-processing instructions that is representative of scientific programs. The parameter is stated in Whetstones/second.
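The Sieve benchmark described above is short enough to sketch in full. This is a minimal Python rendering of the sieve of Eratosthenes, not the original benchmark source:

```python
def sieve(n):
    """Return all prime numbers below n using the sieve of Eratosthenes."""
    if n < 3:
        return []
    is_prime = [True] * n
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            # Eliminate every multiple of p from the candidate list.
            for multiple in range(p * p, n, p):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]

print(sieve(30))  # -> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

As a benchmark it is attractive precisely because it is small: it exercises tight loops, array indexing and branching with almost no floating-point work, which is why it complements flop-oriented measures like Whetstones.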
Enhanced MIMO-DCT-OFDM System Using Cosine Domain Equalizer
Published in International Journal of Electronics, 2023
The complexity evaluation of several equalization and compensation algorithms is provided in this section. A real multiplication, addition, or division is each counted as one operation, which may be performed using half a flop [19]. Here the term “flops” refers to the number of floating-point operations performed. Table 2 shows the total number of operations and flops associated with various mathematical operation scenarios. Table 3 shows the total number of operations and flops associated with various full-matrix operation scenarios. Table 4 shows the total number of operations and flops associated with various banded-matrix operation scenarios. For each equalizer described later, we therefore calculate the total number of operations and flops for configuration orders of 2×2, 4×4, 8×8, 16×16, 32×32, 64×64, and 128×128 in the MIMO-DCT-OFDM system.