IEEE 754 – Knowledge and References

Multiplier Design Based on DBNS

Published in Vassil Dimitrov, Graham Jullien, Roberto Muscedere, Multiple-Base Number System, 2017

Vassil Dimitrov, Graham Jullien, Roberto Muscedere

One particularly interesting application is the possibility of using our multiplier in floating-point operations. The floating-point number systems used in practice typically represent numbers as described in the IEEE Standard for Floating-Point Arithmetic (IEEE 754) [41], which includes 32-bit (single precision), 64-bit (double precision), and 128-bit (quadruple precision) versions. In all of them, one bit signifies the sign. The exponent is represented with 8, 11, or 15 bits and the fraction is given by 23, 52, or 112 bits for single, double, and quadruple precision, respectively [41]. A floating-point multiplication requires a multiplication of the fractions, e.g., a 52 × 52-bit multiplication for double precision, and consequently, a floating-point processor must support multiplications with large operands. Clearly, the widths used, at least in the double and quadruple precision formats, exceed the threshold at which our multipliers become superior, and could therefore benefit from the results presented in this chapter.
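The field widths quoted above can be checked directly on a double. The following Python sketch is only an illustration: the helper names `decompose_double` and `multiply_significands` are hypothetical, and the code assumes normal (non-zero, non-subnormal, finite) inputs. It splits a 64-bit value into its 1-bit sign, 11-bit exponent, and 52-bit fraction, then forms the integer product of two significands. With the implicit leading bit included, each operand is 53 bits wide, so the product needs up to 106 bits.

```python
import struct


def decompose_double(x: float):
    """Split an IEEE 754 binary64 value into (sign, exponent, fraction)."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                    # 1 bit
    exponent = (bits >> 52) & 0x7FF      # 11 bits, stored with a bias of 1023
    fraction = bits & ((1 << 52) - 1)    # 52 bits
    return sign, exponent, fraction


def multiply_significands(a: float, b: float) -> int:
    """Integer product of the two significands, implicit leading bit included.

    Assumption: both inputs are normal numbers, so the implicit bit is 1.
    """
    _, _, frac_a = decompose_double(a)
    _, _, frac_b = decompose_double(b)
    sig_a = (1 << 52) | frac_a           # 53-bit significand
    sig_b = (1 << 52) | frac_b
    return sig_a * sig_b


if __name__ == "__main__":
    print(decompose_double(-2.5))
    print(multiply_significands(1.5, 1.5).bit_length())
```

This wide integer product is the step a hardware multiplier has to carry out; the rounding and exponent handling that complete a floating-point multiplication are left out of the sketch.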

Basics of the central processing unit

Published in Joseph D. Dumas, Computer Architecture, 2016

Joseph D. Dumas

The IEEE 754 floating-point standard is not just a specification for a single floating-point format to be used by all systems. Rather, its designers recognized the need for different formats for different applications; they specified both single and double precision floating-point data formats along with rules for performing arithmetic operations (compliant systems must obtain the same results, bit for bit, as any other system implementing the standard) and several rounding modes. The standard also defines representations for infinite values and methods for handling exceptional cases such as overflows, underflows, division by zero, and so on. IEEE 754 is not merely a floating-point number format, but a comprehensive standard for representing real numbers in binary form and performing arithmetic operations on them.
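A few of these behaviours can be observed from Python, whose `float` is an IEEE 754 double on essentially all current platforms (an assumption of this sketch, not a guarantee of the language). The example shows overflow to infinity, underflow to zero, an invalid operation producing NaN, and the bit-for-bit reproducible rounding of a simple sum.

```python
import math
import sys

# Overflow: exceeding the largest finite double yields +infinity.
overflowed = sys.float_info.max * 2.0

# Underflow: half of the smallest subnormal (5e-324) rounds to zero.
underflowed = 5e-324 / 2.0

# Invalid operation: inf - inf has no meaningful value, so the result is NaN.
not_a_number = math.inf - math.inf

# Correct rounding makes this sum identical, bit for bit, on compliant systems.
total = 0.1 + 0.2

if __name__ == "__main__":
    print(overflowed, underflowed, not_a_number)
    print(total.hex())
    # Note: Python itself raises ZeroDivisionError for 1.0 / 0.0 instead of
    # returning infinity; that is a language-level choice layered on top of
    # the standard's default result.
```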

S

Published in Philip A. Laplante, Comprehensive Dictionary of Electrical Engineering, 2018

Philip A. Laplante

network access nodes such as switches and SCPs.

signal variance See variance.

signed-digit representation a fixed-radix number system in which each digit has a sign (positive or negative). In a binary signed-digit representation, each digit can assume one of the values -1, 0, and 1.

significand the mantissa portion of a floating-point number in the IEEE 754 floating-point standard. It consists of an implicit or explicit leading integer bit and a fraction.

signum function the function

(2) an analysis of the signature to extract the desired (signal) information.

Accuracy Improvements for Single Precision Implementations of the SPH Method

Published in International Journal of Computational Fluid Dynamics, 2020

Elie Saikali, Giuseppe Bilotta, Alexis Hérault, Vito Zago

The technical standard for floating-point computation (IEEE-754) was quickly adopted by all manufacturers after its establishment in 1985 by the Institute of Electrical and Electronics Engineers (IEEE). This marked an important turning point for programming, and for scientific computing in particular, since it ended the chaotic and frequently unreliable treatment and representation of floating-point values across computer architectures (American National Standards Institute 1985). The standard defines a common structure for representing floating-point values, with different levels of precision provided by similarly encoded formats.

Mathematically rigorous global optimization in floating-point arithmetic

Published in Optimization Methods and Software, 2018

Siegfried M. Rump

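For $x \in \mathbb{R}$, the result of a directed rounding $\mathrm{fl}_\nabla(x)$ / $\mathrm{fl}_\Delta(x)$ is the unique largest/smallest floating-point number less/greater than or equal to $x$, respectively: $\mathrm{fl}_\nabla(x) := \max\{f \in \mathbb{F} : f \le x\}$ and $\mathrm{fl}_\Delta(x) := \min\{f \in \mathbb{F} : x \le f\}$. The IEEE 754 standard defines arithmetic operations with directed rounding, that is, for $a, b \in \mathbb{F}$ and $\circ \in \{+, -, \cdot, /\}$, both $\mathrm{fl}_\nabla(a \circ b)$ and $\mathrm{fl}_\Delta(a \circ b)$ are computable. The corresponding INTLAB executable code is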
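As an illustration that is separate from the INTLAB code, the following Python sketch emulates downward and upward rounding for addition. It does not switch the processor's rounding mode. Instead it computes the round-to-nearest sum, compares it with the exact rational sum, and steps to the neighbouring float when needed. The names `add_down` and `add_up` are hypothetical, finite results are assumed, and `math.nextafter` requires Python 3.9 or later.

```python
import math
from fractions import Fraction


def add_down(a: float, b: float) -> float:
    """Largest float less than or equal to the exact sum a + b.

    Assumption: a, b, and their sum are finite.
    """
    s = a + b                              # rounded to nearest
    exact = Fraction(a) + Fraction(b)      # exact rational sum
    if Fraction(s) > exact:                # nearest rounding went upward
        s = math.nextafter(s, -math.inf)   # step down to the previous float
    return s


def add_up(a: float, b: float) -> float:
    """Smallest float greater than or equal to the exact sum a + b.

    Assumption: a, b, and their sum are finite.
    """
    s = a + b
    exact = Fraction(a) + Fraction(b)
    if Fraction(s) < exact:                # nearest rounding went downward
        s = math.nextafter(s, math.inf)    # step up to the next float
    return s


if __name__ == "__main__":
    lo, hi = add_down(0.1, 0.2), add_up(0.1, 0.2)
    print(lo.hex(), hi.hex())
```

The pair returned by `add_down` and `add_up` encloses the exact sum, which is the property that verified (interval) computations rely on. Hardware rounding modes give the same enclosure without the cost of exact rational arithmetic.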

Optimised Floating Point FFT Core for Improved OMP CS System

Published in International Journal of Electronics, 2022

Alahari Radhika, K. Satya Prasad, K. Kishan Rao

IEEE developed the IEEE-754 standard for binary floating-point numbers to improve their portability. Its two basic formats are single and double precision. In the IEEE-754 standard, a floating-point number is represented as,
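For normal numbers, the value encoded by a binary format is conventionally written as $(-1)^{s} \times (1 + f / 2^{p}) \times 2^{e - \mathrm{bias}}$, where $s$ is the sign bit, $f$ the fraction field of $p$ bits, and $e$ the stored exponent. The Python sketch below assumes this conventional form for single precision ($p = 23$, bias 127); it is not taken from the article, the function names are hypothetical, and zero, subnormals, infinities, and NaN are deliberately left out.

```python
import struct


def float_to_bits32(x: float) -> int:
    """Return the 32-bit pattern of x after rounding to single precision."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits


def decode_binary32(bits: int) -> float:
    """Evaluate a 32-bit pattern as a normal single-precision value."""
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF        # 23 bits
    if exponent in (0, 0xFF):
        # Exponent 0 encodes zero/subnormals, 255 encodes infinity/NaN.
        raise ValueError("only normal numbers are handled in this sketch")
    return (-1.0) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)


if __name__ == "__main__":
    print(hex(float_to_bits32(-2.5)))
    print(decode_binary32(0xC0200000))
```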