Round-Off Errors and Limit Cycles in Digital Filters
Published in John T. Taylor, Qiuting Huang, CRC Handbook of Electrical Filters, 2020
When two fixed-point numbers are added, the result may exceed the dynamic range of the implementation; i.e., addition of fixed-point numbers may cause overflow. When two fixed-point numbers are multiplied, the result, in general, requires twice the initial word length. If both numbers have an integer part, the result may overflow. If both numbers are fixed-point fractions, the result remains a fraction and cannot overflow. It is, therefore, preferable in a fixed-point implementation to scale all numbers so that they are fractions. Multiplication of two fixed-point fractions, even though free of overflow, can lead to underflow. In IIR digital filter implementations, a word-length reduction is necessary to prevent the word lengths of the signals from growing. This reduction of the word length is called signal quantization.
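The double-length product and the word-length reduction described above can be sketched in Python. This is an illustrative example using a hypothetical Q15 format (16-bit words, 15 fractional bits); the function names are the author's own, not from the source:

```python
WORD = 16
FRAC = WORD - 1          # 15 fractional bits (Q15)
SCALE = 1 << FRAC        # 2**15

def to_q15(x):
    """Quantize a real fraction in [-1, 1) to a Q15 integer."""
    return int(round(x * SCALE))

def q15_mul(a, b):
    """Multiply two Q15 fractions.

    The raw product has 30 fractional bits (double the word length).
    Because both operands are fractions, the product cannot overflow
    (except for the single case -1 * -1).  Shifting right by 15 with
    rounding is the word-length reduction, i.e. signal quantization.
    """
    prod = a * b                                # 30 fractional bits
    return (prod + (1 << (FRAC - 1))) >> FRAC   # round to nearest Q15

print(q15_mul(to_q15(0.5), to_q15(0.25)) / SCALE)   # 0.125
```

The rounding addend `1 << (FRAC - 1)` implements round-to-nearest; simply shifting would be truncation, which has twice the worst-case quantization error.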
Computer Arithmetic
Published in Vojin G. Oklobdzija, Digital Design and Fabrication, 2017
Earl E. Swartzlander, Gensuke Goto
Fixed-point number systems represent a number, for example A, by n bits: a sign bit and n − 1 data bits. By convention, the most significant bit, a_{n−1}, is the sign bit, which is ONE for negative numbers and ZERO for positive numbers. The n − 1 data bits are a_{n−2}, a_{n−3}, …, a_1, a_0. In the two's complement fractional number system, the value of a number is the sum of n − 1 positive binary fractional bits and a sign bit that has a weight of −1:

A = −a_{n−1} + Σ_{i=0}^{n−2} a_i · 2^{i−n+1}

Examples of 4-bit two's complement fractions are shown in Table 11.1. Two points are evident from the table: first, there is only a single representation of zero (specifically 0000); second, the system is not symmetric (there is a negative number, −1 [1000], for which there is no positive equivalent). The latter property means that taking the absolute value of, or negating, a valid number (−1) can produce a result that cannot be represented.
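The value formula above can be evaluated directly. A small sketch (the helper name is illustrative) that reproduces the 4-bit cases from the table, including the asymmetric −1:

```python
def frac_value(bits):
    """Value of an n-bit two's complement fraction, MSB first, per
    A = -a_{n-1} + sum_{i=0}^{n-2} a_i * 2**(i - n + 1)."""
    n = len(bits)
    a = bits[::-1]            # a[0] is the least significant bit
    return -a[n - 1] + sum(a[i] * 2 ** (i - n + 1) for i in range(n - 1))

print(frac_value([0, 1, 1, 1]))   # 0111 -> 0.875, the largest positive value
print(frac_value([1, 0, 0, 0]))   # 1000 -> -1.0, no positive counterpart
print(frac_value([0, 0, 0, 0]))   # 0000 -> 0.0, the single zero
```

Negating 1000 would require +1.0, which exceeds the maximum representable value 0.875; this is the overflow case the paragraph warns about.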
Quest for Energy Efficiency in Digital Signal Processing
Published in Tomasz Wojcicki, Krzysztof Iniewski, VLSI: Circuits for Emerging Applications, 2017
Ramakrishnan Venkatasubramanian
Fixed-point systems use the bits to represent a fixed range of values, either integers or values with a fixed number of integer and fractional bits. The dynamic range is therefore quite limited, and values outside the set range must be saturated to the endpoints. Fixed-point processors usually quote their 16-bit performance as multiplies per second or MAC operations per second. Algorithms developed for fixed-point processors have to operate on data that stays within the predetermined range to make optimal use of the quoted DSP performance. Because of this, any data set that is unpredictable or has wide variation will suffer a significant performance reduction on a fixed-point DSP.
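Saturating out-of-range values to the endpoints, as described above, amounts to clamping results to the representable range. A minimal sketch for a signed 16-bit word (the function name is illustrative):

```python
def saturate(x, word=16):
    """Clamp x to the range of a signed `word`-bit fixed-point value."""
    lo = -(1 << (word - 1))         # -32768 for 16 bits
    hi = (1 << (word - 1)) - 1      #  32767 for 16 bits
    return max(lo, min(hi, x))

print(saturate(40000))    # 32767: pinned to the positive endpoint
print(saturate(-50000))   # -32768: pinned to the negative endpoint
print(saturate(1234))     # 1234: in-range values pass through unchanged
```

Saturation replaces the large sign-flipping error of wraparound overflow with a bounded clipping error, which is generally far less damaging in signal-processing applications.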
Optimized Deep Learning-Based Fully Resolution Convolution Neural Network for Breast Tumour Segmentation on Field Programmable Gate Array
Published in Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 2023
Sharada Guptha M N, M N Eshwarappa
Large amounts of data must be processed to detect cancer in a mammography image, which makes the detection approach computationally complex (Heidari et al. 2018). To overcome this, FPGAs are employed, opening a new path for designing such systems (Ahmed et al. 2020). However, an FPGA implementation sometimes loses precision (fewer fractional digits are available than in a software implementation). Modern deep-learning models broadly use reduced-precision computation, which reduces inference time, memory footprint, and memory bandwidth (Rehm et al. 2021). Fixed-point representation is common in digital signal/image processing because of its high computational efficiency, minimal hardware requirements, and flexibility in adjusting the precision for particular applications. Because floating-point values must be rounded or truncated for FPGA implementation, some inaccuracy is introduced into the results (Chandra et al. 2021). One way to tackle these issues is to combine floating-point and fixed-point operations as needed. Such mixed-precision operation is used to find the best balance between accuracy and complexity in a hardware implementation.
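The rounding-versus-truncation error mentioned above can be made concrete. A sketch under assumed parameters (a hypothetical format with 8 fractional bits; the names are illustrative), showing the error bounds of each word-length reduction:

```python
FRAC_BITS = 8
SCALE = 1 << FRAC_BITS   # 2**8 = 256

def quantize_round(x):
    """Round a real coefficient to the nearest 8-fractional-bit value."""
    return round(x * SCALE) / SCALE

def quantize_trunc(x):
    """Truncate a real coefficient toward zero to 8 fractional bits."""
    return int(x * SCALE) / SCALE

x = 0.123456
print(abs(x - quantize_round(x)))   # rounding error, at most 2**-9
print(abs(x - quantize_trunc(x)))   # truncation error, below 2**-8
```

Rounding halves the worst-case quantization error relative to truncation, which is one reason mixed-precision designs quantize carefully at the float-to-fixed boundary.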