Array and Matrix Operations
Published in David E. Clough, Steven C. Chapra, Introduction to Engineering and Scientific Computing with Python, 2023
David E. Clough, Steven C. Chapra
The result of applying the inverse to a quantity, whether via division or a function, is unity, 1. When we consider the inverse of a matrix, multiplication by the inverse cannot yield the scalar 1 but rather a matrix with 1's on the diagonal and 0's elsewhere, also known as the identity matrix: $A \cdot A^{-1} = I$
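A minimal NumPy sketch of this relationship (the matrix values are arbitrary, chosen only for illustration):

```python
import numpy as np

# A small invertible matrix (values chosen arbitrarily for illustration)
A = np.array([[4.0, 7.0],
              [2.0, 6.0]])

A_inv = np.linalg.inv(A)   # compute the matrix inverse
I = A @ A_inv              # multiplying by the inverse recovers the identity

print(np.round(I, 10))     # [[1. 0.]
                           #  [0. 1.]] up to floating-point error
```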
LUT-Based Matrix Multiplier Circuit Using Pigeonhole Principle
Published in Hafiz Md. Hasan Babu, VLSI Circuits and Embedded Systems, 2023
Matrix multiplication is a computationally intensive and fundamental matrix operation used in scientific computation, signal processing, image processing, graphics, and robotics applications. The advancement of Field Programmable Gate Arrays (FPGAs) in recent years, allowing multimillion gates on a single chip, has made it possible to implement computation-intensive algorithms like matrix multiplication in an efficient and cost-effective way. Because multiplication is the slowest operation and limits the performance of matrix multiplication, an efficient FPGA-based multiplication algorithm is introduced. In addition, the look-up table (LUT), the key building block of an FPGA, can implement any Boolean function of its inputs. A LUT merging theorem is presented, which reduces the number of LUTs required to implement a set of functions by a factor of two. A (1 × 1)-digit multiplication algorithm is introduced which does not require any partial product generation, partial product reduction, or addition steps. An (m × n)-digit multiplication algorithm is described which performs digit-wise parallel processing and provides a significant reduction in carry propagation delay. A binary-to-BCD conversion algorithm for decimal multiplication is also presented to make the multiplication more efficient. Then, a matrix multiplication algorithm is described that reuses intermediate products for repeated values to reduce the effective area. A cost-efficient LUT-based matrix multiplier circuit is also described using the compact and faster multiplier circuit. Owing to the parallel processing structure and the reuse of intermediate products for repeated values, the effective area, and consequently the power consumption, are reduced drastically.
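A minimal Python sketch of the reuse idea described above: during matrix multiplication, products of operand pairs that repeat are computed once and cached, mirroring the circuit's re-utilization of intermediate products. The function and cache names are illustrative, not from the chapter:

```python
def matmul_with_product_reuse(A, B):
    rows, inner, cols = len(A), len(B), len(B[0])
    cache = {}                          # (a, b) -> a * b, shared across all entries

    def product(a, b):
        key = (a, b)
        if key not in cache:
            cache[key] = a * b          # each distinct operand pair multiplied once
        return cache[key]

    return [[sum(product(A[i][k], B[k][j]) for k in range(inner))
             for j in range(cols)] for i in range(rows)]

A = [[2, 3], [2, 3]]                    # repeated rows -> repeated operand pairs
B = [[5, 5], [7, 7]]
print(matmul_with_product_reuse(A, B))  # [[31, 31], [31, 31]], only 2 multiplies cached
```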
Linear Systems
Published in Julio Sanchez, Maria P. Canton, Software Solutions for Engineers and Scientists, 2018
Julio Sanchez, Maria P. Canton
The matrix multiplication C = A * B is rather counter-intuitive. Instead of multiplying the corresponding elements of two matrices, matrix multiplication consists of multiplying each entry in a row of matrix A by the corresponding entry in a column of matrix B and adding these products to obtain an entry of matrix C. For example,

$A = \begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \end{bmatrix} \qquad B = \begin{bmatrix} B_{11} & B_{12} & B_{13} & B_{14} \\ B_{21} & B_{22} & B_{23} & B_{24} \\ B_{31} & B_{32} & B_{33} & B_{34} \end{bmatrix}$

$C = A \times B = \begin{bmatrix} C_{11} & C_{12} & C_{13} & C_{14} \\ C_{21} & C_{22} & C_{23} & C_{24} \end{bmatrix}$

where, for instance, $C_{11} = A_{11}B_{11} + A_{12}B_{21} + A_{13}B_{31}$.
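A short Python sketch of this row-by-column rule (illustrative, not taken from the book): each C[i][j] is the sum over k of A[i][k] * B[k][j].

```python
def matmul(A, B):
    rows, inner, cols = len(A), len(B), len(B[0])
    assert len(A[0]) == inner, "columns of A must match rows of B"
    C = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # entry C[i][j]: row i of A against column j of B
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(inner))
    return C

A = [[1, 2, 3],
     [4, 5, 6]]          # 2x3
B = [[1, 0, 0, 1],
     [0, 1, 0, 1],
     [0, 0, 1, 1]]       # 3x4
print(matmul(A, B))      # 2x4 result: [[1, 2, 3, 6], [4, 5, 6, 15]]
```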
serpentTools: A Python Package for Expediting Analysis with Serpent
Published in Nuclear Science and Engineering, 2020
Andrew E. Johnson, Dan Kotlyar, Stefano Terlizzi, Gavin Ridley
First and foremost, serpentTools provides an easy way to view and manipulate Serpent data with Python. Much like MATLAB users can load up the environment and data using the MATLAB run command, serpentTools users can simply use the serpentTools.read function. Both functions require the name of the output file and provide the user with all of the data. Inside MATLAB, users can manipulate the data using native matrix multiplication and linear algebra operations. These operations are instead left to the NumPy and SciPy packages, a pair of highly stable, robust, and fully featured Python packages for science and engineering.
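A minimal sketch of the workflow described above. serpentTools.read is the entry point named in the text; the file name is hypothetical, and the array below is a stand-in for data pulled off the reader object:

```python
import numpy as np
import serpentTools

# Parse a Serpent output file (hypothetical file name)
reader = serpentTools.read("my_case_res.m")

# Once the data are exposed as NumPy arrays, native matrix and linear
# algebra operations apply, just as they would inside MATLAB.
flux = np.array([[1.0, 0.5],
                 [0.5, 1.0]])                      # stand-in for reader data
response = np.linalg.solve(flux, np.array([1.0, 2.0]))
print(response)
```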
Design and realisation of an efficient environmental assessment method for 3R systems: a case study on engine remanufacturing
Published in International Journal of Production Research, 2020
Haolan Liao, Neng Shen, Yanzhen Wang
Setting the revenue for reusing/remanufacturing one x-th component at node j, the emissions saved by reusing/remanufacturing x-th components follow, and the total saved emissions for component reprocessing are expressed accordingly. It is worth noting here that the operator ' · ' is the vector dot product (or scalar product), ' * ' indicates matrix multiplication, and ' .* ' is element-wise multiplication of the two matrices. The specific values of the unit engine remanufacturing revenue and unit component processing revenue are listed in Table 4.
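A small NumPy illustration of the three operators distinguished above; the MATLAB-style ' · ', ' * ', and ' .* ' map onto np.dot, @, and * respectively (the matrices are arbitrary examples):

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
print(np.dot(u, v))        # ' . ' : vector dot (scalar) product -> 32.0

M = np.array([[1.0, 2.0],
              [3.0, 4.0]])
N = np.array([[5.0, 6.0],
              [7.0, 8.0]])
print(M @ N)               # ' * ' : matrix multiplication
print(M * N)               # ' .* ': element-wise (Hadamard) product
```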
Performance engineering for HEVC transform and quantization kernel on GPUs
Published in Automatika, 2020
Mate Čobrnić, Alen Duspara, Leon Dragić, Igor Piljić, Mario Kovač
The highest-workload stage during HEVC TQ is the 2D forward transform, which is mathematically realized as a double matrix multiplication. Many guidelines exist with general optimization principles, exemplified by matrix multiplication and other basic linear algebra subroutines (BLAS) [15–17]. They mainly deal with the multiplication of two matrices that are large in one or both dimensions and are tiled into small subblocks, e.g. 16×16, distributed among GPU thread blocks. Values from the input matrices are loaded subblock by subblock, and each resulting subblock is computed as the sum of products of these subblocks. HEVC TQ instead operates on batches of small matrices, i.e. TBs, for which the tiling approach would degrade performance. The efficient mapping of TBs and dot-product computations to the various components of the GPU subsystem, with the adaptation of known performance optimization techniques, was set as the main design objective. The final proposal presents a systematic and efficient solution: a multithreaded GPU kernel function for all transform sizes supported in HEVC.
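An illustrative NumPy sketch of the tiling scheme described above (not the paper's GPU kernel): each 16×16 subblock of C accumulates the products of corresponding subblocks of A and B, the pattern a GPU thread block would follow.

```python
import numpy as np

TILE = 16  # subblock size, as in the 16x16 example above

def tiled_matmul(A, B, tile=TILE):
    n, k = A.shape
    k2, m = B.shape
    assert k == k2 and n % tile == m % tile == k % tile == 0
    C = np.zeros((n, m))
    for i in range(0, n, tile):
        for j in range(0, m, tile):
            for p in range(0, k, tile):
                # C subblock = sum of products of A and B subblocks
                C[i:i+tile, j:j+tile] += A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
    return C

A = np.random.rand(32, 32)
B = np.random.rand(32, 32)
assert np.allclose(tiled_matmul(A, B), A @ B)   # matches the untiled product
```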