MKL – Knowledge and References

Artificial Intelligence Software and Hardware Platforms

Published in Mazin Gilbert, Artificial Intelligence for Autonomous Networks, 2018

Rajesh Gadiyar, Tong Zhang, Ananth Sankaranarayanan

Intel Math Kernel Library (MKL) has highly optimized, threaded, and vectorized functions for dense and sparse linear algebra (BLAS, LAPACK, and PARDISO), Fast Fourier Transforms (FFTs), vector math, summary statistics, deep neural network (DNN) primitives, and more. MKL is distributed as binary libraries that programs written in higher-level languages can link against at build time to take full advantage of Intel silicon capabilities.

Few-View CT Image Reconstruction via Least-Squares Methods: Assessment and Optimization

Published in Nuclear Science and Engineering, 2023

Mónica Chillarón Pérez, Vicente E. Vidal, Gumersindo J. Verdú, Gregorio Quintana-Ortí

In this paper, a high-performance implementation of these algorithms has been developed. The implementation uses Intel’s Math Kernel Library (MKL), which provides BLAS and LAPACK routines. Three modifications have been made to the original algorithms proposed by their authors. First, we take an initial solution and use the residual matrix to compute its QR factorization in the initialization; the original algorithms do not take an initial solution. Second, we run a fixed number of iterations and return the resulting solution, ignoring the stopping criteria proposed by the authors in Refs. [18] and [19]. The number of iterations is fixed because the methods will be combined with the filtering and regularization techniques. Last, we have adapted the algorithms to work with rectangular matrices, whereas the original algorithms work only with square matrices. Requiring square matrices for high-resolution images would mean adjusting the number of projections so that the system matrix is square. For instance, for the image resolution considered and a scanner with 1024 detectors, we could solve the problem only with 64 projections if the matrix had to be square. By adapting the method
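The square-matrix restriction can be checked with a few lines of arithmetic. The sketch below assumes the usual CT layout of one matrix row per detector reading per projection and one column per image pixel; the excerpt does not state this layout. The resolution value is missing from the excerpt above, so a 256 × 256 image is used here only as an assumption, chosen because it is consistent with the quoted 1024 detectors and 64 projections. The function names are hypothetical.

```python
def system_shape(image_side, detectors, projections):
    """Return (rows, columns) of the system matrix.

    Assumed layout: one row per detector reading per projection,
    one column per pixel of a square image_side x image_side image.
    """
    rows = detectors * projections
    columns = image_side * image_side
    return rows, columns


def projections_for_square_system(image_side, detectors):
    """Number of projections that makes rows equal to columns."""
    return (image_side * image_side) / detectors


# Assumed resolution of 256 x 256 (not given in the excerpt).
square_case = system_shape(256, 1024, 64)
few_view_case = system_shape(256, 1024, 30)  # hypothetical projection count
required = projections_for_square_system(256, 1024)
```

Under these assumptions the square case needs exactly 64 projections, while any other projection count gives a rectangular system, which is the situation the adapted algorithms are meant to handle.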

Deep reinforcement learning-based antilock braking algorithm

Published in Vehicle System Dynamics, 2023

V. Krishna Teja Mantripragada, R. Krishna Kumar

The algorithms developed and tested so far run in a software-in-loop scenario on workstations. However, it is imperative to ensure real-time compatibility. For evaluation on target hardware, only the actor network needs to be deployed. Care has been taken to keep the neural networks in Figure 4(b) small, both to comply with hard real-time constraints and to eliminate the need for GPUs (graphics processing units). Modern embedded chips, coupled with automatic optimised code-generation tools and libraries, are capable of processing images in real time through complex neural network layers consisting of millions of parameters. The trained actor network presented in this article is converted to C code using MATLAB-Simulink Code Generator with the Intel MKL-DNN (Math Kernel Library for Deep Neural Networks) library. The generated code is deployed onto the hardware-in-loop setup at the Virtual Proving Ground facility, Indian Institute of Technology Madras. The facility, shown in Figure 11, consists of IPG Carmaker, a commercial vehicle dynamics simulation software package; IPG-Xpack4, a real-time hardware platform; and steering and pedal actuators. The deployed actor network takes 6 microseconds per inference on average during ABS operation.
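To illustrate why a small actor network is cheap to evaluate at inference time, the following is a minimal sketch of a feed-forward pass in plain Python. It is not the network from Figure 4(b): the layer sizes (4, 16, 16, 1), the ReLU and tanh activations, the placeholder weights, and all function names are assumptions made for illustration only.

```python
import math


def dense(weights, bias, x):
    """One fully connected layer: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def actor_forward(layers, state):
    """Forward pass: ReLU on hidden layers, tanh on the output layer."""
    h = state
    last = len(layers) - 1
    for i, (w, b) in enumerate(layers):
        h = dense(w, b, h)
        if i == last:
            h = [math.tanh(v) for v in h]   # bounded action
        else:
            h = [max(0.0, v) for v in h]    # ReLU
    return h


def make_layers(sizes):
    """Build deterministic placeholder weights (not trained values)."""
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        w = [[0.01 * ((i + j) % 7 - 3) for j in range(n_in)]
             for i in range(n_out)]
        layers.append((w, [0.0] * n_out))
    return layers


def count_parameters(layers):
    """Total number of weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


layers = make_layers([4, 16, 16, 1])             # hypothetical sizes
action = actor_forward(layers, [0.5, -0.2, 0.1, 0.9])
n_params = count_parameters(layers)
```

With these assumed sizes the network has only a few hundred parameters, so one inference amounts to a few hundred multiply-add operations, which is the kind of workload that generated C code can handle without a GPU.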

An approach to the analysis of hydroelastic vibrations of electromechanical systems, based on the solution of the non-classical eigenvalue problem

Published in Mechanics of Advanced Materials and Structures, 2021

Sergey V. Lekomtsev, Dmitrii A. Oshmarin, Natalya V. Sevodina

Equation (15) is formed and solved by the Mueller method using a program written in FORTRAN. All necessary matrices are loaded into it from binary files exported from the ANSYS software in the required way [23]. The block diagram of the algorithm is shown in Figure 2. The basic operations with sparse matrices, and the LU decomposition used to calculate the determinant in the Mueller method, were performed using the Intel Math Kernel Library with parallel computing support. The performance of the program and the reliability of the solution to the non-classical eigenvalue problem (15) were verified in [30]. That work compares the obtained natural vibration frequencies of a plate with a piezoelectric element located on a fluid layer with the values calculated using the ANSYS package. As an example, we performed calculations for the two limiting cases in the operation of the piezoelectric element (open circuit and short circuit), for which the relative discrepancy between the results did not exceed 0.02%.
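The combination described above, a root finder applied to a determinant obtained from an LU decomposition, can be sketched as follows. This is not the authors' FORTRAN program or their Equation (15): the 2 × 2 matrices K and M are hypothetical, a dense pure-Python elimination stands in for the sparse MKL routines, and the root finder is the textbook form of Muller's method (spelled "Mueller" in the excerpt).

```python
import cmath


def lu_determinant(a):
    """Determinant as the product of LU pivots (partial pivoting)."""
    n = len(a)
    m = [[complex(v) for v in row] for row in a]
    det = 1.0 + 0j
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(m[r][k]))
        if abs(m[p][k]) == 0.0:
            return 0j                      # singular matrix
        if p != k:
            m[k], m[p] = m[p], m[k]
            det = -det                     # a row swap flips the sign
        det *= m[k][k]
        for r in range(k + 1, n):
            factor = m[r][k] / m[k][k]
            for c in range(k + 1, n):
                m[r][c] -= factor * m[k][c]
    return det


def muller(f, x0, x1, x2, tol=1e-12, max_iter=100):
    """Muller's method: fit a parabola through three points, step to its root."""
    for _ in range(max_iter):
        f0, f1, f2 = f(x0), f(x1), f(x2)
        h1, h2 = x1 - x0, x2 - x1
        d1, d2 = (f1 - f0) / h1, (f2 - f1) / h2
        a = (d2 - d1) / (h2 + h1)
        b = a * h2 + d2
        disc = cmath.sqrt(b * b - 4 * a * f2)
        # Pick the larger denominator to avoid cancellation.
        denom = b + disc if abs(b + disc) >= abs(b - disc) else b - disc
        if denom == 0:
            return x2
        dx = -2 * f2 / denom
        x0, x1, x2 = x1, x2, x2 + dx
        if abs(dx) < tol * max(1.0, abs(x2)):
            return x2
    return x2


# Hypothetical test problem: det(K - lam * M) = 0 has roots 1 and 3.
K = [[2.0, -1.0], [-1.0, 2.0]]
M = [[1.0, 0.0], [0.0, 1.0]]


def characteristic(lam):
    return lu_determinant([[K[i][j] - lam * M[i][j] for j in range(2)]
                           for i in range(2)])


root = muller(characteristic, 0.0, 0.5, 1.2)
```

Because the method works with complex arithmetic throughout, it can follow roots into the complex plane, which is one reason it is a common choice when the eigenvalues of the problem are not guaranteed to be real.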