Multi-objective parametric optimization of wire electric discharge machining for Die Hard Steels using supervised machine learning techniques
Published in Rajeev Agrawal, J. Paulo Davim, Maria L. R. Varela, Monica Sharma, Industry 4.0 and Climate Change, 2023
Pratyush Bhatt, Pranav Taneja, Navriti Gupta
The MSE values for Ra and MRR produced by the ANN are 0.1425 and 0.2056, respectively. The error values are minimized (Figure 8.8), and the predicted values closely overlap with the experimental values (Figure 8.9). Thus, the ANN accurately models the WEDM process, even for such a small dataset. An ANN can model complex non-linear functions because of the weights and biases associated with its neurons and the activation functions of its layers. After the network is trained for a sufficient number of epochs, the optimizer can set the weights so as to closely map any continuous function, a capability established by the universal approximation theorem [27]. The accuracy of the model should further increase with a larger dataset, which would also help avoid possible overfitting.
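As a rough illustration of the kind of model described above, the sketch below fits a small feed-forward network to synthetic data with two outputs standing in for Ra and MRR. The process parameters, layer sizes, and solver settings are placeholders, not the chapter's actual configuration.

```python
# Hypothetical sketch: a small feed-forward ANN mapping WEDM process
# parameters (e.g. pulse-on time, pulse-off time, current, wire tension)
# to two responses (Ra, MRR). All values and settings are illustrative.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(size=(30, 4))                    # 30 experiments, 4 process parameters
Y = np.column_stack([                            # synthetic stand-in targets: [Ra, MRR]
    X @ np.array([0.5, -0.2, 0.8, 0.1]) + 0.05 * rng.normal(size=30),
    X @ np.array([0.1, 0.7, -0.3, 0.4]) + 0.05 * rng.normal(size=30),
])

model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(8, 8), activation="relu",
                 solver="lbfgs", max_iter=5000, random_state=0),
)
model.fit(X, Y)
pred = model.predict(X)
print("MSE per output (Ra, MRR):",
      mean_squared_error(Y, pred, multioutput="raw_values"))
```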
GEMNet II – An alternative method for grade estimation
Published in G. N. Panagiotou, T. N. Michalakopoulos, Mine Planning and Equipment Selection 2000, 2018
I.K. Kapageridis, B. Denby, D. Schofield
Before proceeding to the application of RBF networks to grade estimation, it is necessary to examine their architecture and general operation. RBFs were initially used for solving problems of real multivariate interpolation. Work on this subject has been extensively surveyed by Powell (1990). The theory of RBFs is one of the main fields of study in numerical analysis (Powell 1981). RBF networks are very simple structures. Their design is, in essence, a problem of curve fitting in a high-dimensional space: learning in RBF networks means finding the hyper-surface in multidimensional space that fits the training data in the best possible way. The universal approximation theorem for RBF networks, as stated by Park and Sandberg (1991), opened the way for their use in function approximation problems, which were commonly approached using multi-layered perceptrons. The work of Park and Sandberg (1991, 1993), Cybenko (1989), and Poggio and Girosi (1990) led to a new model for function approximation based on generalised RBF networks. Specifically, the theorem can be stated as below:
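A minimal sketch of the curve-fitting view of RBF networks described above, assuming a Gaussian basis function and one centre per training sample (the exact-interpolation setting). The coordinates, grades, and kernel width below are illustrative stand-ins, not data or settings from the study.

```python
# Minimal Gaussian-RBF interpolation sketch: one basis function per sample,
# output weights obtained by solving the interpolation system. Illustrative only.
import numpy as np

def rbf_design(X, centres, width):
    """Gaussian RBF design matrix: phi_ij = exp(-||x_i - c_j||^2 / (2 w^2))."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * width ** 2))

rng = np.random.default_rng(1)
X_train = rng.uniform(size=(25, 3))              # toy sample coordinates
y_train = np.sin(X_train.sum(axis=1))            # stand-in for assay grades

width = 0.5
Phi = rbf_design(X_train, X_train, width)        # one centre per training sample
weights = np.linalg.solve(Phi + 1e-8 * np.eye(len(X_train)), y_train)

X_new = rng.uniform(size=(5, 3))                 # unsampled locations
y_hat = rbf_design(X_new, X_train, width) @ weights
print(y_hat)
```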
Machine Learning - A gentle introduction
Published in Nailong Zhang, A Tour of Data Science, 2020
The universal approximation theorem says that a neural network with a single hidden layer can approximate any continuous function ($\mathbb{R}^n \to \mathbb{R}$) given a sufficient number of neurons, under mild assumptions on the activation function (for example, the sigmoid activation function) [1]. There are also other universal approximators, such as decision trees.
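The toy below illustrates the flavour of the statement numerically; it is not the theorem's construction, and not how networks are normally trained. The hidden weights of a single sigmoid layer are drawn at random and only the output layer is fitted by least squares, yet the approximation error on a fixed continuous function typically shrinks as the number of hidden neurons grows.

```python
# Illustration only: single hidden layer of sigmoid units, random hidden
# weights, least-squares output layer. Approximation of a continuous
# function f: R -> R generally improves with width.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 400).reshape(-1, 1)
f = np.sin(2 * x) + 0.3 * x ** 2                 # target continuous function

for n_hidden in (2, 8, 32, 128):
    W = rng.normal(scale=3.0, size=(1, n_hidden))   # random hidden weights
    b = rng.normal(scale=3.0, size=n_hidden)        # random hidden biases
    H = sigmoid(x @ W + b)                           # hidden activations
    beta, *_ = np.linalg.lstsq(H, f, rcond=None)     # fit output layer only
    mse = np.mean((H @ beta - f) ** 2)
    print(f"{n_hidden:4d} hidden neurons -> MSE {mse:.5f}")
```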
Advanced processing of 3D printed biocomposite materials using artificial intelligence
Published in Materials and Manufacturing Processes, 2022
Deepak Verma, Yu Dong, Mohit Sharma, Arun Kumar Chaudhary
In 1991, Hornik and coworkers [88] proposed the theorem known as the universal approximation theorem (UAT), which proved that a feed-forward neural network with one hidden layer and a finite number of neurons can estimate any continuous function on a compact subset. In practice, however, learning with a single hidden layer can be difficult, because the required network width may become exponentially large; for this reason, other feedforward neural networks (FFNNs) have more than one hidden layer. More interestingly, the UAT also applies to FFNNs with numerous hidden layers and a bounded number of hidden neurons [89]. Owing to these limitations of shallow networks, D-FFNNs have come into use for practical learnability.
Leveraging machine learning for predicting human body model response in restraint design simulations
Published in Computer Methods in Biomechanics and Biomedical Engineering, 2021
Hamed Joodaki, Bronislaw Gepner, Jason Kerrigan
An NN is an interconnected group of nodes designed to recognize a pattern (Zou et al. 2008). It consists of an input layer, a group of hidden layers, and an output layer. Each of these layers has one or more nodes, which are also called artificial neurons. The independent variables (restraint parameters in this study) form the nodes of the input layer. The nodes of each layer are related to the nodes of the previous layer through a transfer function (e.g. linear, logistic, Gaussian, etc.). Thus, the nodes of the first hidden layer are a function of the input layer, the nodes of the second hidden layer are a function of the first hidden layer, and so on. The networks developed in this study had only one hidden layer. Finally, the output (LYL in this study) is a function (linear in this study) of the last hidden layer (Figure 3(a)). The weight vectors of the network are tuned to minimize the loss function through a training algorithm. According to the Universal Approximation Theorem, a network with a single hidden layer is sufficient to represent any continuous function (Park and Sandberg 1991). Thus, this technique could potentially be suitable for predicting the results of restraint design simulations, which were expected to be non-linear.
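A hedged stand-in for the kind of surrogate described above: a single hidden layer with a logistic transfer function and a linear output mapping design parameters to a scalar response. The study's actual software, layer width, and restraint data are not reproduced here; everything below is a synthetic placeholder.

```python
# Illustrative surrogate: restraint parameters -> scalar response (LYL).
# One hidden layer, logistic transfer function, linear output; all data
# and hyperparameters are synthetic placeholders.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(7)
X = rng.uniform(size=(60, 5))                    # 5 hypothetical restraint parameters
y = (np.sin(3 * X[:, 0]) + X[:, 1] * X[:, 2]     # non-linear toy response
     - X[:, 3] ** 2 + 0.01 * rng.normal(size=60))

surrogate = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(16,),       # single hidden layer
                 activation="logistic",          # sigmoid transfer function
                 solver="lbfgs", max_iter=5000, random_state=0),
)
surrogate.fit(X, y)                              # tune weights to minimize the loss
print(surrogate.predict(X[:3]))                  # predicted response for 3 designs
```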
Machine learning for collective variable discovery and enhanced sampling in biomolecular simulation
Published in Molecular Physics, 2020
Hythem Sidky, Wei Chen, Andrew L. Ferguson
Artificial neural networks (ANNs) are collections of activation functions, or neurons, which are composed together into layers in order to approximate a given function of interest [112]. Their utility and power can be largely attributed to the universal approximation theorem [113,114], which states that, under mild assumptions, there exists a finite-size neural network that is capable of approximating any continuous function to arbitrary precision. In a fully-connected ANN, the neurons in each layer take as their inputs the outputs from the previous layer, apply a nonlinear activation function, and pass on their outputs to the next layer. A schematic diagram of a three-layer feed-forward fully-connected neural network is shown in Figure 1. Mathematically, the output from neuron i of fully connected layer k is given by $y_i^{(k)} = \sigma\left(\sum_j w_{ij}^{(k)} y_j^{(k-1)} + b_i^{(k)}\right)$, where $w_{ij}^{(k)}$ and $b_i^{(k)}$ define the layer weights and biases, respectively. The activation function $\sigma$ is an arbitrary nonlinear function but is often taken to be $\tanh$ or some form of rectified linear unit (ReLU) and is applied element-wise to its input. ANNs are typically trained by minimising an objective function (also called loss function) using some variant of stochastic gradient descent through a process known as backpropagation [115–117].
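A compact sketch of the forward pass and backpropagation just described, assuming tanh hidden activations, a linear output layer, and a mean-squared-error objective; the layer sizes, learning rate, and toy target are illustrative choices rather than settings from the paper.

```python
# Minimal numpy sketch: fully-connected network with tanh hidden layers,
# trained by mini-batch stochastic gradient descent via backpropagation.
# All sizes, rates, and the toy target are illustrative.
import numpy as np

rng = np.random.default_rng(0)
sizes = [2, 16, 16, 1]                                    # input, two hidden, output
W = [rng.normal(scale=1 / np.sqrt(m), size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
b = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    """Return the activations of every layer: y^(k) = tanh(W^(k) y^(k-1) + b^(k))."""
    ys = [x]
    for k, (Wk, bk) in enumerate(zip(W, b)):
        z = ys[-1] @ Wk + bk
        ys.append(z if k == len(W) - 1 else np.tanh(z))   # linear output layer
    return ys

X = rng.uniform(-1, 1, size=(256, 2))
t = (np.sin(3 * X[:, 0]) * X[:, 1]).reshape(-1, 1)        # toy regression target

lr, batch = 0.05, 32
for step in range(2000):
    idx = rng.choice(len(X), size=batch, replace=False)
    ys = forward(X[idx])
    delta = 2 * (ys[-1] - t[idx]) / batch                 # dL/dz at the output (MSE loss)
    for k in reversed(range(len(W))):                     # backpropagate layer by layer
        gW, gb = ys[k].T @ delta, delta.sum(axis=0)
        if k > 0:
            delta = (delta @ W[k].T) * (1 - ys[k] ** 2)   # tanh'(z) = 1 - tanh(z)^2
        W[k] -= lr * gW                                   # SGD weight update
        b[k] -= lr * gb
print("final MSE:", np.mean((forward(X)[-1] - t) ** 2))
```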