Machine Learning for Disease Classification: A Perspective
Published in Kayvan Najarian, Delaram Kahrobaei, Enrique Domínguez, Reza Soroushmehr, Artificial Intelligence in Healthcare and Medicine, 2022
Jonathan Parkinson, Jonalyn H. DeCastro, Brett Goldsmith, Kiana Aran
When equipped with an appropriate kernel function, kernel methods are flexible enough to “learn” and approximate nearly any relationship present in the training set. But this flexibility leaves them prone to overfitting, which must be avoided through careful hyperparameter tuning. For Gaussian process regression, good hyperparameters can be “learned” from the data during training (Rasmussen & Williams, 2006); for the other algorithms, hyperparameters must generally be tuned through cross-validation. The performance of kernel methods also depends heavily on choosing the right kernel, and for many types of data that choice is far from obvious. It is not immediately clear, for example, what kernel function is most appropriate for assessing the similarity of two chest x-ray images or of two patient histories. Finally, kernel methods scale poorly with dataset size. A straightforward implementation has a training time complexity of O(N³); in other words, the number of calculations required to fit the model increases in proportion to the cube of the number of datapoints. This limitation arises from the need to build an N-by-N covariance matrix for N datapoints and compute its Cholesky decomposition, and it has traditionally restricted the application of kernel methods to datasets containing at most a few tens of thousands of datapoints (Titsias, 2009). And because a kernel method must assess the similarity between a new datapoint and every datapoint in its training set before making a prediction, prediction time also grows with the size of the training set.
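The scaling behavior described above can be seen in a minimal exact GP regression sketch. The NumPy/SciPy code below is illustrative only; the squared-exponential kernel and its hyperparameters are arbitrary choices, not ones drawn from the chapter. Training builds the N-by-N covariance matrix and factorizes it (the O(N³) step), while prediction must evaluate the kernel between each new point and every training point.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import cho_factor, cho_solve

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel; the choice of kernel is illustrative."""
    sq_dists = cdist(A, B, "sqeuclidean")
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

def fit_gp(X_train, y_train, noise=1e-2):
    # Training cost is dominated by building the N-by-N covariance matrix
    # and computing its Cholesky decomposition, which scales as O(N^3).
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    chol = cho_factor(K, lower=True)
    return cho_solve(chol, y_train)          # alpha = K^{-1} y

def predict_gp(X_test, X_train, alpha):
    # Prediction requires comparing each new point with *all* training
    # points, so prediction time grows with the training-set size.
    K_star = rbf_kernel(X_test, X_train)
    return K_star @ alpha

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(200)
alpha = fit_gp(X, y)
X_new = np.linspace(-3, 3, 50).reshape(-1, 1)
mean = predict_gp(X_new, X, alpha)
```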
Numerical evaluation of fragility curves for earthquake liquefaction induced settlements of a levee using Gaussian Processes
Published in António S. Cardoso, José L. Borges, Pedro A. Costa, António T. Gomes, José C. Marques, Castorina S. Vieira, Numerical Methods in Geotechnical Engineering IX, 2018
A meta-model is an analytical function used to provide rapid approximations of a more expensive model (e.g. an analytical model or a finite element numerical model). With a Gaussian process (GP), the input values and responses are combined statistically to build a functional relationship in a non-intrusive way (i.e. the original model is treated as a black box). One of the advantages of Gaussian processes is that they are flexible enough to represent a wide variety of complex models using a limited number of parameters (Sacks et al. 1989).
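As a rough illustration of the meta-model idea, the following sketch fits a GP surrogate to a handful of evaluations of a stand-in `expensive_model` function (a hypothetical placeholder for, say, a finite element run) using scikit-learn, and then queries the surrogate cheaply. The kernel and design points are assumptions for illustration only.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def expensive_model(x):
    """Hypothetical placeholder for a costly simulation run."""
    return np.sin(3 * x) + 0.5 * x**2

# A small design of experiments: only a handful of expensive evaluations.
X_design = np.linspace(0.0, 2.0, 8).reshape(-1, 1)
y_design = expensive_model(X_design).ravel()

# The GP meta-model treats the original model as a black box: only the
# input/response pairs are used, never the model's internals.
kernel = ConstantKernel(1.0) * RBF(length_scale=0.5)
surrogate = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=5)
surrogate.fit(X_design, y_design)

# Rapid approximations (with uncertainty) anywhere in the input domain.
X_query = np.linspace(0.0, 2.0, 200).reshape(-1, 1)
mean, std = surrogate.predict(X_query, return_std=True)
```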
Probabilistic nonlinear dimensionality reduction through gaussian process latent variable models: An overview
Published in Arun Kumar Sinha, John Pradeep Darsy, Computer-Aided Developments: Electronics and Communication, 2019
The element in the ith row and jth column of K is given by the prior distribution (36). Thus, the marginal likelihood of dual probabilistic PCA is a product of d independent Gaussian processes. The covariance function of a Gaussian process describes the properties of the functions it generates, such as their variability. Learning in Gaussian processes amounts to determining the hyperparameters of a covariance function suited to the problem being modelled.
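To illustrate how covariance-function hyperparameters govern properties such as variability, the sketch below draws sample functions from a zero-mean GP prior with a squared-exponential covariance at two lengthscales; the kernel form and hyperparameter values are illustrative assumptions, not those used in the chapter.

```python
import numpy as np

def rbf_cov(x, lengthscale, variance=1.0):
    """Squared-exponential covariance; its hyperparameters control the
    variability of functions drawn from the GP prior."""
    d = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

x = np.linspace(0, 5, 100)
rng = np.random.default_rng(1)

# A short lengthscale yields rapidly varying sample functions,
# a long lengthscale yields smooth, slowly varying ones.
for lengthscale in (0.2, 2.0):
    K = rbf_cov(x, lengthscale) + 1e-8 * np.eye(len(x))  # jitter for stability
    samples = rng.multivariate_normal(np.zeros(len(x)), K, size=3)
    # samples.shape == (3, 100): three draws from the GP prior
```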
Hierarchical active learning for defect localization in 3D systems
Published in IISE Transactions on Healthcare Systems Engineering, 2023
The Gaussian process (GP), a powerful and flexible probabilistic framework, is widely used for predictive modeling and uncertainty quantification. However, large-scale GP training requires calculating the inverse and determinant of a high-dimensional kernel matrix at each training iteration, which imposes a significant computational burden. Different advanced techniques have been developed to cope with this computational complexity. For example, Wang et al. (2019) relieved the constraint using multi-GPU parallelization and linear conjugate gradients, enabling large-scale GP training with a million training points. Chen et al. (2020) further provided a convergence guarantee for mini-batch stochastic gradient descent in large-scale GP training. While these methods do address the scalability issue of GPs, they are designed for situations where a large number of observations is readily available. In simulation calibration, by contrast, collecting training data requires repeatedly evaluating the simulation model, so it is computationally infeasible to obtain numerous observations from an expensive simulation that involves complex 3D geometry.
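The per-iteration cost referred to here can be made concrete with a sketch of the exact GP training objective: every evaluation of the negative log marginal likelihood requires a solve against the N-by-N kernel matrix and its log-determinant. The code below is a generic illustration (squared-exponential kernel, log-parameterized hyperparameters), not the method of any paper cited in the passage.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def neg_log_marginal_likelihood(theta, X, y):
    """Exact GP objective evaluated once per training iteration.

    Each evaluation must (implicitly) invert the N-by-N kernel matrix and
    compute its log-determinant via a Cholesky factorization, an O(N^3)
    operation that dominates large-scale GP training."""
    lengthscale, variance, noise = np.exp(theta)      # positivity via log-params
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = variance * np.exp(-0.5 * sq_dists / lengthscale**2)
    K += noise * np.eye(len(X))

    chol, lower = cho_factor(K, lower=True)
    alpha = cho_solve((chol, lower), y)               # solves K^{-1} y
    logdet = 2.0 * np.sum(np.log(np.diag(chol)))      # log|K| from Cholesky
    return 0.5 * (y @ alpha + logdet + len(X) * np.log(2 * np.pi))
```

In practice this function would be handed to a generic optimizer such as scipy.optimize.minimize, and it is precisely this repeated O(N³) factorization that the multi-GPU and mini-batch approaches mentioned above are designed to avoid.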
Magnitude Type Conversion Models for Earthquakes in Turkey and Its Vicinity with Machine Learning Algorithms
Published in Journal of Earthquake Engineering, 2023
Kaan Hakan Coban, Nilgun Sayil
where P(w|y,X) is the posterior distribution, obtained by updating the prior distribution with the observed data. In this model, various kernel (covariance) functions can be applied, such as the exponential, squared exponential, rational quadratic, and Matérn 5/2 kernels. The advantage of the Gaussian process regression model is that it directly quantifies the model uncertainty, and prior information about the expected shape of the model can be incorporated through the choice of kernel. Its most significant disadvantage is the long computation time. Gaussian processes are frequently applied to regression and especially to prediction problems, for instance, predictive soil mapping or learning a mapping of a robotic arm's state from its positions.
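For reference, the kernels named above can be written as simple functions of the distance r between two inputs. The sketch below gives their standard stationary forms; the hyperparameters (signal variance, lengthscale, and the rational-quadratic shape parameter) are left as illustrative defaults rather than values from the article.

```python
import numpy as np

# Stationary covariance functions, written as functions of r = |x - x'|;
# sigma2 (signal variance) and l (lengthscale) are hyperparameters.
def exponential(r, sigma2=1.0, l=1.0):
    return sigma2 * np.exp(-r / l)

def squared_exponential(r, sigma2=1.0, l=1.0):
    return sigma2 * np.exp(-0.5 * (r / l) ** 2)

def rational_quadratic(r, sigma2=1.0, l=1.0, alpha=1.0):
    return sigma2 * (1.0 + r**2 / (2.0 * alpha * l**2)) ** (-alpha)

def matern_52(r, sigma2=1.0, l=1.0):
    c = np.sqrt(5.0) * r / l
    return sigma2 * (1.0 + c + c**2 / 3.0) * np.exp(-c)
```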
Autonomous materials discovery and manufacturing (AMDM): A review and perspectives
Published in IISE Transactions, 2023
The introduction of Gaussian Process Regression (GPR) in spatial statistics (Cressie, 1991) and its subsequent adoption to model relationships arising from computer experiments (Sacks et al., 1989; Santner et al., 2013) brought a paradigm shift. Gaussian process models are nonparametric in nature, and they provide a great degree of flexibility and adaptivity in modeling complex response surfaces. Subsequently, GPR has emerged as a popular general-purpose method for broad machine learning applications (Williams and Rasmussen, 2006). A GPR model offers flexibility, numerical stability, and the capability for uncertainty quantification (Matheron, 1963; McKay et al., 1979; Shewry and Wynn, 1987; Sacks et al., 1989; Fang et al., 2005; Williams and Rasmussen, 2006; Santner et al., 2013; Molina et al., 2015; Sun et al., 2019). More pertinently, GPR models combined with Bayesian Optimization (BO) techniques enable simultaneous learning of the complex functional relationship and searching for the best design/process (input) settings (Mockus, 2012) through a sequential experimentation strategy. Every BO algorithm defines a sequential sampling procedure that successively generates new input points by optimizing an acquisition function (an objective function that quantifies the MDS search and materials discovery goal) with a specified acquisition search strategy, as noted earlier.
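A minimal sketch of such a BO loop is given below, assuming a generic expected-improvement acquisition function, a scikit-learn GP surrogate with a Matérn kernel, and a hypothetical `objective` placeholder standing in for the expensive experiment or simulation; none of these choices are taken from the article itself.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def expected_improvement(X_cand, gp, y_best, xi=0.01):
    """Expected-improvement acquisition function (for minimization)."""
    mu, std = gp.predict(X_cand, return_std=True)
    std = np.maximum(std, 1e-12)
    z = (y_best - mu - xi) / std
    return (y_best - mu - xi) * norm.cdf(z) + std * norm.pdf(z)

def objective(x):
    """Hypothetical placeholder for an expensive experiment or simulation."""
    return np.sin(3 * x) + 0.3 * x**2

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(4, 1))            # small initial design
y = objective(X).ravel()
candidates = np.linspace(-2, 2, 500).reshape(-1, 1)

for _ in range(15):                            # sequential experimentation loop
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(X, y)
    ei = expected_improvement(candidates, gp, y.min())
    x_next = candidates[np.argmax(ei)].reshape(1, -1)   # acquisition search
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next).ravel())
```

Each pass through the loop refits the surrogate, scores a candidate grid with the acquisition function, and runs the expensive objective only at the single most promising input, which is the sequential sampling behavior described above.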