Machine learning – Knowledge and References

A Review on the Different Regression Analysis in Supervised Learning

Published in K Hemachandran, Shubham Tayal, Preetha Mary George, Parveen Singla, Utku Kose, Bayesian Reasoning and Gaussian Processes for Machine Learning Applications, 2022

K Sudhaman, Mahesh Akuthota, Sandip Kumar Chaurasiya

In polynomial regression, the relationship between the dependent variable (y) and the predictor variable (x) is modelled as an nth-degree polynomial. It handles a nonlinear data set using a model that is still linear in its coefficients, which is why it can be treated as a form of multiple linear regression. When a linear regression model is unable to capture the pattern of a nonlinear data set, underfitting arises: an underfit machine learning model performs poorly even on the training data. To avoid this, polynomial regression is used for nonlinear data, since it fits the nonlinear relationship between the independent variable x and the dependent (target) variable y more precisely (“Machine Learning Polynomial Regression - Javatpoint” 2020) (Figure 2.8).
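
The point that a polynomial fit is still a linear model can be illustrated with a minimal sketch. It is not taken from the chapter: the function names, the synthetic noise-free quadratic data, and the choice of solving the normal equations by Gaussian elimination are all assumptions made for illustration.

```python
def design_matrix(xs, degree):
    """Rows [1, x, x^2, ..., x^degree]; the model is linear in these features."""
    return [[x ** d for d in range(degree + 1)] for x in xs]


def solve(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w


def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients from the normal equations (X^T X) w = X^T y."""
    X = design_matrix(xs, degree)
    k = degree + 1
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(k)]
    return solve(xtx, xty)


def predict(coeffs, x):
    return sum(c * x ** d for d, c in enumerate(coeffs))


def mse(coeffs, xs, ys):
    return sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic data (assumption): y = 1 + 2x + 3x^2 with no noise.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]

line = fit_polynomial(xs, ys, degree=1)  # straight line: underfits the curve
quad = fit_polynomial(xs, ys, degree=2)  # degree-2 polynomial: matches the data

print("line MSE:", mse(line, xs, ys))
print("quad MSE:", mse(quad, xs, ys))
```

Both fits use the same linear least-squares routine; only the feature columns differ. On this data the straight line leaves a large training error, while the degree-2 fit recovers the generating coefficients.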

Swarm Intelligence and Machine Learning Algorithms for Cancer Diagnosis

Published in Shikha Agrawal, Manish Gupta, Jitendra Agrawal, Dac-Nhuong Le, Kamlesh Kumar Gupta, Swarm Intelligence and Machine Learning, 2022

Pankaj Sharma, Vinay Jain, Mukul Tailang

Supervised learning, unsupervised learning, and reinforcement learning are the three types of machine learning. In supervised learning, the data fed into the system are linked to a known outcome. These tasks are generally either regression or classification, such as estimating the chance that a malignancy will progress to a specific care [19]. The data are separated into two sets: training and testing. The former is used to build a mathematical model, while the latter is used to assess how well that model generalizes. Unsupervised learning is useful when no specific outcome is known and when investigators are looking for new patterns in data [20–21]. Domain knowledge helps in assessing how informative or interesting the output of an unsupervised model is. Both unsupervised and supervised techniques need testing on additional data to assess their generalizability [20]. Reinforcement learning is a form of learning that emphasizes training a machine to control itself in order to achieve a long-term objective by optimizing a quantitative performance measure. In reinforcement learning, unlike in supervised learning, the learner receives only partial feedback on its predictions. Additionally, predictions may have long-term effects by influencing the future state of the controlled system. As a consequence, time plays an important role. The goal of reinforcement learning is to develop efficient learning algorithms while understanding their advantages and limitations [21].
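
The split into two cohorts described above, one used to build the model and one used to check how well it generalizes, can be sketched as follows. This is an illustrative example only: the function names, the 25% hold-out fraction, the random seeds, and the synthetic linear data are assumptions, not details from the chapter.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(rows):
    """Ordinary least squares for y = a + b*x on (x, y) pairs."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def mean_squared_error(model, rows):
    a, b = model
    return sum((a + b * x - y) ** 2 for x, y in rows) / len(rows)


# Synthetic data (assumption): y = 3 + 0.5x plus small Gaussian noise.
noise = random.Random(1)
data = [(float(x), 3.0 + 0.5 * x + noise.gauss(0, 0.2)) for x in range(40)]

train, test = train_test_split(data)
model = fit_line(train)                      # built on the training set only
train_mse = mean_squared_error(model, train)
test_mse = mean_squared_error(model, test)   # estimate of generalization error

print("train MSE:", train_mse)
print("test MSE:", test_mse)
```

The test rows never influence the fitted coefficients, so the error measured on them gives an indication of how the model might behave on unseen data.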

Machine Learning in Acoustic DSP

Published in Francis F. Li, Trevor J. Cox, Digital Signal Processing in Audio and Acoustical Engineering, 2019

Francis F. Li, Trevor J. Cox

In contrast, under-fitting means that a statistical model does not adequately represent the underlying structure of the data. Under-fitting often occurs when a model lacks adequate representation capability, is missing parameters, or has incorrect parameters. A typical example is the use of a straight line to fit data points showing an exponential decay process. Under-fitting can occur in supervised machine learning for a number of reasons, including inappropriate machine learning methods or models, an insufficient number of variables, under-training, and too small a training set. In addition, unsuitable step sizes and optimisation algorithms are also likely to cause under- or over-fitting problems in supervised learning.
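
The straight-line example can be made concrete with a short sketch. The data are synthetic and noise-free (y = exp(-x)), and the comparison model, a line fitted to log(y), is one simple way to give the model the right functional form; both choices are assumptions made for illustration rather than the authors' method.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Synthetic exponential decay sampled on [0, 5] (assumption).
xs = [0.5 * i for i in range(11)]
ys = [math.exp(-x) for x in xs]

# Under-fitting model: a straight line cannot follow the curvature.
a, b = fit_line(xs, ys)
line_mse = sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Model with adequate form: fit a line to log(y), i.e. y = exp(log_a + log_b * x).
log_a, log_b = fit_line(xs, [math.log(y) for y in ys])
exp_mse = sum(
    (math.exp(log_a + log_b * x) - y) ** 2 for x, y in zip(xs, ys)
) / len(xs)

print("straight-line MSE:", line_mse)
print("exponential-model MSE:", exp_mse)
```

The straight line leaves a systematic residual even on the data it was fitted to, which is the signature of under-fitting, whereas the exponential form reproduces the decay almost exactly.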

A survey of the opportunities and challenges of supervised machine learning in maritime risk analysis

Published in Transport Reviews, 2023

Andrew Rawson, Mario Brito

A growing area of interest in transportation research is the application of machine learning methods (Wen, Xie, Jiang, Pu, & Ge, 2021). Machine learning can be described as a subset of artificial intelligence whereby computers make predictions or decisions without being explicitly programmed to perform that task. These models can be supervised, whereby the model is constructed on data containing both inputs and outputs, or unsupervised, whereby structure is sought in unlabelled data. Furthermore, machine learning tasks may involve regression, predicting continuous numeric values, or classification, determining classes or types for data points. There are many applications of supervised machine learning in the maritime domain. These include anomaly detection of vessel transits (Riveiro, Pallotta, & Vespe, 2018), visual identification of vessels from imagery or sensors (Chen et al., 2018), prediction of fuel consumption and ship efficiency (Uyanik, Karatug, & Arslanoglu, 2020), and path planning of autonomous vessels (Chen, Chen, Ma, Zeng, & Wang, 2019), amongst others. Many of these applications look towards a possible future of autonomous shipping, supporting the technological requirements necessary for this concept to develop.

A comparative study between PCR, PLSR, and LW-PLS on the predictive performance at different data splitting ratios

Published in Chemical Engineering Communications, 2022

Teck Fu Thien, Wan Sieng Yeo

Parametric and non-parametric algorithms differ in that the latter do not require the estimation of distribution parameters such as mean and standard deviation to obtain an algorithm (Scheff 2016; King and Eckersley 2019). Non-parametric models are generally less powerful due to the lack of supporting evidence when making conclusions on the target function (Scheff 2016). Building a model depends not only on the assumptions placed upon it but also on the type of learning method used. Types of machine learning include supervised, unsupervised, semi-supervised and reinforcement learning. In supervised learning, the machine learns the target function to determine a correlation between known input and output variables. Supervised learning algorithms can be subdivided into regression and classification tasks depending on the objectives (Haimi et al. 2013; Vieira et al. 2020). The difference between regression and classification is that the former predicts continuous values while the latter categorizes input data. Examples of supervised learning techniques include multiple linear regression, PCR, PLSR, and LW-PLS (Ambika 2019; Chiplunkar and Huang 2019; Thomas 2019; Ibrahim et al. 2020).

A distributed unsupervised learning algorithm and its suitability to physical based observation

Published in International Journal of Parallel, Emergent and Distributed Systems, 2022

Radek Hes, Giacomo Gioroli

Unsupervised learning techniques are the set of machine learning algorithms useful when little or no labelled data is provided. Conventional techniques, including K-Means [1] and variant algorithms [2,3], are confined to a predetermined number of classes in the dataset, which is unsuitable for investigatory or ‘self-driven’ learning. Techniques such as Density Based Spatial Clustering (DBScan) [4] relax this constraint but fail to resolve clusters where class boundaries are ill-defined. Such failures may be caused by the merging of multiple classes, no matter how sparse the overlapping tails, and outliers can lead to clustering where no conglomeration should exist. With either kind of existing technique, clustering on large data sets becomes un-computable or not readily feasible because of the tight coupling of the information required by the algorithms. Computational complexity, storage, memory requirements and distributability of algorithms become bottlenecks limiting the usefulness of any learning technique in large data applications.
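
The constraint of a predetermined number of classes can be seen in a minimal K-Means sketch (Lloyd's algorithm). This is not the authors' algorithm: the function name, the naive initialisation from the first k points, the fixed iteration count, and the six toy points are assumptions chosen so the behaviour is easy to follow.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on 2-D points; k must be supplied in advance."""
    centroids = list(points[:k])  # naive deterministic initialisation (assumption)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c
            else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Toy data (assumption): two well-separated groups of three points each.
points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]

centroids2, clusters2 = kmeans(points, k=2)
centroids3, clusters3 = kmeans(points, k=3)

print("k=2 cluster sizes:", [len(c) for c in clusters2])
print("k=3 cluster sizes:", [len(c) for c in clusters3])
```

The algorithm returns exactly k clusters whatever structure the data actually has: with k=3 one of the two natural groups is split, which is why a fixed k sits poorly with exploratory learning.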