Unsupervised Learning Methods as Tools for Discovering Relationships Within Data
Published in Durgesh Kumar Mishra, Nilanjan Dey, Bharat Singh Deora, Amit Joshi, ICT for Competitive Strategies, 2020
Salim Qureshi, Shafalika Vijayal
When categorical variables are present, they are expanded into a set of indicator variables, one for each possible value. A linear regression has the form y = constant term + a linear combination of all the variables, where each term in the combination is a coefficient bi multiplied by the value of the corresponding variable. The problem is to solve for the bi, and the standard method is Ordinary Least Squares (OLS), which requires a matrix inversion. The solution requires storage that grows as the square of the number of variables, and the complexity of the computation increases as the number of variables increases.
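The setup above can be sketched in NumPy on toy data (all values here are illustrative): a categorical variable is expanded into indicator columns, a constant column is added, and the coefficients are found by least squares. Note that the design matrix X^T X grows as the square of the number of columns, which is the storage cost mentioned above.

```python
import numpy as np

# Toy data: one numeric variable and one categorical variable with three levels.
x_num = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
cat = np.array(["a", "b", "c", "a", "b", "c"])

# Expand the categorical variable into indicator columns, one per value.
levels = sorted(set(cat))
indicators = np.column_stack([(cat == lvl).astype(float) for lvl in levels])

# Design matrix: constant term, numeric variable, then the indicators.
X = np.column_stack([np.ones(len(x_num)), x_num, indicators])

y = np.array([2.0, 4.1, 5.9, 8.2, 10.0, 12.1])

# OLS solves the normal equations (X^T X) b = X^T y; lstsq also handles the
# rank deficiency introduced by keeping the full set of indicator columns.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ b
```

Using `lstsq` rather than an explicit inverse of X^T X is the usual numerical choice, but the storage and complexity argument is the same.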
Building Neural Networks with Weka
Published in Richard J. Roiger, Data Mining, 2017
Figure 9.8 also shows an additional weighted link associated with each node. The weights are the rounded threshold values shown in Figure 9.7. The threshold serves a purpose similar to the constant term in a regression equation and functions as another connection weight between two nodes. The 1 at one end of the link represents a node whose output is always 1. The threshold values change during the learning process in the same manner as all other network weights.
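The threshold-as-weight idea can be written out directly. A minimal sketch, assuming a sigmoid activation as in a typical multilayer perceptron (the numbers are illustrative, not taken from Figure 9.7): the threshold is just one more weight whose input node always outputs 1.

```python
import math

def node_output(inputs, weights, threshold):
    # The threshold acts as another connection weight whose source node
    # always outputs 1, so it is simply added into the weighted sum.
    s = sum(w * x for w, x in zip(weights, inputs)) + threshold * 1.0
    return 1.0 / (1.0 + math.exp(-s))  # sigmoid activation

out = node_output([0.5, -0.2], [0.8, 0.4], threshold=0.1)
```

Because the threshold enters the sum exactly like the other weights, the learning procedure can update it with the same rule it uses for every other connection weight.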
Aviation Forecasting and Regression Analysis
Published in Bijan Vasigh, Ken Fleming, Thomas Tacker, Introduction to Air Transport Economics, 2018
Bijan Vasigh, Ken Fleming, Thomas Tacker
The third major table contained in all regression output is a table of coefficients. This is displayed for the demand forecast from Orlando to Los Angeles in Table 10.23. The coefficients table allows the researcher to construct a linear equation that can be used for forecasting, and it also determines whether the individual variables are statistically significant. The first column of the coefficients table lists all the independent variables used in the analysis, plus the constant. The constant term is usually interpreted as the value of the dependent variable when all the other independent variables are set to zero. Columns 2 and 4 both display values for the coefficients. The standardized values (column 4) are generally used to compare the relative sizes of the impacts of the independent variables on the dependent variable. This comparison is possible because they are calculated in standardized units: the standardized coefficient is the unstandardized coefficient multiplied by the ratio of the standard deviation of the independent variable to the standard deviation of the dependent variable. Therefore, a standardized coefficient of 1.14, such as the one for GDP, means that a 1.0 standard deviation change in the independent variable will lead to a 1.14 standard deviation change in the dependent variable. Similar interpretations apply to the other standardized coefficients. Since the unstandardized values are the coefficients that apply directly to actual values, however, it is the unstandardized beta values that are used in the forecast equation. As a final step prior to forming a demand equation, each independent variable needs to be tested to see whether it is statistically significant.
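The relationship between unstandardized and standardized coefficients can be checked numerically. A minimal sketch on synthetic data (the variable names `gdp` and `fare` and all values are illustrative, not the Orlando-Los Angeles data from Table 10.23):

```python
import numpy as np

# Synthetic demand data: demand = 5 + 2*gdp - 0.5*fare + noise.
rng = np.random.default_rng(0)
gdp = rng.normal(100.0, 10.0, 200)
fare = rng.normal(300.0, 30.0, 200)
demand = 5.0 + 2.0 * gdp - 0.5 * fare + rng.normal(0.0, 1.0, 200)

# Fit by least squares; column 0 is the constant term.
X = np.column_stack([np.ones_like(gdp), gdp, fare])
b, *_ = np.linalg.lstsq(X, demand, rcond=None)  # unstandardized coefficients

# Standardized coefficient = unstandardized coefficient
#   * sd(independent variable) / sd(dependent variable)
beta_gdp = b[1] * gdp.std() / demand.std()
beta_fare = b[2] * fare.std() / demand.std()
```

The unstandardized `b` values are the ones that would go into a forecast equation; `beta_gdp` and `beta_fare` are only for comparing the relative strength of the two influences.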
Analysis of classical and machine learning based short-term and mid-term load forecasting for smart grid
Published in International Journal of Sustainable Energy, 2021
Support vector machine (SVM) is a powerful machine learning tool based on supervised learning for classification and regression analysis, known respectively as SVC and SVR. SVR is a non-parametric technique because it relies on a kernel function. The kernel is a way of calculating the dot product of two vectors in a very high-dimensional feature space, and the kernel function maps lower-dimensional data into higher-dimensional data (Zhang, Wang, and Zhang 2017). The kernel can be linear, polynomial, or a radial basis function (RBF), also called a Gaussian kernel. The choice of kernel function depends on the type of dataset used for modeling. In this work, a polynomial kernel is used to train the model, since the data used is highly nonlinear. The polynomial kernel is defined by Equation (8) as K(x_i, x_j) = (x_i · x_j + c)^n, where n is the degree of the polynomial and c is the constant term (Zhang, Wang, and Zhang 2017).
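The polynomial kernel of Equation (8) is a one-liner. A minimal sketch (the choices c = 1 and n = 2 below are illustrative, not the values used in the cited work):

```python
import numpy as np

def poly_kernel(x, y, c=1.0, n=3):
    # K(x, y) = (x . y + c)^n, where c is the constant term and
    # n is the degree of the polynomial.
    return (np.dot(x, y) + c) ** n

k = poly_kernel([1.0, 0.0], [1.0, 0.0], c=1.0, n=2)  # (1 + 1)^2 = 4
```

In scikit-learn the same kernel is available as `SVR(kernel="poly", degree=n, coef0=c)`, with an additional scaling factor `gamma` applied to the dot product.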
Factors influencing existing medium-sized commercial building energy retrofits to achieve the net zero energy goal in the United States
Published in Building Research & Information, 2021
The most influential variables within each category were then combined into one logistic model that is explained below. The goal of using a logistic model for analysis was to verify whether the most influential variables identified from the different categories could be combined and used to predict the success of other/future energy retrofit projects. The difference between a logistic regression model and a linear regression model is the response variable (Al-Ghamdi, 2002). In the former, the response is binary or dichotomous. In this study, we recoded the zEPI score: we assumed a project with zEPI > −3 had less opportunity to eventually achieve the net zero target and would be a failure, while one with zEPI < −3 would be a success. The reason we used −3 as a threshold was that the median zEPI score of the 619 projects in the NBI database was −3. The logistic model created is illustrated in Equation (2): ln(E / (1 − E)) = β0 + Σ βiXi + e, where E denotes the probability that the energy retrofit project succeeds (the outcome is coded 1 if successful and 0 if unsuccessful), β0 is the coefficient on the constant term, βi denotes a model parameter (for one of the most influential variables), Xi is the value of the independent variable, and e is the error term. For testing the logistic model, we used the four remaining renovation projects from the building database we created using the NBI information.
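Inverting the logit gives the success probability directly. A minimal sketch of Equation (2) with made-up coefficients (β0 and the βi below are illustrative placeholders, not the fitted values from the study):

```python
import math

def success_probability(b0, betas, xs):
    # Logistic model: E = 1 / (1 + exp(-(b0 + sum(bi * Xi)))),
    # where b0 is the coefficient on the constant term and each bi
    # is the parameter for one influential variable.
    z = b0 + sum(b * x for b, x in zip(betas, xs))
    return 1.0 / (1.0 + math.exp(-z))

p = success_probability(-1.0, [0.8, 1.5], [2.0, 0.5])
predicted_success = 1 if p >= 0.5 else 0
```

Classifying a held-out project (such as the four remaining renovation projects) then amounts to computing `p` from its variable values and thresholding at 0.5.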
Insights into parameter estimation for thermal response tests on borehole heat exchangers
Published in Science and Technology for the Built Environment, 2019
Beier (2018) found that the estimate of cs from the derivative curve is more stable than the estimate from the temperature curve. Furthermore, the estimate from the derivative curve often has smaller uncertainty. Thus, the derivative curve is used here for estimating cs. As an example, parameter estimation of cs, ks, and R*bwc has been carried out over the entire test period based on the experimental temperature derivative curve for borehole 4. The resulting curve fit is shown as the dashed curve in Figure 10. The estimated parameters are listed in the first line under borehole 4 in Table 5. Note that the constant R*bwoc does not affect the derivative curve and cannot be evaluated from the derivative curve. The derivative of this constant term is zero. Instead, the component of borehole resistance with heat capacity, R*bwc, is seen by the derivative. Then, the resulting estimate of 2500 kJ/(m3·K) for cs is carried forward and held fixed, as the values of ks, Rb*, and R*bwoc are revised from the match of the temperature data. The resulting match fits both the temperature and derivative data in Figure 10 (solid curves).
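The key point, that a constant resistance term vanishes from the derivative curve, can be shown with a deliberately simplified stand-in model. This sketch assumes a response of the form T(t) = a·ln(t) + R (an illustrative analogue, not Beier's borehole model): the constant R drops out of the derivative with respect to ln(t), so it must be recovered from the temperature curve in a second step with the slope held fixed, mirroring the two-step procedure described above.

```python
import numpy as np

# Simplified response: T(t) = a*ln(t) + R, with R a constant term
# standing in for the resistance component invisible to the derivative.
t = np.linspace(1.0, 100.0, 50)
a_true, R_true = 2.5, 4.0
T = a_true * np.log(t) + R_true

# Step 1: estimate the slope from the derivative curve.
# d(T)/d(ln t) = a, so R contributes nothing here.
dT_dlnt = np.gradient(T, np.log(t))
a_est = dT_dlnt.mean()

# Step 2: hold a_est fixed and recover the constant term
# from the temperature curve itself.
R_est = (T - a_est * np.log(t)).mean()
```

In the actual analysis the fit is against a full borehole heat exchanger model, but the division of labor is the same: derivative data for the parameters that affect the curve shape, temperature data for the constant term.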