Using financial concepts to understand failing construction contractors
Published in Rick Best, Jim Meikle, Describing Construction, 2023
Balcaen and Ooghe (2006) highlighted various challenges associated with applying statistical tools to predict business failure, proposing that artificial intelligence (AI) tools could resolve those methodological challenges. Alaka et al. (2016a: 809) state ‘AI tools have become gradually more popular, … [as] yet only little or no progress has been made’. More recent IPM studies have made use of artificial intelligence/machine learning techniques to select predictors, which is especially useful for approaches that assess the potential of a large number of financial concepts to predict failure (Kulakov 2017). Lasso techniques, a machine learning statistical approach, apply penalties to variables that are highly correlated with other, more powerful predictors, and to variables whose data availability is inconsistent through time.
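A minimal illustrative sketch of this kind of Lasso-based screening, using scikit-learn with synthetic placeholder data rather than the financial ratios from the studies cited above, might look as follows:

import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for a panel of candidate financial ratios (placeholders only).
rng = np.random.default_rng(0)
n_firms, n_ratios = 200, 40
X = rng.normal(size=(n_firms, n_ratios))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n_firms)  # proxy failure score

# Standardise so the L1 penalty treats all ratios on a comparable scale,
# then let cross-validation choose the penalty strength.
X_std = StandardScaler().fit_transform(X)
model = LassoCV(cv=5, random_state=0).fit(X_std, y)

# Ratios whose coefficients were shrunk exactly to zero are dropped from the model.
selected = np.flatnonzero(model.coef_)
print(f"retained {selected.size} of {n_ratios} candidate ratios: {selected}")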
Generalized Regression Penalty against Complexity
Published in Chong Ho Alex Yu, Data Mining and Exploration, 2022
Obviously, LASSO and ridge operate in opposite directions: the former tends to retain a simple model by zeroing out unimportant predictors, whereas the latter tries to keep more variables by shrinking their coefficients towards zero, but not exactly to zero. The elastic net approach is the happy medium in the sense that it combines the penalties of the LASSO and ridge approaches (a weighted average of the L1 and L2 penalties) (Zou and Hastie 2005). The cost is a small increase in bias, but the benefit, a substantial decrease in variance, definitely outweighs the cost. This approach is considered superior to LASSO for several reasons. First, in an ultra-high dimensional data set (p > n), LASSO selects at most n variables before it saturates, and therefore some important variables might be missed. In contrast, the elastic net method can select more than n variables because this shortcoming of LASSO is overcome by ridge regularization. Second, when the variables are collinear or highly correlated, LASSO tends to select only one of them, whereas the elastic net method can retain more. Third, in the situation where n > p but the variables are collinear or highly correlated, the predictive performance of LASSO is overshadowed by ridge. When the elastic net is used, predictive accuracy is improved because ridge is in the equation.
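A minimal sketch of the p > n and collinearity points (synthetic data and arbitrary penalty settings, for illustration only):

import numpy as np
from sklearn.linear_model import Lasso, ElasticNet

# Ultra-high dimensional setting (p > n) with blocks of highly correlated columns.
rng = np.random.default_rng(1)
n, p = 50, 200
base = rng.normal(size=(n, p // 4))
X = np.repeat(base, 4, axis=1) + 0.05 * rng.normal(size=(n, p))
y = X[:, :8].sum(axis=1) + rng.normal(scale=0.5, size=n)

# LASSO can keep at most n non-zero coefficients before it saturates;
# the elastic net (weighted L1 + L2 penalty) is not limited in this way.
lasso = Lasso(alpha=0.1, max_iter=10000).fit(X, y)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5, max_iter=10000).fit(X, y)

print("LASSO non-zero coefficients:      ", np.count_nonzero(lasso.coef_))
print("elastic net non-zero coefficients:", np.count_nonzero(enet.coef_))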
Machine Learning Results for High Utilizers
Published in Chengliang Yang, Chris Delcher, Elizabeth Shenkman, Sanjay Ranka, Data-Driven Approaches for Healthcare, 2019
Comparing GBM’s and LASSO’s results in Table 6.4 reveals that GBM identifies many predictors that are likely to be highly correlated, such as the prior history variables. In contrast, the predictors with the largest coefficients in LASSO appear independent of each other. Among a set of highly correlated predictors, LASSO usually selects one to have a nonzero coefficient and assigns zero coefficients to the rest. This feature reduces information redundancy in presenting predictor importance. This selection, however, is performed automatically and is not informed by clinical logic. In contrast, GBM relies on the approximate global loss function gain at each split. If a set of predictors is informative at the population level, those predictors will appear more frequently when assembling the decision trees, causing redundancies in the predictor importance table. In this way, LASSO appears to find useful information more efficiently.
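A generic illustration of this contrast (not the high-utilizer data or the models behind Table 6.4), using a block of near-duplicate predictors:

import numpy as np
from sklearn.linear_model import Lasso
from sklearn.ensemble import GradientBoostingRegressor

# Five near-copies of one signal (a highly correlated block) plus five noise columns.
rng = np.random.default_rng(2)
n = 500
signal = rng.normal(size=n)
corr_block = np.column_stack([signal + 0.05 * rng.normal(size=n) for _ in range(5)])
noise = rng.normal(size=(n, 5))
X = np.hstack([corr_block, noise])
y = signal + 0.3 * rng.normal(size=n)

# LASSO tends to give a nonzero coefficient to one member of the correlated block,
# whereas GBM's split-based importance is spread across the whole block.
lasso = Lasso(alpha=0.05).fit(X, y)
gbm = GradientBoostingRegressor(random_state=0).fit(X, y)

print("LASSO coefficients:     ", np.round(lasso.coef_, 2))
print("GBM feature importances:", np.round(gbm.feature_importances_, 2))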
Empirical analysis of the impact of collaborative care in internal medicine: Applications to length of stay, readmissions, and discharge planning
Published in IISE Transactions on Healthcare Systems Engineering, 2023
Paul M. Cronin, Douglas J. Morrice, Jonathan F. Bard, Luci K. Leykum
The backward selection regression model estimated via BIC minimization criteria served as a baseline model to compare with our favored approach for this problem, the elastic net model. The elastic net serves as a hybrid combination of the lasso and ridge regression frameworks. The least absolute shrinkage and selection operator (lasso) and ridge estimation are two shrinkage techniques that can guard against overfitting and reduce model complexity (Hoerl & Kennard, 1970; Tibshirani, 1996). More precisely, lasso and ridge regression are forms of penalized regression which apply the L1 and L2 norms, respectively, to the objective function. In the case of ridge, the coefficients are shrunk toward zero, but the model ultimately includes all the coefficients. One of the key features is that a minor increase in bias is rewarded with, ideally, reduced error variance and more reliable coefficient t-tests (James et al., 2021). Lasso differs from ridge in several ways, including simultaneously performing both estimation and variable selection by shrinking less important variable coefficients to zero (not just toward zero).
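In standard notation (the exact scaling of the penalty terms varies across texts and software), the three penalized least-squares objectives referred to here can be written as:

\hat{\beta}^{\text{ridge}} = \arg\min_{\beta} \|y - X\beta\|_2^2 + \lambda \sum_{j} \beta_j^2

\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \|y - X\beta\|_2^2 + \lambda \sum_{j} |\beta_j|

\hat{\beta}^{\text{enet}} = \arg\min_{\beta} \|y - X\beta\|_2^2 + \lambda \Big( \alpha \sum_{j} |\beta_j| + (1 - \alpha) \sum_{j} \beta_j^2 \Big), \quad \alpha \in [0, 1]

The L2 (ridge) term shrinks all coefficients toward zero without eliminating any, the L1 (lasso) term sets some exactly to zero, and the elastic net mixes the two through the weight \alpha.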
When should MI-BCI feature optimization include prior knowledge, and which one?
Published in Brain-Computer Interfaces, 2022
Camille Benaroch, Maria Sayu Yamamoto, Aline Roc, Pauline Dreyer, Camille Jeunet, Fabien Lotte
Statistical model to predict/explain MI-BCI performances: We used a LASSO [14] regression to obtain models that could predict the performances of MI-BCI users from the characteristics of the MDFB. The LASSO regression uses an L1-norm regularization with a penalty parameter (see Eq. 4) that promotes sparse solutions, i.e. it selects only a small number of variables (many coefficients will be zero under this regularization). It is particularly suited to reducing the number of relevant features when those features are more numerous than the training examples [20], and it enables the creation of interpretable models. As in any linear regression setup, we have a continuous output vector (here the MI-BCI performance to be explained/predicted), a matrix of z-score-normalized features (here the users’ MDFB characteristics), with one row per example (subject), and a coefficient vector (the regression weights). The LASSO estimator is defined as:
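[The equation itself is not reproduced in this excerpt. In the notation just described, the standard form of the LASSO estimator is:]

\hat{\beta} = \arg\min_{\beta} \; \|y - X\beta\|_2^2 + \lambda \|\beta\|_1

where y is the output vector, X the matrix of normalized features, \beta the coefficient vector and \lambda the penalty parameter controlling sparsity.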
High-Dimensional Cost-constrained Regression Via Nonconvex Optimization
Published in Technometrics, 2022
After the data preprocessing, our final dataset has 181 diabetes patients with one continuous response variable (the change in HbA1c from baseline to the end of the study) and 20 predictors. Considering the group structure of the predictors in this study, we used the HCR method (19) for data with groups of variables to develop cost-constrained models for a sequence of budgets (from $0 to $200). We also used the LASSO penalty to reduce overfitting. We compared this HCR method with the LASSO and Group LASSO (Yuan and Lin 2006) methods. For LASSO and Group LASSO, we used cross-validation to choose the tuning parameter λ that delivered the feasible solution (a solution satisfying the budget constraint) with the lowest mean cross-validated error. To check the prediction performance of the linear model considering all predictors, we also ran the LASSO method ignoring the budget constraint. We use LASSO1 and LASSO2 to denote the LASSO method with and without the budget constraint, respectively.
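A hedged sketch of the tuning-parameter rule described here (placeholder data, costs, and budget; not the diabetes study itself):

import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score

# Placeholder data, per-predictor costs, and budget (hypothetical, for illustration).
rng = np.random.default_rng(3)
n, p = 181, 20
X = rng.normal(size=(n, p))
y = X[:, 0] - X[:, 3] + rng.normal(scale=0.5, size=n)
cost = rng.uniform(5, 30, size=p)   # hypothetical measurement cost per predictor ($)
budget = 100.0                      # hypothetical budget constraint ($)

# Among candidate lambdas, keep only those whose selected predictors fit the budget,
# then pick the feasible lambda with the lowest mean cross-validated error.
best_lambda, best_cv_mse = None, np.inf
for lam in np.logspace(-3, 0, 30):
    selected = Lasso(alpha=lam).fit(X, y).coef_ != 0
    if cost[selected].sum() > budget:   # infeasible: exceeds the budget
        continue
    cv_mse = -cross_val_score(Lasso(alpha=lam), X, y,
                              scoring="neg_mean_squared_error", cv=5).mean()
    if cv_mse < best_cv_mse:
        best_lambda, best_cv_mse = lam, cv_mse

print("chosen lambda:", best_lambda, "  cross-validated MSE:", round(best_cv_mse, 3))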