Scikit-learn – Knowledge and References

Implementation

Published in Seyedeh Leili Mirtaheri, Reza Shahbazian, Machine Learning Theory to Applications, 2022

Seyedeh Leili Mirtaheri, Reza Shahbazian

Scikit-Learn is widely regarded as a leading open-source Python library, offering a comprehensive collection of Machine Learning algorithms. David Cournapeau started it as a Google Summer of Code project. As of 2015, Scikit-Learn is under active development, sponsored by Telecom ParisTech, INRIA, and occasionally Google through its Summer of Code. Scikit-Learn extends the SciPy and NumPy packages with many Machine Learning algorithms and offers functions for regression, classification, clustering, model selection, preprocessing, and dimensionality reduction. Scikit-Learn also uses the Matplotlib package to plot charts. As of April 2016, it has been distributed in the jointly developed Anaconda for Cloudera project for Hadoop clusters. Along with Scikit-Learn, Anaconda includes many popular packages for science, mathematics, and engineering in the Python ecosystem, including Pandas, SciPy, and NumPy.

Machine Learning

Published in Shrirang Ambaji Kulkarni, Varadraj P. Gurupur, Steven L. Fernandes, Introduction to IoT with Machine Learning and Image Processing using Raspberry Pi, 2020

Shrirang Ambaji Kulkarni, Varadraj P. Gurupur, Steven L. Fernandes

Scikit-learn is a library for the Python programming language that supports machine learning. It supports machine-learning tasks such as classification, regression, and clustering, and provides APIs for popular algorithms such as Linear Regression, Logistic Regression, Naive Bayes, Neural Network, Support Vector Machine, Random Forest, KNN, and LDA.
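As a rough illustration of the algorithms listed above, the sketch below maps most of them to scikit-learn estimator classes and trains each one through the same fit/predict interface. The synthetic dataset, the hyperparameters, and the choice of Gaussian Naive Bayes and a multilayer perceptron as stand-ins for "Naive Bayes" and "Neural Network" are assumptions made for this example, not taken from the chapter. Linear Regression is left out because it is a regressor and this sketch uses a classification task.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Synthetic binary classification data (an assumption for illustration only).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# One estimator per algorithm named in the text; hyperparameters are arbitrary.
classifiers = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Neural Network": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                                    random_state=0),
    "Support Vector Machine": SVC(),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "LDA": LinearDiscriminantAnalysis(),
}

# Every estimator exposes the same fit/predict methods.
predictions = {}
for name, clf in classifiers.items():
    clf.fit(X, y)
    predictions[name] = clf.predict(X)
    print(f"{name}: training accuracy = {clf.score(X, y):.3f}")
```

Because all estimators share one interface, swapping one algorithm for another usually means changing a single line.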

Solar Power Forecasting

Published in Bhavnesh Kumar, Bhanu Pratap, Vivek Shrivastava, Artificial Intelligence for Solar Photovoltaic Systems, 2023

Agrim Khurana, Ankit Dabas, Vaibhav Dhand, Rahul Kumar, Bhavnesh Kumar, Arjun Tyagi

Components of scikit-learn: Scikit-learn offers many useful features, including supervised learning algorithms. Generalized linear models (e.g., Linear Regression), Support Vector Machines (SVM) (Zeng and Qiao, 2013), Decision Trees, and Bayesian methods are all part of the scikit-learn toolbox. This diversity of machine learning algorithms is one of the main reasons scikit-learn is so widely used.

Leak Detection in Smart Water Grids Using EPANET and Machine Learning Techniques

Published in IETE Journal of Education, 2021

Anirudh Nagaraj, Ganesh Reddy Kotamreddy, Pooja Choudhary, Rahul Katiyar, B.A. Botre

Each dataset consisted of 576 datapoints, with an equal number of leak-case and non-leak-case datapoints. Each dataset was then shuffled and split into train and test datasets, with the test dataset containing 20% of the total datapoints. The training datasets were then used separately for training classification-based ML models. The ML models were simulated, built, trained and tested using Scikit-learn, which is an open-source, commercially usable Python-based Machine Learning library. Scikit-learn is easy to use and provides state-of-the-art implementations of widely used Machine Learning algorithms [32]. A MinMax scaler was also used, which scales each feature separately so that its values fall within a specified range. The range used in this study was 0–1. The accuracy obtained on the test datasets, as reported in Table 2, is the greater of the accuracies obtained with and without scaling the data using the MinMax scaler.
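The procedure described above can be sketched roughly as follows. The data here are randomly generated stand-ins (the study's EPANET-derived features are not reproduced), and the k-nearest-neighbours classifier, the feature count, and the random seeds are assumptions chosen only to make the example runnable.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

# Hypothetical balanced dataset: 288 leak and 288 non-leak datapoints.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 288)
X = rng.normal(size=(576, 4))
X[y == 1] += 1.0                          # shift the leak class so it is learnable
X *= np.array([1.0, 10.0, 100.0, 0.1])   # give the features different scales

# Shuffle and hold out 20% of the datapoints for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=0
)

# Accuracy without scaling.
clf_raw = KNeighborsClassifier().fit(X_train, y_train)
acc_raw = clf_raw.score(X_test, y_test)

# Accuracy with each feature scaled to the range 0-1.
# The scaler is fitted on the training data only, then applied to the test data.
scaler = MinMaxScaler(feature_range=(0, 1)).fit(X_train)
clf_scaled = KNeighborsClassifier().fit(scaler.transform(X_train), y_train)
acc_scaled = clf_scaled.score(scaler.transform(X_test), y_test)

# Report the greater of the two accuracies, as the text describes.
reported_accuracy = max(acc_raw, acc_scaled)
print(f"raw={acc_raw:.3f} scaled={acc_scaled:.3f} reported={reported_accuracy:.3f}")
```

Fitting the scaler on the training split alone keeps information from the test split out of the preprocessing step; whether the study did this is not stated in the excerpt.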

Prediction of Marshall design parameters of asphalt mixtures via machine learning algorithms based on literature data

Published in Road Materials and Pavement Design, 2023

Mert Atakan, Kürşat Yıldız

To assess the prediction performance of the models, we used the score function in the scikit-learn library. This function returns the coefficient of determination, R². It was calculated as in Equation (4), where RSS is the residual sum of squares and TSS is the total sum of squares. In Equation (2), yᵢ is the ith value to be predicted, f(Xᵢ) is the predicted value of yᵢ, and n is the upper limit of summation. In Equation (3), yᵢ is the ith value in the sample, ȳ is the mean value of the sample, and n is the upper limit of summation.
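The equations themselves are not reproduced in this excerpt. Under the standard definitions, which match the symbol descriptions above, RSS is the sum over i = 1..n of (yᵢ − f(Xᵢ))², TSS is the sum over i = 1..n of (yᵢ − ȳ)², and R² = 1 − RSS/TSS. The sketch below computes these quantities by hand and compares the result with the score method of a scikit-learn regressor. The synthetic data and the choice of LinearRegression are assumptions for illustration; the article's own models and data are not shown here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic regression data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=1.0, size=50)

model = LinearRegression().fit(X, y)
y_pred = model.predict(X)                 # f(X_i)

rss = np.sum((y - y_pred) ** 2)           # residual sum of squares
tss = np.sum((y - y.mean()) ** 2)         # total sum of squares
r2_manual = 1.0 - rss / tss               # coefficient of determination

# For regressors, score() returns the same R^2 value as r2_score().
r2_from_score = model.score(X, y)
r2_from_metric = r2_score(y, y_pred)
print(r2_manual, r2_from_score, r2_from_metric)
```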

Modelling urban expansion with cellular automata supported by urban growth intensity over time

Published in Annals of GIS, 2023

Jinqu Zhang, Donglin Wu, A-Xing Zhu, Yunqiang Zhu

The CA model was implemented in a Python 3.7 environment using scikit-learn, matplotlib, and numpy. Scikit-learn is a robust library with a range of supervised and unsupervised learning algorithms that can be used to simplify machine learning applications in a production system. With the support of the scikit-learn library, SVM, neural network, and logistic regression models were each tested for predicting the urban cell state.
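One way such a comparison of the three model types might look is sketched below. The features are synthetic placeholders for whatever driving factors describe a cell, the binary label stands in for the urban or non-urban cell state, and the use of standardization and 5-fold cross-validation is an assumption of this example rather than the authors' stated procedure.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical cell data: 5 features per cell, binary cell state as the label.
X, y = make_classification(n_samples=300, n_features=5, n_informative=3,
                           random_state=0)

# The three model families named in the text; hyperparameters are arbitrary.
candidates = {
    "SVM": SVC(),
    "Neural network": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                                    random_state=0),
    "Logistic regression": LogisticRegression(max_iter=1000),
}

# Mean 5-fold cross-validated accuracy for each candidate model.
mean_accuracy = {}
for name, model in candidates.items():
    pipeline = make_pipeline(StandardScaler(), model)
    mean_accuracy[name] = cross_val_score(pipeline, X, y, cv=5).mean()

best_model = max(mean_accuracy, key=mean_accuracy.get)
print(mean_accuracy, "best:", best_model)
```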