Role of Statistical Methods in Data Science
Published in Pallavi Vijay Chavan, Parikshit N Mahalle, Ramchandra Mangrulkar, Idongesit Williams, Data Science, 2022
We must be aware of some important statistical terminology when dealing with statistics and data science. The key terms are as follows. A population is the complete set from which data are collected. A sample is a subset of the population. A variable is a quantity or number that can be counted or measured, that is, a data item. A statistical parameter is a number that characterizes the population's probability distribution, such as the mean, median, mode, correlation, and covariance of the population.
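The distinction between a population and a sample can be illustrated with a minimal Python sketch. The population here is synthetic (10,000 normally distributed values with an assumed mean of 50 and standard deviation of 10), and the sample size of 200 is likewise an arbitrary choice; none of these numbers come from the chapter.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 synthetic measurements (not from the source).
population = [random.gauss(50.0, 10.0) for _ in range(10_000)]

# A sample is a subset of the population, drawn here without replacement.
sample = random.sample(population, 200)

# Parameters describe the whole population; the sample gives estimates of them.
population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

population_sd = statistics.pstdev(population)  # population formula, divides by N
sample_sd = statistics.stdev(sample)           # sample formula, divides by n - 1

print(f"mean: population={population_mean:.2f}, sample={sample_mean:.2f}")
print(f"sd:   population={population_sd:.2f}, sample={sample_sd:.2f}")
```

The sample values should land close to, but not exactly on, the population values; the gap shrinks as the sample grows.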
Optical and visual metrics
Published in Pablo Artal, Handbook of Visual Optics, 2017
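The variance, σ², is a statistical parameter that measures how far a set of numbers is spread out around the mean and from each other. The variance of the WA is the mean of the squares of the WA minus the square of the mean of the WA. Mathematically,

$$\sigma_{WA}^{2} = \frac{1}{\pi}\int_{0}^{2\pi}\int_{0}^{1}\left[WA(\rho,\theta)\right]^{2}\,\rho\,d\rho\,d\theta - \left[\frac{1}{\pi}\int_{0}^{2\pi}\int_{0}^{1}WA(\rho,\theta)\,\rho\,d\rho\,d\theta\right]^{2} = \overline{WA^{2}} - \left(\overline{WA}\right)^{2}$$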
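As a rough numerical check of this definition, the two integrals over the unit disk can be approximated with a midpoint rule on a polar grid. The sketch below is only illustrative: the function names, the grid sizes, and the test function WA(ρ, θ) = ρ² are assumptions, not taken from the handbook. For that test function the integrals can be done by hand, giving a mean of 1/2, a mean square of 1/3, and therefore a variance of 1/12.

```python
import math

def pupil_variance(wa, n_rho=200, n_theta=200):
    """Approximate the variance of wa(rho, theta) over the unit disk
    using the midpoint rule on a polar grid."""
    d_rho = 1.0 / n_rho
    d_theta = 2.0 * math.pi / n_theta
    sum_wa = 0.0
    sum_wa_sq = 0.0
    for i in range(n_rho):
        rho = (i + 0.5) * d_rho
        for j in range(n_theta):
            theta = (j + 0.5) * d_theta
            value = wa(rho, theta)
            weight = rho * d_rho * d_theta  # polar area element rho*d_rho*d_theta
            sum_wa += value * weight
            sum_wa_sq += value * value * weight
    mean_wa = sum_wa / math.pi        # divide by the area of the unit disk
    mean_wa_sq = sum_wa_sq / math.pi
    return mean_wa_sq - mean_wa ** 2  # mean of squares minus square of mean

def example_wa(rho, theta):
    """Hypothetical test function (not from the source): WA = rho^2."""
    return rho ** 2

variance = pupil_variance(example_wa)
print(f"numerical variance = {variance:.5f}, analytic value = {1 / 12:.5f}")
```

A finer grid brings the numerical value closer to 1/12; the square root of the variance is the RMS deviation of the function from its mean.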
Dragonfly algorithm–support vector machine approach for prediction the optical properties of blood
Published in Computer Methods in Biomechanics and Biomedical Engineering, 2023
Faiza Omari, Latifa Khaouane, Maamar Laidi, Abdellah Ibrir, Mohamed Roubehie Fissa, Mohamed Hentabli, Salah Hanini
Proper use and interpretation of statistical parameters are crucial in research, as they provide quantitative evidence to support research findings, facilitate comparisons, and help draw valid conclusions. The correlation coefficient (r) between the features (independent variables) and the outputs (dependent variables) is an important statistical measure that can aid in feature selection, model performance assessment, and feature engineering, ultimately leading to better model interpretability, stability, and predictive accuracy (Table 2). As indicated, there is a negative relationship between the wavelength and the targets (µa and µs), and a strongly positive correlation between the hematocrit and the target in both cases (µa and µs). The r values indicate a significant linear relationship between the input features and the targets (µa and µs), given that the p-values are below 0.05 and 0.01; the p-value measures the strength of evidence against the null hypothesis in a statistical hypothesis test. This indicates that these data could be used to build a reliable model for estimating the absorption and scattering coefficients of human blood.
The mean, or average, is a statistical parameter that provides a measure of central tendency, indicating the typical value in a dataset; it is calculated by adding up all the values in a dataset and dividing by the number of values. The mean and median values of wavelength and hematocrit are very close (for both absorption and scattering behaviors), which indicates that the data are approximately symmetrically distributed. The median value of the absorption coefficient is closer to the lower quartile (1.16 mm⁻¹), which suggests that the data may be skewed towards lower values, while the median oxygen saturation is closer to the upper quartile (89.62%), so those data may be skewed towards higher values.
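The two checks described in this excerpt, a correlation coefficient with its p-value and a comparison of the median against the quartiles, can be sketched in plain Python. Everything below is illustrative: the hematocrit and absorption numbers are invented, the helper names are hypothetical, and the p-value comes from a permutation test, which is swapped in here because the excerpt does not say which significance test the authors used.

```python
import math
import random
import statistics

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the null hypothesis of no association
    (an assumed substitute for the test used in the paper)."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)

def skew_hint(values):
    """Compare the median's distance to each quartile as a rough skew check."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    lower_gap = median - q1
    upper_gap = q3 - median
    if math.isclose(lower_gap, upper_gap, rel_tol=0.1):
        return "roughly symmetric"
    if lower_gap < upper_gap:
        return "median nearer lower quartile"
    return "median nearer upper quartile"

# Hypothetical data (not from the paper): hematocrit in %, absorption in mm^-1.
hematocrit = [10, 20, 30, 40, 50, 60, 70, 80]
absorption = [0.21, 0.39, 0.62, 0.78, 1.05, 1.18, 1.43, 1.59]

r = pearson_r(hematocrit, absorption)
p_value = permutation_p_value(hematocrit, absorption)
print(f"r = {r:.3f}, permutation p-value = {p_value:.4f}")
print(skew_hint(absorption))
```

A small p-value only says the linear association is unlikely under the null hypothesis; it does not by itself establish that a model built on these features will be reliable, which still has to be checked on held-out data.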