Explore chapters and articles related to this topic
Machine Learning - A gentle introduction
Published in Nailong Zhang, A Tour of Data Science, 2020
As we discussed in Chapter 4, there are two commonly used approaches for distribution fitting, i.e., method of moments and maximum likelihood estimation. In this case, we use the maximum likelihood estimation because the likelihood function can be easily derived as below. $P = \prod_{i=1}^{n} \sum_{k=1}^{K} \pi_k f(x_i \mid \mu_k, \Sigma_k),$
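The likelihood above can be evaluated numerically on the log scale. The following is a minimal sketch, assuming that f is the multivariate normal density (the excerpt does not state the component family explicitly); the function name `mixture_log_likelihood` and the toy data are illustrative only.

```python
import numpy as np
from scipy.stats import multivariate_normal


def mixture_log_likelihood(x, weights, means, covs):
    """Return log P, where P = prod_i sum_k pi_k f(x_i | mu_k, Sigma_k).

    Assumption: f is a multivariate normal density.
    """
    x = np.asarray(x, dtype=float)
    # densities[i, k] = pi_k * f(x_i | mu_k, Sigma_k)
    densities = np.column_stack([
        w * multivariate_normal.pdf(x, mean=m, cov=c)
        for w, m, c in zip(weights, means, covs)
    ])
    # Sum over components k, take logs, then sum over observations i
    return float(np.sum(np.log(densities.sum(axis=1))))


# Toy two-dimensional data and a two-component mixture (made-up values)
x = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0]])
weights = [0.6, 0.4]
means = [np.zeros(2), np.full(2, 3.0)]
covs = [np.eye(2), np.eye(2)]
log_p = mixture_log_likelihood(x, weights, means, covs)
```

Maximum likelihood estimation then amounts to choosing the weights, means, and covariances that make this value as large as possible.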
Simulation Input Analysis
Published in Raymond J. Madachy, Daniel X. Houston, What Every Engineer Should Know About Modeling and Simulation, 2017
Raymond J. Madachy, Daniel X. Houston
As a distribution-fitting program tries each distribution in its list and calculates parameters for the candidate distribution, it calculates how well each candidate distribution actually fits the data. These calculations are a goodness-of-fit test, which is a statistical hypothesis test for which the null hypothesis is that the data belongs to the proposed distribution.
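The loop described above can be sketched in a few lines. This is an illustration only, assuming SciPy as the fitting library and the Kolmogorov-Smirnov statistic as the goodness-of-fit measure; the candidate list, the function name `rank_candidates`, and the synthetic sample are not taken from the book.

```python
import numpy as np
from scipy import stats


def rank_candidates(data, candidates=("norm", "expon", "gamma", "lognorm")):
    """Fit each candidate distribution and rank them by K-S statistic."""
    results = []
    for name in candidates:
        dist = getattr(stats, name)
        params = dist.fit(data)  # maximum likelihood parameter estimates
        # Null hypothesis: the data come from the fitted candidate.
        # The p-value is optimistic because the parameters were
        # estimated from the same data that is being tested.
        ks = stats.kstest(data, name, args=params)
        results.append((name, float(ks.statistic), float(ks.pvalue)))
    # A smaller K-S statistic means a closer fit
    return sorted(results, key=lambda r: r[1])


rng = np.random.default_rng(0)
sample = rng.gamma(shape=2.0, scale=3.0, size=500)  # synthetic input data
ranking = rank_candidates(sample)
for name, statistic, pvalue in ranking:
    print(f"{name:8s} D={statistic:.4f} p={pvalue:.4f}")
```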
A novel stochastic model for hourly electricity load profile analysis of rural districts in Fujian, China
Published in Science and Technology for the Built Environment, 2022
Bing Zhou, Xiao Wang, Da Yan, Jieyan Xu, Xuyuan Kang, Zheng Chen, Tianyi Hao
To test the goodness of distribution fitting, the Kolmogorov–Smirnov test (K-S test) was conducted. The K-S test is a nonparametric test of the equality of continuous or discontinuous one-dimensional probability distributions that can be used to compare a sample with a reference probability distribution. The p-value of the K-S test indicates how strong the evidence is against the null hypothesis that the sample is normally distributed; a p-value below the chosen significance level leads to rejecting normality. Commonly used significance levels for the K-S test are 0.05 and 0.01, and this study evaluated the test results under both. The hourly distributions of each typical NLP comprised the THM submodel, which separated the daily total electricity consumption into hourly consumption.
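A K-S normality check at both significance levels might look like the sketch below. It assumes SciPy's `kstest` and a reference normal distribution whose mean and standard deviation are estimated from the sample; the helper name `ks_normality` and the synthetic samples are hypothetical and are not the study's data or code.

```python
import numpy as np
from scipy import stats


def ks_normality(sample, alphas=(0.05, 0.01)):
    """K-S test of a sample against a normal reference distribution."""
    sample = np.asarray(sample, dtype=float)
    # Assumption: the reference normal uses the sample mean and std
    mu, sigma = sample.mean(), sample.std(ddof=1)
    result = stats.kstest(sample, "norm", args=(mu, sigma))
    # Normality is rejected at level alpha when the p-value is below alpha
    reject = {alpha: bool(result.pvalue < alpha) for alpha in alphas}
    return float(result.statistic), float(result.pvalue), reject


rng = np.random.default_rng(1)
normal_like = rng.normal(loc=2.0, scale=0.5, size=200)  # synthetic
skewed = rng.exponential(scale=1.0, size=1000)          # synthetic

d_norm, p_norm, reject_norm = ks_normality(normal_like)
d_skew, p_skew, reject_skew = ks_normality(skewed)
```

Because the parameters are estimated from the sample itself, the p-value from the plain K-S test tends to be conservative.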
Flood estimation at Hathnikund Barrage, River Yamuna, India using the Peak-Over-Threshold method
Published in ISH Journal of Hydraulic Engineering, 2020
Mukesh Kumar, Mohammed Sharif, Sirajuddin Ahmed
In frequency analysis, it is important to fit a probability distribution to the series of the flood peaks, which may be obtained either using the AM or the POT approach. The aim of the probability distribution fitting is to select a distribution that suits the data well. Two distributions, namely Log-Pearson Type III and Gumbel’s Type I distributions, have been fitted to the series of flood peaks data obtained using the AM and the POT approach. According to Gumbel’s theory, the probability of occurrence of an event equal to or larger than a value is
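Fitting a Gumbel distribution to a flood peak series can be sketched as follows. This is an illustration under stated assumptions: SciPy's `gumbel_r` stands in for Gumbel's Type I distribution, the peak values are synthetic rather than data from Hathnikund Barrage, and the return period interpretation applies to an annual maximum (AM) series.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Synthetic annual maximum flood peaks (m^3/s), purely illustrative
peaks = stats.gumbel_r.rvs(loc=3000.0, scale=800.0, size=60, random_state=rng)

# Maximum likelihood estimates of the location and scale parameters
loc, scale = stats.gumbel_r.fit(peaks)


def exceedance_probability(x):
    """Probability that a peak equals or exceeds x under the fitted model."""
    return float(stats.gumbel_r.sf(x, loc=loc, scale=scale))


def return_level(period_years):
    """Peak magnitude exceeded on average once every period_years years."""
    return float(stats.gumbel_r.isf(1.0 / period_years, loc=loc, scale=scale))


q100 = return_level(100)
p_q100 = exceedance_probability(q100)
```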
Effect of operational attributes on lateral merging position characteristics at mid-block median opening
Published in Transportation Letters, 2021
Tathagatha Khan, Smruti Sourava Mohapatra
An attempt was made to fit the LMP data collected from all 14 test sections to a statistical distribution. For this purpose, the distribution fitting toolbox of MATLAB was utilized, and different distributions, namely Gamma, Log-logistic, Lognormal, Normal, and Weibull, were examined as shown in Figure 7. The toolbox reports a log-likelihood value for each fitted distribution, and the negative value smallest in magnitude (that is, the highest log-likelihood) corresponds to the best fit for the data (Qing-wan 2010). For a particular section, i.e., S-1 in six-lane DiUR, the log-likelihood values for Gamma, Log-logistic, Lognormal, Normal, and Weibull were found to be −11,200.00, −11,269.50, −11,325.30, −11,120.40, and −11,134.30, respectively. As the value smallest in magnitude (−11,120.40) corresponds to the normal distribution, the normal distribution was chosen as the best suited distribution to describe the observed LMP data. Subsequently, the normal distribution equation was used to calculate the theoretical frequency of the data set, and the goodness of fit between the theoretical and observed frequencies, as shown in Figure 7 (c), was analyzed by the chi-square test. A similar analysis was done for S-8 in four-lane DiUR. The log-likelihood values were found to be −2534.53, −2541.75, −2536.35, −2574.97, and −2551.41 for Gamma, Log-logistic, Lognormal, Normal, and Weibull, respectively. As the value smallest in magnitude (−2534.53) corresponds to the gamma distribution, the gamma distribution was chosen as the best suited distribution to describe the observed LMP data. Subsequently, the gamma distribution equation was used to calculate the theoretical frequency of the data set, and the goodness of fit between the theoretical and observed frequencies, as shown in Figure 7 (d), was analyzed by the chi-square test.
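The selection procedure described here can be approximated outside MATLAB. The sketch below uses SciPy rather than the toolbox named in the article, so it is not the authors' code: the mapping of distribution names to SciPy objects (`fisk` is SciPy's log-logistic), the choice to pin the location parameter at zero, the bin count, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from scipy import stats

# Candidate name -> parameters held fixed during fitting (assumption:
# location pinned at zero for the positive-support distributions)
CANDIDATES = {
    "gamma": {"floc": 0},
    "fisk": {"floc": 0},  # log-logistic
    "lognorm": {"floc": 0},
    "norm": {},
    "weibull_min": {"floc": 0},
}


def fit_candidates(data):
    """Return {name: (log-likelihood, params, number of free parameters)}."""
    fits = {}
    for name, fixed in CANDIDATES.items():
        dist = getattr(stats, name)
        params = dist.fit(data, **fixed)
        loglik = float(np.sum(dist.logpdf(data, *params)))
        fits[name] = (loglik, params, len(params) - len(fixed))
    return fits


def chi_square_gof(data, name, params, n_free, bins=10):
    """Chi-square test of observed versus theoretical bin frequencies."""
    dist = getattr(stats, name)
    observed, edges = np.histogram(data, bins=bins)
    probs = np.diff(dist.cdf(edges, *params))
    # Rescale so that expected and observed totals agree
    expected = probs / probs.sum() * observed.sum()
    # ddof reduces the degrees of freedom by the fitted parameter count
    return stats.chisquare(observed, expected, ddof=n_free)


rng = np.random.default_rng(7)
lmp = rng.gamma(shape=9.0, scale=0.5, size=2000)  # synthetic stand-in data
fits = fit_candidates(lmp)
# Highest log-likelihood (negative value closest to zero) is the best fit
best = max(fits, key=lambda name: fits[name][0])
loglik, params, n_free = fits[best]
gof = chi_square_gof(lmp, best, params, n_free)
```

Equal-width bins can leave very small expected counts in the tails, so a practical analysis would usually merge sparse bins before applying the chi-square test.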