Explore chapters and articles related to this topic
A Survey of Uncertain Data Clustering Algorithms
Published in Charu C. Aggarwal, Chandan K. Reddy, Data Clustering, 2018
Another related technique discussed in [41] is that of hierarchical density-based clustering. An effective (deterministic) density-based hierarchical clustering algorithm is OPTICS [12]. We note that the core idea in OPTICS is quite similar to DBSCAN and is based on the concept of reachability distance between data points. While the method in DBSCAN defines a global density parameter which is used as a threshold in order to define reachability, the work in [41] points out that different regions in the data may have different data density, as a result of which it may not be possible to define the clusters effectively with a single density parameter. Rather, many different values of the density parameter define different (hierarchical) insights about the underlying clusters. The goal is to define an implicit output in terms of ordering data points, so that when the DBSCAN is applied with this ordering, one can obtain the hierarchical clustering at any level for different values of the density parameter. The key is to ensure that the clusters at different levels of the hierarchy are consistent with one another. One observation is that clusters defined over a lower value of e are completely contained in clusters defined over a higher value of ε, if the value of MinPts is not varied. Therefore, the data points are ordered based on the value of ε required in order to obtain MinPts in the ε-neighborhood. If the data points with smaller values of ε are processed first, then it is assured that higher density regions are always processed before lower density regions. This ensures that if the DBSCAN algorithm is used for different values of ε with this ordering, then a consistent result is obtained. Thus, the output of the OPTICS algorithm is not the cluster membership, but it is the order in which the data points are processed. We note that since the OPTICS algorithm shares so many characteristics with the DBSCAN algorithm, it is fairly easy to extend the OPTICS algorithm to the uncertain case using the same approach as was used for extending the DBSCAN algorithm. This is referred to as the FOPTICS algorithm. Note that one of the core concepts needed to order data points is to determine the value of e which is needed in order to obtain MinPts in the corresponding neighborhood. In the uncertain case, this value is defined probabilistically, and the corresponding expected values are used to order the data points. A different hierarachical clustering algorithm with the use of an information-theoretic approach was proposed in [30].
Neural Network Data Mining Clustering Optimization Algorithm
Published in IETE Journal of Research, 2021
The hierarchical method contains another commonly used clustering idea, which is to decompose the data set to be processed layer by layer until the specified conditions are met. Density-based clustering algorithms use whether the density domains are connected to determine whether the data objects belong to the same type of cluster. This type of algorithm is not sensitive to noise points. It overcomes the shortcomings of some division methods that can only find spherical clusters and can identify arbitrary shapes class clusters. To improve the shortcomings of the DBSCAN algorithm that is sensitive to the initial parameters, the OPTICS algorithm that extracts clusters based on the information in the ordered object list after the data objects are sorted is also one of the representative algorithms. The DENCLUE-IM algorithm significantly improves the efficiency of the algorithm by modifying the calculation steps based on the hill-climbing algorithm and achieves a balance between clustering quality and efficiency.
Optimization of energy consumption in wireless sensor networks using density-based clustering algorithm
Published in International Journal of Computers and Applications, 2021
Hasan Jafari, Mousa Nazari, Shahaboddin Shamshirband
The runtime of the OPTICS algorithm is very close to DBSCAN algorithm. Measuring the performance of this algorithm using various datasets revealed that the runtime of the OPTICS algorithm is fixed at 1.6 times of the DBSCAN algorithm. The runtime equals to O(n*E-neighborhood), which is required to obtain the E-neighborhood or the neighborhood radius of an object in the worst case of the scan of the entire database in which case, the total runtime equals to O(n2). However, using search tree algorithm, when the queries are used with reachability methods such as R*-tree, X-tree or M-tree which in the worst case of the tree height form n object, their runtime equals to O(logn). In this case, the runtime is reduced to O(nLogn). Further, if E-neighborhood is directly reachable or the objects are organized as grid, the runtime is even reduced to O(n) because the complexity of a local enquiry in the grid is O(1) [8].