Some Mathematical Properties of Networks for Big Data
Published in Yulei Wu, Fei Hu, Geyong Min, Albert Y. Zomaya, Big Data and Computational Intelligence in Networking, 2017
Topological data analysis (TDA) is an emerging research field that aims to apply theoretical approaches from topology to data science [3]. Because topology does not require a specific coordinate system to represent a dataset, it offers greater flexibility and supports approaches applicable to a wider range of scenarios.
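As a concrete illustration of this coordinate-free viewpoint, the sketch below computes persistent homology of a raw point cloud: the only input is pairwise distances, so the output (birth/death pairs for connected components and loops) is unchanged under rotation or translation of the coordinates. This is a minimal sketch assuming the third-party ripser package, not code from the chapter.

```python
# Minimal persistent-homology sketch (illustrative, not from the chapter).
# Requires: pip install numpy ripser
import numpy as np
from ripser import ripser

# Sample a noisy circle: the "shape" is a single loop, regardless of
# how the points are rotated or translated in the ambient coordinates.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 200)
points = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.normal(size=(200, 2))

# Persistent homology up to dimension 1 (H0 = components, H1 = loops).
diagrams = ripser(points, maxdim=1)["dgms"]

# A single long-lived H1 interval signals one dominant loop in the data.
h1 = diagrams[1]
lifetimes = h1[:, 1] - h1[:, 0]
print("most persistent H1 lifetime:", lifetimes.max())
```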
Enabling Smart Manufacturing with Artificial Intelligence and Big Data
Published in Catalin I. Pruncu, Jamal Zbitou, Advanced Manufacturing Methods, 2023
Huu Du Nguyen, Kim Phuc Tran, Philippe Castagliola, Fadel M. Megahed
Monitoring the production process is an important problem in smart manufacturing. Recently, machine learning algorithms have been proposed within Statistical Process Monitoring (SPM) and have been shown to detect a variety of abnormal conditions effectively. This approach converts the monitoring problem into an outlier detection problem or a supervised classification problem, classifying future observations as either in- or out-of-control. When a huge amount of data is collected, it makes sense to use SPM techniques to analyze it in order to obtain accurate information about special causes of variation. The idea of using One-Class Support Vector Machines for detecting abnormality in Tran et al.73 can be developed further for other applications. The technology significantly reduces the time taken for testing and designing new prototypes, and for redesigning existing models. In addition, with the rapid development of IIoT technologies, data are measured at high frequency, in high dimension, and with large variety, and should not be treated straightforwardly. Therefore, it is necessary to develop new methods adapted to monitoring such Big Data. Topological Data Analysis (TDA) has very recently emerged as a powerful tool to extract insights from high-dimensional, incomplete, and noisy data of varying types such as images, 3D scans, graphs, point clouds, and meshes. The core idea of TDA is to find the shape, the underlying structure of shapes, or relevant low-dimensional features of high-dimensional data. As a result, the problem of treating complex, massive data is reduced to a simpler one. Among recent research on TDA, the first successful application of TDA in the manufacturing systems domain is presented in Guo and Banerjee.74 In this study, the authors apply the Mapper algorithm, one of the tools of the TDA field, for predictive analysis of a chemical manufacturing process data set for yield prediction and a semiconductor etch process data set for fault detection. In general, there is still little research on this promising approach in the literature, and further work is needed to discover its numerous possible applications to smart manufacturing. For example, deep learning algorithms such as LSTM and CNN for topological data should be developed to monitor smart manufacturing processes.
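To make the SPM-as-outlier-detection idea concrete, the sketch below follows the general recipe of Tran et al.73 in spirit only: a One-Class Support Vector Machine is trained on in-control process measurements and then flags future observations as in- or out-of-control. The synthetic data and parameter choices (nu, gamma) are illustrative assumptions, not the authors' actual settings.

```python
# One-Class SVM process-monitoring sketch (parameters and data are assumptions).
# Requires: pip install numpy scikit-learn
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(42)

# Phase I: historical in-control data (e.g., four sensor readings per part).
in_control = rng.normal(loc=0.0, scale=1.0, size=(500, 4))

scaler = StandardScaler().fit(in_control)
model = OneClassSVM(kernel="rbf", nu=0.01, gamma="scale")
model.fit(scaler.transform(in_control))

# Phase II: monitor new observations; a prediction of -1 is an
# out-of-control signal pointing to a possible special cause.
new_obs = np.vstack([
    rng.normal(0.0, 1.0, size=(5, 4)),   # normal operation
    rng.normal(3.0, 1.0, size=(2, 4)),   # shifted process (special cause)
])
labels = model.predict(scaler.transform(new_obs))
for i, flag in enumerate(labels):
    print(f"observation {i}: {'out-of-control' if flag == -1 else 'in-control'}")
```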
Multimodal Imaging Radiomics and Machine Learning
Published in Ayman El-Baz, Jasjit S. Suri, Big Data in Multimodal Medical Imaging, 2019
Gengbo Liu, Youngho Seo, Debasis Mitra, Benjamin L. Franc
Machine Learning: The power of machine learning, both supervised and unsupervised, may be best harnessed when the training sample set is sufficiently large. Due to privacy and expense, obtaining a large number of human datasets is very challenging in medicine. However, across different laboratories and clinical facilities, data are being constantly gathered. If these datasets can be shared, machine learning may become as powerful as it is in many of its other application areas, such as computer vision and natural language processing. This is the direction in which we see medical imaging research making some pioneering steps [7,8,9]. Most recently, DeepLesion, released by NIH, contains approximately 10,600 publicly available CT scans to support the development of machine learning algorithms for medical applications [59]. Natural challenges, of course, are many, e.g., legally sharing data, standardizing data for uniform utilization by machine learning algorithms (as we have discussed in Section 1.2), and designing appropriate research questions over the varieties of available data. All these challenges relate to the classical definition of Big Data, namely volume, variety, and velocity, which pertain to the large size of data, the diversity of data, and the rapid change of data, respectively. In the context of medical imaging, these three characteristics pose their own problems: (1) managing large data size over many imaging studies; (2) homogenizing data (to be accessed by individual algorithms or data processing procedures) over multiple studies acquired for different purposes and hypotheses, on different machines, possibly with multiple modalities involved; and (3) continuously increasing the size of the image database as new data are constantly added by contributors. Computer science is actively seeking solutions to the large-data-size problem: for example, new languages are being designed for Big Data (e.g., PlinyCompute [60]), as are new architectures for deep learning, such as model parallelization [61]. We have discussed in this article how the radiology community is addressing the variety of diversely acquired data. Finally, we believe these two solutions will have to adjust to the fact that the data will grow continuously, and no static model for a fixed dataset will be sufficient. However, that is for the near future. In this context, the newly emerging field of topological data analysis (TDA) [62] may help in understanding the nature of a dataset. As an unsupervised machine learning model, this technique studies the topology of data points (images) in the underlying feature space. Instead of clustering the data points, TDA investigates more complex structures, such as connectivity between data points. Such topological understanding may provide much better guidance to supervised machine learning algorithms [63], or may even be embedded in them for a robust algorithm [64] to address the Big Data challenges.
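As a toy illustration of studying connectivity rather than clusters alone: the 0-dimensional persistent homology of a point cloud can be read off from single-linkage merge heights, which record the distance scales at which groups of studies become connected. The feature vectors below are random stand-ins for radiomic features, and the example assumes scipy; it is not the pipeline of [62-64].

```python
# Connectivity-structure sketch (toy stand-in for imaging feature vectors).
# Requires: pip install numpy scipy
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(7)
# Pretend each row is a feature vector extracted from one imaging study.
features = np.vstack([
    rng.normal(0, 0.3, size=(30, 16)),   # e.g., one scanner/protocol
    rng.normal(2, 0.3, size=(30, 16)),   # e.g., another scanner/protocol
])

# Single-linkage merge heights equal the death times of 0-dimensional
# persistent homology classes: they record when components connect.
merges = linkage(pdist(features), method="single")
deaths = np.sort(merges[:, 2])

# A large gap in death times suggests the dataset splits into
# well-separated connected pieces (here, the two acquisition groups).
gaps = np.diff(deaths)
print("largest connectivity gap:", gaps.max())
```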
A hybrid approach for transmission grid resilience assessment using reliability metrics and power system local network topology
Published in Sustainable and Resilient Infrastructure, 2021
Binghui Li, Dorcas Ofori-Boateng, Yulia R. Gel, Jie Zhang
Topological data analysis provides a mathematical foundation and systematic data science machinery for understanding the structure (shape) underlying observed data. TDA-based resilience metrics therefore primarily reflect the structural response of transmission networks to various hazards. For instance, there is no direct relation between the number of k-dimensional holes detected by TDA and power system reliability. However, TDA summaries can be employed to track changes in the local topology and geometry of power systems under attacks and failures. Recently, several studies (Islambekov et al., 2018; Ofori-Boateng et al., 2019; n.d.) have analyzed power grid resilience under node- and edge-based events via various persistent homology tools within the TDA framework, such as the dynamics of Betti numbers and persistence diagrams.
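A minimal sketch of the idea of tracking topology under edge-based events: for a transmission network modeled as a graph, the Betti numbers β0 (connected components) and β1 (independent cycles, i.e., 1-dimensional holes) can be recomputed as lines fail. The toy grid and failure sequence below are invented for illustration and assume networkx; the cited studies use richer persistent-homology summaries than raw Betti counts.

```python
# Betti-number tracking sketch for edge failures (illustrative network).
# For a graph (a 1-complex): beta0 = #components, beta1 = |E| - |V| + beta0.
# Requires: pip install networkx
import networkx as nx

def betti_numbers(G):
    b0 = nx.number_connected_components(G)
    b1 = G.number_of_edges() - G.number_of_nodes() + b0
    return b0, b1

# Toy "grid": a ring of buses plus two cross ties (redundant cycles).
G = nx.cycle_graph(8)
G.add_edges_from([(0, 4), (2, 6)])

print("intact grid:", betti_numbers(G))   # (1, 3): connected, 3 cycles

# Simulate successive line failures and watch the topology degrade:
# cycles disappear first (loss of redundancy), then components split.
for edge in [(2, 6), (0, 1), (4, 5)]:
    G.remove_edge(*edge)
    print(f"after losing line {edge}:", betti_numbers(G))
```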
Quantifying and Visualizing Uncertainty for Source Localisation in Electrocardiographic Imaging
Published in Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 2023
Dennis K. Njeru, Tushar M. Athawale, Jessie J. France, Chris R. Johnson
Topological data analysis is a powerful tool for understanding complex simulation datasets (Miller et al. 2006; Bremer et al. 2010). We propose visualisations of topological abstractions, specifically critical points and Morse complexes (Edelsbrunner et al. 2001), of CGLS and PCGLS inverse solutions to gain insight into the likely source positions and their variations. Let a function f : M → ℝ be defined on a d-dimensional manifold M, and let ∇f denote its gradient field. A point x on the manifold is considered critical if ∇f(x) = 0. Given a Morse function f defined on a d-dimensional manifold, i.e., a function with no flat regions, the Morse complex of f decomposes the manifold into regions (referred to as cells) with uniform gradient behaviour. Figure 2c illustrates the Morse complex segmentation of the Ackley function shown in Figure 2a. In Figure 2c, nine Morse complex cells correspond to nine critical points (local maxima) of the Ackley function. In our case, the Morse complexes segment the heart surface into cells, where gradients within a single cell terminate in the single local minimum associated with that cell (also known as an ascending manifold). Thus, local minima of ECGI solutions provide insight into the positions that have the smallest potential within their local neighbourhood (represented by the Morse complex cell), thereby indicating potential source positions.
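A small sketch of the underlying computation, under a discrete approximation: evaluate a scalar field on a grid, follow steepest descent at each grid point, and label points by the local minimum their gradient path reaches; equal labels form a discrete stand-in for the ascending-manifold (Morse complex cell) segmentation described above. This is not the authors' implementation, and it ignores boundary and plateau subtleties.

```python
# Discrete ascending-manifold segmentation sketch (illustrative approximation).
# Requires: pip install numpy
import numpy as np

# Sample the 2-D Ackley function on a grid (any Morse-like function works).
n = 101
xs = np.linspace(-3, 3, n)
X, Y = np.meshgrid(xs, xs, indexing="ij")
F = (-20 * np.exp(-0.2 * np.sqrt(0.5 * (X**2 + Y**2)))
     - np.exp(0.5 * (np.cos(2 * np.pi * X) + np.cos(2 * np.pi * Y)))
     + 20 + np.e)

def descend(i, j):
    """Follow steepest descent over 8-neighbours until a local minimum."""
    while True:
        best, move = F[i, j], None
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                a, b = i + di, j + dj
                if 0 <= a < n and 0 <= b < n and F[a, b] < best:
                    best, move = F[a, b], (a, b)
        if move is None:          # no lower neighbour: a local minimum
            return (i, j)
        i, j = move

# Label every grid point by the minimum its gradient path terminates in;
# points sharing a label form one (discrete) Morse complex cell.
labels = {}
cells = np.empty((n, n), dtype=int)
for i in range(n):
    for j in range(n):
        m = descend(i, j)
        cells[i, j] = labels.setdefault(m, len(labels))

print("number of ascending-manifold cells (local minima):", len(labels))
```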