Sequential pattern mining – Knowledge and References

Predictive Maintenance of Mining Machines Applying Advanced Data Analysis

Published in Ali Soofastaei, Data Analytics Applied to the Mining Industry, 2020

In the thesis published in Ref. [11], the Sequential Pattern Mining technique was used to analyze big data collected from a North American mining company. The dataset contains records from eleven trucks over nine months. Sequential Pattern Mining is a data mining technique that searches for patterns that occur consecutively in a database, or patterns that are associated with time or other values. The approach has proven well suited to uncovering important patterns in sequential data. In the studied case, three failure codes were selected. First, several patterns occurring between two occurrences of the same code were identified. Second, a variety of patterns that anticipate the breakdown were uncovered in the last three and the last five shifts. Despite some improvement opportunities reported by the author, the prediction rate was more than 90% for the last five shift events.

Data Analysis Tools

Published in Jim Goodell, Janet Kolodner, Learning Engineering Toolkit, 2023

Erin Czerwinski, Tanvi Domadia, Scotty D. Craig, Jim Goodell, Steve Ritter

The goal of mining is to discover features, patterns, correlations, or anomalies of a data set that are useful for decision-making or further analysis, for example, which features of an instructional design correlate to desired outcomes. (The word “discover” implies that mining is more of a method for science than engineering, but especially when combined with other techniques these methods can be a valuable part of making data sets useful for answering specific engineering questions.) Mining techniques include, but aren’t limited to, the following:

Relationship Mining: The goal of relationship mining is to discover variables that are related to one another within sets of many variables. Relationship mining is also key to big data reduction, that is, reducing the number of variables and size of the data set while keeping the relevant information needed to answer the question.

Correlation Mining is used to find substantial linear correlations between variables. Remember, though: Correlation doesn’t mean causation!

Causal Mining is used to infer causality, as in x causes y.

Association Rule Mining is used to find simple if / then rules in the data set. For example, if a learner does x, they’re likely to do y. This is useful for finding unexpected connections and generating hypotheses.

Sequential Pattern Mining is used for finding patterns over time. For example, if a person reads a book on learning engineering now, how likely is it that person will take a class on education data mining later?

Network Analysis is used for finding connections and their relative strengths, such as in social networks.
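
The "how likely is a later event given an earlier one" question can be estimated as a simple ratio: among the histories that contain the first event, the fraction in which the second event occurs afterwards. The following is a minimal sketch, assuming each learner's history is an ordered list of activity labels. The labels, the sample data, and the function names `occurs_later` and `rule_confidence` are hypothetical and do not come from the source.

```python
from typing import List, Sequence


def occurs_later(history: Sequence[str], first: str, then: str) -> bool:
    """Return True if `then` appears after the earliest occurrence of `first`."""
    if first not in history:
        return False
    start = history.index(first)
    return then in history[start + 1:]


def rule_confidence(histories: List[Sequence[str]], first: str, then: str) -> float:
    """Among histories containing `first`, the fraction where `then` follows it."""
    with_first = [h for h in histories if first in h]
    if not with_first:
        return 0.0
    hits = sum(occurs_later(h, first, then) for h in with_first)
    return hits / len(with_first)


# Hypothetical learner activity logs (illustrative only).
histories = [
    ["read_le_book", "took_edm_class"],
    ["read_le_book", "watched_video"],
    ["took_edm_class", "read_le_book"],   # class came before the book
    ["read_le_book", "watched_video", "took_edm_class"],
    ["watched_video"],
]

confidence = rule_confidence(histories, "read_le_book", "took_edm_class")
print(confidence)
```

Note that order matters here: the third history contains both events but does not count as a hit, which is what separates a sequential pattern from a plain association rule.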

Multi-faceted modelling for strip breakage in cold rolling using machine learning

Published in International Journal of Production Research, 2021

Zheyuan Chen, Ying Liu, Agustin Valera-Medina, Fiona Robinson, Michael Packianather

Sequential pattern mining is a widely researched topic. It refers to the mining of frequently occurring, ordered events as patterns. It was first introduced (Srikant and Agrawal 1996) in the Apriori family of algorithms. The algorithms perform pattern mining in sequences of itemsets (events) and find frequent patterns in the input. A large variety of algorithms similar to the Apriori algorithms have been introduced, such as Sequential PAttern Discovery using Equivalence classes (SPADE) (Zaki 2001), PrefixSpan (Pei et al. 2004) and Sequential PAttern Mining (SPAM) (Ayres et al. 2002). These algorithms all use the support measure to determine frequency. The support of a sequence is simply the proportion of entries in the data in which it appears. The ability to address the complex data structure of sequences is what sets sequential pattern mining apart from standard data mining (Pinto et al. 2001). Sequential pattern mining can access and obtain information that may be hidden in the structure of a sequence. The collective behaviour and hidden relations between such data can contain decisive information (Bautista-Thompson and Brito-Guevara 2008).
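
To make the support measure concrete, the sketch below computes it for sequences of itemsets. It assumes the usual containment rule, which the excerpt does not spell out: a pattern is contained in a sequence when its itemsets can be matched, in order, to itemsets of the sequence that include them. The names `contains` and `support` and the toy database are illustrative assumptions, not taken from any of the cited algorithms.

```python
from typing import FrozenSet, List, Sequence

Itemset = FrozenSet[str]


def contains(sequence: Sequence[Itemset], pattern: Sequence[Itemset]) -> bool:
    """True if each pattern itemset is a subset of a distinct, later-or-equal
    positioned itemset of `sequence`, preserving order (greedy left-to-right)."""
    pos = 0
    for itemset in sequence:
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)


def support(database: List[Sequence[Itemset]], pattern: Sequence[Itemset]) -> float:
    """Proportion of database entries that contain `pattern`."""
    if not database:
        return 0.0
    return sum(contains(s, pattern) for s in database) / len(database)


# Toy database of four sequences (illustrative only).
database = [
    [frozenset({"a"}), frozenset({"b", "c"}), frozenset({"d"})],
    [frozenset({"a", "b"}), frozenset({"d"})],
    [frozenset({"b"}), frozenset({"a"}), frozenset({"c"})],
    [frozenset({"c"}), frozenset({"d"})],
]

a_then_d = [frozenset({"a"}), frozenset({"d"})]
b_then_c = [frozenset({"b"}), frozenset({"c"})]
print(support(database, a_then_d), support(database, b_then_c))
```

In the first sequence, `b` and `c` occur in the same itemset, so that sequence does not contain the pattern "b, then c": the second itemset of the pattern must match a strictly later itemset.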

Efficient learning algorithm for sparse subsequence pattern-based classification and applications to comparative animal trajectory data analysis

Published in Advanced Robotics, 2019

Takuto Sakuma, Kazuya Nishi, Kaoru Kishimoto, Kazuya Nakagawa, Masayuki Karasuyama, Yuta Umezu, Shinsuke Kajioka, Shuhei J. Yamazaki, Koutarou D. Kimura, Sakiko Matsumoto, Ken Yoda, Matasaburo Fukutomi, Hisashi Shidara, Hiroto Ogawa, Ichiro Takeuchi

The database of sequences is denoted by $\mathcal{D} = \{s_1, \ldots, s_n\}$. We define the support of the pattern $q$ as $\mathrm{sup}(q) = |\{ i : q \sqsubseteq s_i \}|$, where $\mathrm{sup}(q)$ indicates the number of sequences that contain the pattern $q$ and $q \sqsubseteq s$ means that $q$ is a subsequence of $s$. The set of all patterns that appear $\theta$ or more times is called the set of frequent sequential patterns and is denoted as $\mathcal{F} = \{ q : \mathrm{sup}(q) \ge \theta \}$. In the context of pattern mining, the threshold value $\theta$ is called the minimum support. A method that can find frequent sequential patterns is called a frequent sequential pattern mining method. For example, in Table 1, when $\theta$ = [...], $\mathcal{F}$ = [...].

Since the number of possible patterns is quite large in general, it is often infeasible to actually count the supports of all possible patterns. To circumvent this difficulty, sequential pattern mining methods exploit the fact that the support of a pattern is always less than or equal to the support of any of its subsequences. Consider two sequences $q$ and $q'$ such that $q \sqsubseteq q'$, i.e. $q$ is a subsequence of $q'$; then, it is obvious that

$$\mathrm{sup}(q) \ge \mathrm{sup}(q'). \tag{3}$$

Equation (3) indicates that, when we consider a tree as in Figure 2, the support of the pattern in a node is always greater than or equal to the supports of its descendant node patterns, and less than or equal to the supports of its ancestor node patterns. This anti-monotonicity of the support in the tree can be exploited to find frequent sequential patterns. Namely, when we search over the tree, if the support of a node in the tree is already smaller than $\theta$, we can skip searching over its subtree.
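
The pruning idea described above can be sketched as a depth-first search over a pattern tree in which each child extends its parent by one symbol. This is a minimal illustration under stated assumptions, not the authors' algorithm: sequences are lists of single symbols, support is the count of sequences containing the pattern as a subsequence, and the names `is_subsequence`, `support`, `mine_frequent`, and the toy database are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """True if `pattern` can be obtained from `sequence` by deleting symbols."""
    it = iter(sequence)
    return all(symbol in it for symbol in pattern)


def support(database: List[Sequence[str]], pattern: Sequence[str]) -> int:
    """Number of sequences in the database that contain `pattern`."""
    return sum(is_subsequence(pattern, s) for s in database)


def mine_frequent(database: List[Sequence[str]], min_support: int) -> Dict[Pattern, int]:
    """Depth-first search over the pattern tree with anti-monotone pruning."""
    if min_support < 1:
        raise ValueError("min_support must be at least 1 for the search to terminate")
    alphabet = sorted({symbol for s in database for symbol in s})
    frequent: Dict[Pattern, int] = {}

    def search(pattern: Pattern) -> None:
        for symbol in alphabet:
            child = pattern + (symbol,)
            sup = support(database, child)
            if sup < min_support:
                # Every descendant of `child` has support <= sup,
                # so the whole subtree can be skipped.
                continue
            frequent[child] = sup
            search(child)

    search(())
    return frequent


# Toy database (illustrative only).
database = [
    ["a", "b", "c"],
    ["a", "c"],
    ["b", "a", "c"],
    ["b", "c"],
]
result = mine_frequent(database, min_support=3)
for pattern, sup in sorted(result.items()):
    print(pattern, sup)
```

Each child pattern contains its parent as a subsequence, so the support can only stay the same or decrease along any path from the root. This is why a single failed threshold check is enough to discard a whole subtree.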