Natural Language Processing
Published in Subasish Das, Artificial Intelligence in Highway Safety, 2023
In information retrieval approaches, keywords are presumed to denote condensed information from the documents. Keyword extraction uses an NLP method to recognize particular words or terms; this method is combined with supervised or unsupervised machine learning algorithms. Moreover, the co-occurrence of certain terms and phrases is a point of interest in various research. For example, a high frequency of ‘congestion’ alone, without any co-occurrence information, would not always indicate the document’s particular interest. If the term ‘congestion’ frequently appears together with the term ‘minimal’, the document is of a different nature. In text mining, a corpus denotes a collection of text documents; it is an abstract concept, and several applications can exist in parallel. After developing a corpus, users can easily modify the documents in it, for example by stemming, removing stop words, or removing numbers, particular parts of speech, and redundant words. Figure 65 shows the flowchart of the developed Twitter mining approach.
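The difference between raw term frequency and co-occurrence can be illustrated with a minimal Python sketch. The three sample documents, the small stop word list, and the function names below are assumptions made for illustration only; they are not taken from the chapter, and stemming is omitted for brevity.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stop word list (illustration only).
STOP_WORDS = {"a", "an", "the", "is", "on", "in", "of", "and", "was", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def term_and_pair_counts(documents):
    """Return term frequencies and document-level co-occurrence counts."""
    term_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        term_counts.update(tokens)
        # Each unordered pair is counted once per document it appears in.
        for pair in combinations(sorted(set(tokens)), 2):
            pair_counts[pair] += 1
    return term_counts, pair_counts


# Made-up corpus of three short documents.
docs = [
    "Minimal congestion on the highway today",
    "Congestion was minimal in the morning",
    "Heavy congestion and delays on the bridge",
]
term_counts, pair_counts = term_and_pair_counts(docs)

print(term_counts["congestion"])                # frequency of the term alone
print(pair_counts[("congestion", "minimal")])   # documents where both occur
print(pair_counts[("congestion", "heavy")])
```

In this toy corpus ‘congestion’ is frequent, but only the pair counts show whether it tends to appear with ‘minimal’ or with ‘heavy’.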
Big data text mining in the financial sector
Published in Noura Metawa, Mohamed Elhoseny, Aboul Ella Hassanien, M. Kabir Hassan, Expert Systems in Finance, 2019
Mirjana Pejić Bach, Živko Krstić, Sanja Seljan
With new technologies and analysis methods, and especially in the case of big data analytics, where vast volumes of new data arrive from different sources, there is a need for keyword extraction. Keyword extraction plays an important role in the financial sector. It was used in a simple form in the previous example, where a list of keywords was needed in order to extract related comments and articles from an external source. A more sophisticated usage is automatic keyword extraction (Hasan and Ng, 2014). This field has gained huge interest in the past several years, since volumes of data are growing and not every document or comment can be read sequentially. The goal is to extract “sequences of words”, called n-grams, through a semi-automated process. However, this process does require manual validation and comparison with a reference model (i.e. a “gold standard”) in order to assess the quality of the tool. The quality of terminology has gained importance with regard to costs, user perceptions and customer satisfaction. For these reasons, various metrics are used to estimate the quality of automatically extracted terminology (Seljan et al., 2013; 2014; 2017).
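As a rough sketch of n-gram extraction and comparison against a gold standard, the Python example below extracts repeated bigrams from a sentence and scores them with precision, recall and F1. The choice of these three metrics, the sample sentence, the gold-standard set and all function names are assumptions for illustration; the chapter does not specify which metrics or tools were used.

```python
import re
from collections import Counter


def extract_ngrams(text, n):
    """Return all contiguous n-grams (as strings) from lowercase word tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def precision_recall_f1(extracted, gold):
    """Compare extracted terms with a manually validated gold standard."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Made-up sentence and gold standard (illustration only).
text = "Interest rate risk and credit risk drive interest rate policy"
bigram_counts = Counter(extract_ngrams(text, 2))

# Naive candidate selection: keep bigrams that occur at least twice.
candidates = {gram for gram, count in bigram_counts.items() if count >= 2}
gold_standard = {"interest rate", "credit risk"}

precision, recall, f1 = precision_recall_f1(candidates, gold_standard)
print(candidates, precision, recall, round(f1, 3))
```

A frequency threshold is only one possible selection rule; it finds "interest rate" here but misses "credit risk", which is the kind of gap the comparison with a gold standard is meant to expose.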
Fuzzy Systems in Medicine and Healthcare
Published in Ashish Mishra, G. Suseendran, Trung-Nghia Phung, Soft Computing Applications and Techniques in Healthcare, 2020
Deepak K. Sharma, Sakshi, Kartik Singhal
Fuzzy logic, combined with rough set theory, finds application in the easy and extensive study of medical databases. Such an approach not only removes the ambiguity in medical databases but also provides scope for improved interpretation of data, leading to a better study of the various trends related to the changing nature of disease in recent times. Information retrieval has also become easier with the use of keyphrase and keyword extraction techniques, which reduce the vagueness of the various existing retrieval systems.
Finding Experts in Community Question Answering System Using Trie String Matching Algorithm with Domain Knowledge
Published in IETE Journal of Research, 2023
The initial step in any kind of Natural Language Processing (NLP) application is keyword extraction. The most widely used keyword extraction algorithms are (i) TF-IDF (Term Frequency-Inverse Document Frequency), (ii) TextRank, and (iii) RAKE (Rapid Automatic Keyword Extraction). In this work, the RAKE algorithm is applied to the keyword extraction process. RAKE is a domain-independent keyword extraction algorithm in NLP that is suited to applications on dynamic collections. The RAKE algorithm is used to extract the keywords from the user profile for the expert recommendation task on the CQA website. The RAKE algorithm works efficiently by eliminating stop words and phrase delimiters [37]. Hence, the RAKE algorithm is adopted to extract the keywords from the question.
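The core idea of RAKE can be outlined with a simplified Python sketch: stop words and phrase delimiters split the text into candidate phrases, each word is scored as degree divided by frequency, and a phrase score is the sum of its word scores. This is a minimal illustration and not the implementation used in the paper; the stop word list, the sample question and the function names are assumptions.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately small stop word list (illustration only).
STOP_WORDS = {"a", "an", "and", "are", "do", "for", "how", "i", "in",
              "is", "of", "on", "the", "to", "with"}


def candidate_phrases(text):
    """Split text into candidate phrases at delimiters and stop words."""
    phrases = []
    # Punctuation acts as a phrase delimiter.
    for fragment in re.split(r"[.,;:!?()\n]", text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9]+", fragment):
            if word in STOP_WORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_scores(text):
    """Score candidate phrases by summing word scores (degree / frequency)."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            # Degree counts co-occurrences within the phrase, the word included.
            degree[word] += len(phrase)
    word_score = {w: degree[w] / freq[w] for w in freq}
    scores = {" ".join(p): sum(word_score[w] for w in p) for p in phrases}
    # Highest score first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Made-up CQA-style question.
question = "How do I train a neural network for image classification in Python?"
ranked = rake_scores(question)
for phrase, score in ranked:
    print(phrase, score)
```

Because the degree-to-frequency ratio rewards words that appear in longer candidate phrases, multi-word phrases such as "neural network" rank above single words such as "python" in this example.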