Reviews Analysis of Apple Store Applications Using Supervised Machine Learning
Published in Rashmi Agrawal, Marcin Paprzycki, Neha Gupta, Big Data, IoT, and Machine Learning, 2020
Sarah Al Dakhil, Sahar Bayoumi
Maalej et al. (2016) extended their approach by adding bigrams and their combinations to the classification techniques used, and by improving the pre-processing phases and classification scripts. They argued that combining metadata with text classification and natural language pre-processing of the text raises classification precision significantly. They found that metadata alone results in poor classification accuracy; when combined with natural language processing, precision reached between 70% and 95% while recall reached between 80% and 90%. Therefore, text classification should be enhanced with metadata such as the tense of the text, the star rating, the sentiment score and the length. The results show that app reviews can be classified into bug reports, feature requests, user experiences and ratings (praise or dispraise) with a high accuracy of between 70% and 97%. Complementary analyses of the reviews, such as feature extraction, opinion mining and summarisation, would make app store data more useful for decisions about software and requirements engineering.
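As a rough sketch of the idea above, a review can be turned into a feature dictionary that combines text features (unigrams and bigrams) with the metadata the excerpt mentions (star rating, sentiment score, length). The function name and feature keys here are illustrative assumptions, not the authors' implementation:

```python
def review_features(text, star_rating, sentiment_score):
    """Hypothetical feature extractor combining text and metadata features.

    Loosely follows the idea of enriching unigram/bigram text features
    with metadata such as star rating, sentiment score and review length.
    """
    tokens = text.lower().split()
    feats = {}
    # Binary presence features for unigrams and bigrams.
    for w in tokens:
        feats[("uni", w)] = 1
    for a, b in zip(tokens, tokens[1:]):
        feats[("bi", a, b)] = 1
    # Metadata features appended alongside the text features.
    feats["rating"] = star_rating
    feats["sentiment"] = sentiment_score
    feats["length"] = len(tokens)
    return feats
```

A dictionary like this could then be fed to any standard supervised classifier; the excerpt's point is that the metadata keys, not the text features alone, drive the reported precision gains.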
Quantifying information
Published in Jun Wu, Rachel Wu, Yuxi Candice Wang, The Beauty of Mathematics in Computer Science, 2018
The more information we know about a random event, the less its uncertainty. This information can be directly related to the event, such as the Japanese cabinet decision, or it can be peripheral, like web pages’ quality. Peripheral information tells us about other random events, which in turn relate back to our main question. For example, the statistical language models of previous chapters exhibit both types of information. The unigram model tells us information about the words themselves, while the bigram and higher-order models use contextual information to learn more about the words’ usages in a sentence. Using mathematics, we can rigorously prove that relevant information reduces entropy, but to that end, we must first introduce conditional entropy.
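The claim that conditioning on relevant information reduces entropy can be checked numerically with the standard definitions H(X) = -Σ p(x) log₂ p(x) and H(X|Y) = -Σ p(x,y) log₂ p(x|y). The following is a minimal sketch using those textbook formulas; the function names are my own:

```python
import math
from collections import defaultdict

def entropy(dist):
    """H(X) = -sum_x p(x) log2 p(x), for a marginal {x: p}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def conditional_entropy(joint):
    """H(X|Y) = -sum_{x,y} p(x,y) log2 p(x|y), for a joint {(x, y): p}."""
    # Marginalise out x to get p(y).
    py = defaultdict(float)
    for (x, y), p in joint.items():
        py[y] += p
    # p(x|y) = p(x, y) / p(y).
    return -sum(p * math.log2(p / py[y]) for (x, y), p in joint.items() if p > 0)
```

When Y determines X exactly, H(X|Y) drops to zero; when Y is independent of X, H(X|Y) equals H(X) — conditioning never increases entropy, which is the fact the chapter goes on to prove.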
Speech Signal Processing
Published in Richard C. Dorf, Circuits, Signals, and Speech and Image Processing, 2018
Jerry D. Gibson, Bo Wei, Hui Dong, Yariv Ephraim, Israel Cohen, Jesse W. Fussell, Lynn D. Wilcox, Marcia A. Bush
The language model or grammar of a recognition system defines the sequences of vocabulary items that are allowed. For simple tasks, deterministic finite-state grammars can be used to define all allowable word sequences. Typically, however, recognizers make use of stochastic grammars based on n-gram statistics (Jelinek, 1985). A bigram language model, for example, specifies the probability of a vocabulary item given the item which precedes it.
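A bigram model of this kind can be estimated by maximum likelihood from counts: P(w | prev) is the count of the pair (prev, w) divided by the count of prev as a context. A minimal sketch, with illustrative function names and a `<s>` sentence-start marker as an assumption:

```python
from collections import Counter

def train_bigram_lm(sentences):
    """Estimate a maximum-likelihood bigram model from tokenised sentences.

    Returns prob(w, prev) = count(prev, w) / count(prev as context).
    """
    bigram_counts = Counter()
    context_counts = Counter()
    for tokens in sentences:
        prev = "<s>"  # assumed sentence-start marker
        for w in tokens:
            bigram_counts[(prev, w)] += 1
            context_counts[prev] += 1
            prev = w

    def prob(w, prev):
        if context_counts[prev] == 0:
            return 0.0
        return bigram_counts[(prev, w)] / context_counts[prev]

    return prob
```

Real recognizers smooth these estimates so that unseen word pairs do not receive zero probability; the unsmoothed version above only shows the basic n-gram statistic the excerpt describes.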
Improving the service quality of telecommunication companies using online customer and employee review analysis
Published in Quality Management Journal, 2020
Akhouri Amitanand Sinha, Suchithra Rajendran, Roland Paul Nazareth, Wonjae Lee, Shoriat Ullah
The bigram and trigram analysis used in this study is based on the discussion provided in Jurafsky and Martin (2014). A bigram and trigram are defined as the occurrence of two and three words in a sequence, respectively. In general, the appearance of n words in a series can be referred to as n-grams.
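The definition above amounts to sliding a window of length n over a token sequence. A minimal sketch (the function name is my own):

```python
def ngrams(tokens, n):
    """Return all contiguous n-word sequences (n-grams) from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
```

With n = 2 this yields the bigrams and with n = 3 the trigrams used in the study's review analysis.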