Explore chapters and articles related to this topic
A Corpus Based Quantitative Analysis of Gurmukhi Script
Published in Ayodeji Olalekan Salau, Shruti Jain, Meenakshi Sood, Computational Intelligence and Data Sciences, 2022
Gurjot Singh Mahi, Amandeep Verma
The first preliminary analysis of Indian languages was done by Bharati et al. (2002). The researchers carried out a statistical interpretation of ten Indian languages, limited to an examination of frequency distributions. Mehta and Majumder (2016) performed a quantitative study of three Indo-Aryan languages (Hindi, Gujarati, and Bengali) to determine whether these languages follow a power law. Other works by Lakshmi Priya and Manimannan (2014), Kumar et al. (2007), and Daud et al. (2017) used conventional statistical techniques to examine the statistical richness of natural languages of the Indian subcontinent. Jayaram and Vidya (2008) applied Zipf’s law to two Indo-Aryan languages (Hindi and Marathi) and two Dravidian languages (Kannada and Telugu). Pande and Dhami (2013) analysed the pattern of occurrence of words in the Hindi language. The same authors also published a work on the occurrence of characters in Hindi (Pande and Dhami, 2015).
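As an illustration of what testing a corpus for power-law behaviour can look like, the sketch below ranks token frequencies and estimates the exponent s in f(r) ≈ C / r^s with an ordinary least-squares fit on log-log axes. This is a generic and fairly crude technique, not the method of any study cited above; the function names `rank_frequency` and `zipf_exponent` are hypothetical, and whitespace tokenisation is an assumption.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return token frequencies sorted from most to least frequent."""
    return sorted(Counter(tokens).values(), reverse=True)


def zipf_exponent(freqs):
    """Estimate s in f(r) ~ C / r**s by least squares on log-log axes.

    `freqs` must be sorted in descending order and hold at least two
    positive values; rank 1 is the most frequent item.
    """
    if len(freqs) < 2:
        raise ValueError("need at least two ranked frequencies")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is negative for a decaying distribution,
    # so the exponent is its negation.
    return -sxy / sxx


# Synthetic frequencies that follow f(r) = 1000 / r exactly.
ideal = [1000.0 / rank for rank in range(1, 51)]
exponent = zipf_exponent(ideal)
```

An exponent close to 1 is what the classical form of Zipf's law predicts. On real text, a log-log fit is sensitive to the long tail of rare words, so it is better read as a rough diagnostic than as a formal test.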
Mining concepts of health responsibility using text mining and exploratory graph analysis
Published in Scandinavian Journal of Occupational Therapy, 2019
Sofia Kjellström, Hudson Golino
In the first round of text mining, 4,011 terms were extracted, with a sparsity of 99%. Sparsity is the proportion of zero entries in a dataset, so a dataset with a sparsity of 10% has non-zero entries in 90% of its cells. In this first step, terms were retained if they appeared in at least 1% of the interviews. To decrease the sparsity, terms that did not appear in at least 20% of the interviews were deleted, resulting in a final document-term matrix with 41 terms (or words). The relative frequency of the words can be viewed in Figure 1. The distribution of term frequencies shows a pattern that is very common in corpora of natural language, called Zipf’s law. This law states that the frequency with which a word appears is inversely proportional to its rank [33].
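The sparsity measure and the document-frequency filter described above can be sketched in a few lines. This is a minimal illustration on a toy corpus, not the authors' pipeline: the helper names (`build_dtm`, `sparsity`, `filter_terms`), the whitespace tokenisation, and the four example sentences are all assumptions made for the example.

```python
from collections import Counter


def build_dtm(docs):
    """Build a document-term matrix: matrix[i][j] counts term j in doc i."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    column = {term: j for j, term in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in docs]
    for i, toks in enumerate(tokenized):
        for term, count in Counter(toks).items():
            matrix[i][column[term]] = count
    return vocab, matrix


def sparsity(matrix):
    """Share of cells in the matrix that are zero."""
    cells = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for value in row if value == 0)
    return zeros / cells if cells else 0.0


def filter_terms(vocab, matrix, min_doc_share):
    """Keep only terms that occur in at least `min_doc_share` of the documents."""
    n_docs = len(matrix)
    keep = [
        j for j in range(len(vocab))
        if sum(1 for row in matrix if row[j] > 0) / n_docs >= min_doc_share
    ]
    return [vocab[j] for j in keep], [[row[j] for j in keep] for row in matrix]


# Toy corpus standing in for interview transcripts (invented for illustration).
docs = [
    "health is my responsibility",
    "health care is shared responsibility",
    "exercise and diet",
    "health and diet",
]
vocab, dtm = build_dtm(docs)
kept_vocab, kept_dtm = filter_terms(vocab, dtm, min_doc_share=0.5)

print(len(vocab), round(sparsity(dtm), 3))
print(len(kept_vocab), round(sparsity(kept_dtm), 3))
```

Raising the document-frequency threshold removes rare terms, whose columns are mostly zeros, which is why the filtered matrix is both smaller and less sparse than the original.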
Bibliometric and systemic review of the state of the art of occupational risk management in the construction industry
Published in International Journal of Occupational Safety and Ergonomics, 2023
Leonardo Ensslin, Alex Gonçalves, Sandra Rolim Ensslin, Ademar Dutra
According to Bradford’s law, the journal with the highest degree of significance in the BP and references was Safety Science, in which 48% of the selected articles were concentrated. Keywords were analysed by checking their network of occurrences in the articles. The most relevant ones were: construction safety; safety management; risk management; process safety; accident prevention; and construction. These keywords established the subject of the BP, according to Zipf’s law.