From frequency counts to contextualized word embeddings
Published in Uwe Engel, Anabel Quan-Haase, Sunny Xun Liu, Lars Lyberg, Handbook of Computational Social Science, Volume 2, 2021
Gregor Wiedemann, Cornelia Fedtke
The most basic form of computational semantic representation is also the most widely used in social science contexts: a sequence of alphabetic characters, referred to as a string, serves as query input for a matching procedure on a target text that is likewise encoded as a string. Strings can be of arbitrary length and thus represent single words, multi-word units, phrases, sentences, or entire documents. For instance, a target document can be split at punctuation marks into sentences, and sentences can be split into isolated words at whitespace characters. The pattern matching algorithm can then check how often a query string, for example a keyword, occurs in a given target text. More complex extensions to this approach combine queries for individual words into word lists, so-called dictionaries, in which each term of a dictionary category is a representative of a more abstract concept (e.g. positive or negative sentiment). Dictionary terms can further be combined with certain rules (e.g. by AND, OR, NOT conditions) or regular expressions to match a desired observation in target texts. Among other things, regular expressions allow the use of wildcard characters and character classes (e.g. digits or word characters) in the search pattern. Such dictionaries are the basis for most automatic content analyses, from the early document categorizations in media studies (Stone et al., 1966) to the sentiment analyses that are widely conducted today (e.g. Young & Soroka, 2012).
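The splitting and dictionary matching described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular content-analysis tool: the toy dictionary, its categories, and the function names are all hypothetical, and a trailing `*` is treated here as "any word ending".

```python
import re
from collections import Counter

# Hypothetical toy dictionary: category -> terms. A trailing '*' is assumed
# to mean "any word ending"; real dictionaries define their own conventions.
DICTIONARY = {
    "positive": ["good", "great", "happ*"],
    "negative": ["bad", "terribl*", "sad"],
}


def split_sentences(text):
    """Split a document into sentences after '.', '!' or '?' (naive)."""
    return re.split(r"(?<=[.!?])\s+", text.strip())


def split_words(sentence):
    """Split a sentence into isolated words at whitespace characters."""
    return sentence.split()


def term_to_regex(term):
    """Translate a dictionary term into a whole-word regular expression."""
    if term.endswith("*"):
        return r"\b" + re.escape(term[:-1]) + r"\w*"
    return r"\b" + re.escape(term) + r"\b"


def count_categories(text, dictionary=DICTIONARY):
    """Count how often the terms of each category occur in the text."""
    counts = Counter()
    lowered = text.lower()
    for category, terms in dictionary.items():
        for term in terms:
            counts[category] += len(re.findall(term_to_regex(term), lowered))
    return dict(counts)
```

Counting per category rather than per term reflects the idea that each term only stands in for the more abstract concept. The sentence splitter is deliberately naive and would, for example, break at abbreviations.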
Prognostic modelling for industrial asset health management
Published in Safety and Reliability, 2022
Neda Gorjian Jolfaei, Raufdeen Rameezdeen, Nima Gorjian, Bo Jin, Christopher W. K. Chow
After formulating the review question, the next step is to locate relevant studies for the LR. Two electronic databases, Scopus and Web of Science, were selected as primary databases for the literature search. These databases were selected because they encompass an extensive collection of literature and are readily accessible. Both Scopus and Web of Science also offer advanced search capabilities, which facilitate more targeted searches for the precise location of studies. After the primary search in Scopus and Web of Science was completed, a secondary search was conducted using Google Scholar to identify any important studies that might have been missed during the primary search. A set of keywords was developed based on the review question to capture those studies that discuss asset failure prognostic models. A search string was developed by combining the keywords using the Boolean operators “AND” and “OR” as follows: (“asset” OR “infra*” OR “plant”) AND (“failure” OR “fault”) AND (“diag*” OR “prog*” OR “monitor*” OR “assess*”). The asterisk (*) symbol was used as a wildcard character to broaden the scope of the search by capturing alternative endings for the keywords.
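To make the logic of such a search string concrete, the sketch below evaluates it against a piece of text in Python. It assumes the query can be written as an AND of OR-groups and that a trailing asterisk means "any word ending"; the names `QUERY`, `keyword_pattern`, and `matches` are illustrative. Scopus and Web of Science apply their own matching rules (for example for plurals and phrases), which this sketch does not reproduce.

```python
import re

# The search string from the text, written as an AND of OR-groups.
QUERY = [
    ["asset", "infra*", "plant"],
    ["failure", "fault"],
    ["diag*", "prog*", "monitor*", "assess*"],
]


def keyword_pattern(keyword):
    """Compile a keyword into a case-insensitive whole-word pattern.

    A trailing '*' is assumed to stand for any (possibly empty) word ending.
    """
    stem = keyword.rstrip("*")
    suffix = r"\w*" if keyword.endswith("*") else ""
    return re.compile(r"\b" + re.escape(stem) + suffix + r"\b", re.IGNORECASE)


def matches(text, query=QUERY):
    """Return True if every OR-group has at least one keyword in the text."""
    return all(
        any(keyword_pattern(keyword).search(text) for keyword in group)
        for group in query
    )
```

A title such as "A prognostic model for infrastructure failure" satisfies all three groups, whereas "Plant fault statistics" fails the third group and is rejected.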
Continuance in online participation following the compromise of older adults’ identity information: a literature review
Published in Theoretical Issues in Ergonomics Science, 2018
Judy M. Watson, Paul M. Salmon, David Lacey, Don Kerr
The keywords used in the searches were as follows: online, adult*, people, identit*, cyber*, privac*, fraud*, theft, crim*, compromise*, old*, elder*, senior*, aged, aging and ageing. To ensure that all derivations of each word were included, a wildcard character was used within the search terms. The wildcard character takes the form of an asterisk (*) and can be added to a character string to substitute for any other character or characters, so that all derivations of the word stem are included; for example, identit* signifies identity, identities, etc. Articles centred on bullying, gambling, gaming, health, sexuality, stalking and younger adults were screened out of the search. A total of 5797 articles were returned, with some indexed by multiple databases. The search and selection process followed the PRISMA framework (Moher et al. 2009). As the number of duplicates was small and more easily identified at the later stages of the process, the framework was modified as shown in Figure 2.
Intelligent evaluation of test suites for developing efficient and reliable software
Published in International Journal of Parallel, Emergent and Distributed Systems, 2021
Masoud Mohammadian, Zafer Javed
A Wildcard query (Q5) allows searching for terms with missing parts. The standard wildcard characters are the ‘*’ and the ‘?’ symbols; conventionally, ‘*’ stands for any sequence of characters and ‘?’ for exactly one character. For example, the Wildcard query ‘bal?’ matches terms like ‘ball,’ ‘bale,’ etc.
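The behaviour of the two wildcard characters can be tried out with Python's standard `fnmatch` module, which follows the same convention: `*` matches any run of characters, including none, and `?` matches exactly one character. This is only an illustration of the convention, not the query engine used in the article; the term list and the helper `wildcard_search` are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical index terms to search through.
TERMS = ["bal", "ball", "bale", "balls", "balloon", "bell"]


def wildcard_search(pattern, terms=TERMS):
    """Return the terms matching a wildcard pattern, case-sensitively."""
    return [term for term in terms if fnmatchcase(term, pattern)]
```

With these terms, `wildcard_search("bal?")` keeps only the four-letter matches, while `wildcard_search("bal*")` also returns `bal` itself and the longer words. Note that `fnmatch` additionally gives square brackets a special meaning, which a search engine may not share.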