Inverted index – Knowledge and References

Auction-Based Advertising Technologies

Published in Peng Liu, Wang Chao, Computational Advertising, 2020

In an auction-based advertising market that accommodates a large number of small and medium-sized advertisers, complex targeting conditions place new requirements on retrieval technology. The inverted index is the key technology of search engines, and it is also used for ad retrieval. But ad retrieval has its own features and requirements. When the basic inverted index is applied to ad retrieval, two new problems need to be addressed. First, the combination of an ad's targeting conditions can be seen as a Boolean expression connected by “and-or” relations; a document of this form is clearly different from the bag-of-words (BoW) document used by search engines, which leaves room for optimizing targeted retrieval performance. Second, when there are rich contextual keywords or user tags, the query in ad retrieval may be quite long, even composed of hundreds of keywords. Retrieval in this case is notably different from a search engine, where the query usually contains one to four keywords. If you typed 100 keywords into a search box at once, would you expect the returned results to be desirable? These differences have motivated retrieval techniques in advertising to evolve beyond basic inverted indexing. The above two problems are discussed in detail as follows.
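As a rough illustration of the first problem only, the sketch below indexes ads whose targeting is an AND over attributes with an OR over the allowed values of each attribute, and matches a user profile by counting satisfied attributes. This is not the book's method: the function names, the ad IDs, the attribute names, and the restriction to one profile value per attribute are all assumptions made for the example.

```python
from collections import defaultdict

def build_targeting_index(ads):
    """Index ads by (attribute, value) and record how many attributes each ad requires.

    `ads` maps a hypothetical ad id to {attribute: set of allowed values}.
    Attributes are AND-ed together; the values of one attribute are OR-ed.
    """
    index = defaultdict(set)
    required = {}
    for ad_id, conditions in ads.items():
        required[ad_id] = len(conditions)
        for attr, values in conditions.items():
            for value in values:
                index[(attr, value)].add(ad_id)
    return index, required

def match_ads(index, required, profile):
    """Return ads whose every targeting attribute is satisfied by `profile`.

    Assumes `profile` carries exactly one value per attribute, so an ad is
    counted at most once per attribute.
    """
    hits = defaultdict(int)
    for attr, value in profile.items():
        for ad_id in index.get((attr, value), ()):
            hits[ad_id] += 1
    # Ads with no targeting conditions match every profile.
    matched = {ad_id for ad_id, n in required.items() if n == 0}
    matched.update(ad_id for ad_id, n in hits.items() if n == required[ad_id])
    return sorted(matched)

# Hypothetical ads and user profile.
ads = {
    "ad1": {"geo": {"NY", "CA"}, "gender": {"f"}},
    "ad2": {"geo": {"NY"}},
    "ad3": {"age": {"18-24"}, "geo": {"TX"}},
}
index, required = build_targeting_index(ads)
profile = {"geo": "NY", "gender": "f", "age": "18-24"}
result = match_ads(index, required, profile)  # ["ad1", "ad2"]
```

Here ad3 is retrieved by the age term but rejected because its geo condition is not met, which is the behaviour a plain BoW match would not give.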

Web IR: Information Retrieval on the Web

Published in Akshi Kumar, Web Technology, 2018

Akshi Kumar

An inverted index allows quick lookup of the IDs of documents that contain a particular word. It is built as follows: each index term is associated with an inverted list, which contains lists of documents or lists of word occurrences in documents, along with other information. Each entry is called a posting. The part of the posting that refers to a specific document or location is called a pointer. Each document in the collection is given a unique number. Lists are usually document-ordered (sorted by document number).
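The construction described above can be sketched in a few lines. This is a minimal illustration, assuming whitespace tokenization, lowercase terms, and document numbers taken from list position; the function names are hypothetical. Each posting is a `(doc_id, positions)` pair, with `doc_id` playing the role of the pointer.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to a document-ordered list of postings (doc_id, positions)."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):  # each document gets a unique number
        positions = defaultdict(list)
        for pos, word in enumerate(text.lower().split()):
            positions[word].append(pos)
        for word, pos_list in positions.items():
            # Documents are visited in increasing doc_id order,
            # so every inverted list stays sorted by document number.
            index[word].append((doc_id, pos_list))
    return dict(index)

def lookup(index, word):
    """Return the IDs of documents that contain `word`."""
    return [doc_id for doc_id, _ in index.get(word.lower(), [])]

def intersect(ids_a, ids_b):
    """Merge two document-ordered ID lists in one linear pass (an AND query)."""
    i = j = 0
    out = []
    while i < len(ids_a) and j < len(ids_b):
        if ids_a[i] == ids_b[j]:
            out.append(ids_a[i])
            i += 1
            j += 1
        elif ids_a[i] < ids_b[j]:
            i += 1
        else:
            j += 1
    return out

docs = ["the quick brown fox", "the lazy dog", "quick quick dog"]
index = build_inverted_index(docs)
both = intersect(lookup(index, "quick"), lookup(index, "dog"))  # [2]
```

Keeping the lists document-ordered is what lets `intersect` walk both lists once instead of comparing every pair of postings.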

Real-Time Search in the Sensor Internet

Published in Ioanis Nikolaidis, Krzysztof Iniewski, Building Sensor Networks, 2017

Richard Mietz, Kay Römer

A forward index stores, for every document, the list of words extracted from it. So, the data structure is a list of mappings between a website and a word, grouped and sorted by the website. If the forward index is rearranged so that it is sorted and grouped by words, a record-level inverted index is formed (see Figure 4.2). Typically, a word-level inverted index, i.e., a record-level inverted index that also stores the position of each word in the document, is formed because it makes it easier to search for phrases and for words within a certain proximity of each other in the document. As billions of websites exist, the size of an inverted index can grow up to thousands of terabytes, as is the case for Google. Hence, compression is used to reduce storage, at the cost of greater power consumption associated with the higher processor utilization for compression and decompression. The performance of update and delete operations on an inverted index depends on the underlying data structure. Usually, a tree data structure, such as a B-Tree, is used, resulting in O(log n) runtime. With millions of entries in an inverted index, these operations can be rather expensive with respect to time and disk accesses.
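The rearrangement from a forward index to the two kinds of inverted index can be sketched as follows. This is a toy illustration under stated assumptions: the forward index is a list of `(site, word, position)` tuples, the site names are made up, and plain dictionaries stand in for the tree structure a real system would use.

```python
from itertools import groupby

# Hypothetical forward index: grouped and sorted by website.
forward = [
    ("a.example", "sensor", 0),
    ("a.example", "network", 1),
    ("b.example", "network", 0),
    ("b.example", "sensor", 1),
    ("b.example", "search", 2),
]

def invert(forward_index):
    """Re-sort the forward index by word and group it into inverted indexes."""
    by_word = sorted(forward_index, key=lambda row: (row[1], row[0], row[2]))
    record_level = {}  # word -> sorted list of sites
    word_level = {}    # word -> {site: [positions]}
    for word, rows in groupby(by_word, key=lambda row: row[1]):
        rows = list(rows)
        record_level[word] = sorted({site for site, _, _ in rows})
        per_site = {}
        for site, _, pos in rows:
            per_site.setdefault(site, []).append(pos)
        word_level[word] = per_site
    return record_level, word_level

def phrase_search(word_level, first, second):
    """Sites where `second` occurs immediately after `first`."""
    a = word_level.get(first, {})
    b = word_level.get(second, {})
    return sorted(
        site for site in a.keys() & b.keys()
        if any(pos + 1 in b[site] for pos in a[site])
    )

record_level, word_level = invert(forward)
sites = phrase_search(word_level, "sensor", "network")  # ["a.example"]
```

The record-level index alone would report both sites for the two words; only the stored positions let the phrase query exclude `b.example`, where the words appear in the opposite order.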

A temporal based approach for MapReduce distributed testing

Published in International Journal of Parallel, Emergent and Distributed Systems, 2021

Sara Hsaini, Salma Azzouzi, My El Hassan Charaf

An inverted index is defined as a data structure that stores a mapping from content (such as words or numbers) to its location in a document or a set of documents (Li). The inverted index data structure is a central component of a typical search engine, aiming to optimise the speed of finding the documents in which a certain word occurs. The application is functionally quite simple and is well explained in the MapReduce literature. We might visualise the application as shown in Figure 11.
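A single-process sketch of such a job is given here, assuming the usual map, shuffle, and reduce stages. It imitates the data flow in plain Python and does not use any MapReduce framework; the function names and sample documents are hypothetical and are not taken from the article.

```python
from collections import defaultdict

def map_phase(doc_id, text):
    """Mapper: emit one (word, doc_id) pair per word occurrence."""
    for word in text.lower().split():
        yield word, doc_id

def shuffle(pairs):
    """Group mapper output by key, as the framework would between stages."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(word, doc_ids):
    """Reducer: collapse duplicates into a sorted list of document IDs."""
    return word, sorted(set(doc_ids))

def inverted_index_job(documents):
    """Run map, shuffle, and reduce in sequence over {doc_id: text}."""
    pairs = [
        pair
        for doc_id, text in documents.items()
        for pair in map_phase(doc_id, text)
    ]
    return dict(reduce_phase(word, ids) for word, ids in shuffle(pairs).items())

documents = {"d1": "hello world", "d2": "hello mapreduce hello"}
index = inverted_index_job(documents)
# index["hello"] == ["d1", "d2"]
```

In a distributed run the mappers and reducers would execute on different nodes; here the three stages run in order in one process, which is enough to check the expected output for a given input.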