Suffix array – Knowledge and References


Reranking the search results for lyric retrieval based on the songwriters' specific usage of words

Published in Amir Hussain, Mirjana Ivanovic, Electronics, Communications and Networks IV, 2015

Kazuyuki Matsumoto, Manabu Sasayama, Qingmei Xiao, Akira Fujisawa, Minoru Yoshida, Kenji Kita

Our research aimed to increase the accuracy of lyric search when part of a lyric (a lyric snippet) is used as the search query, and proposed a method to handle search queries that contain errors. In this paper, we used fast full-text retrieval based on the Suffix Array (Manber 1990) as the retrieval algorithm. String search based on a Suffix Array sorts the suffixes extracted from the target string in dictionary order, then quickly locates the positions where the search query appears by binary search. We chose the existing search library "sary2", which uses the Suffix Array. Because "sary" enables fast full-text retrieval from a large amount of text data, it is suitable for the lyric retrieval task. We constructed the sary search index for lyric retrieval from pairs of the lyric ID and the full text of the lyric. This lyric database, annotated with the search index, is called DBl. Retrieval by Suffix Array returns only character strings that match the query exactly. Therefore, if the input lyric fragment contains errors, the correct lyrics cannot be retrieved.
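The search procedure described in this excerpt can be sketched in a few lines of Python. This is a minimal illustration, not the "sary" library or the authors' implementation: the function names build_suffix_array and find_occurrences are hypothetical, and the construction simply sorts the suffixes directly, which is adequate for short texts but far slower than the construction algorithms used by real indexing tools.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text`, sorted in
    dictionary order of the suffixes (naive construction for illustration)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], query: str) -> list[int]:
    """Return every position where `query` occurs in `text`, found by two
    binary searches over the suffix array `sa`."""
    m = len(query)

    # Lower bound: first suffix whose first m characters are >= query.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < query:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose first m characters are > query.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= query:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[start:lo] begin with the query.
    return sorted(sa[start:lo])


text = "banana"
sa = build_suffix_array(text)
print(find_occurrences(text, sa, "ana"))   # exact query: positions found
print(find_occurrences(text, sa, "anna"))  # query with an error: no match
```

The second query shows the limitation noted above: because the lookup matches strings exactly, a single wrong character in the query leaves the result empty.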

Proactively managing clones inside an IDE: a systematic literature review


Published in International Journal of Computers and Applications, 2022

Sarveshwar Bharti, Hardeep Singh

Figure 9 presents the number of publications for each type of clone detection approach used. Sixteen code clone detection tools used an AST (Abstract Syntax Tree) based clone detection approach. Five tools used an index-based clone detection technique. Four tools are token based, four are suffix tree based, one is suffix array based, three used a linked editing technique, and four are based on clipboard operations. The literature listed 10 tools that used other clone detection approaches not listed above. This section answered RQ3, identified in Section 4.

Maximum Exact Matches for High Throughput Genome Subsequence Assembly


Published in IETE Journal of Research, 2022

G. Raja, U. Srinivasulu Reddy

MEM [36] is a basic and important string matching technique used for solving biological problems such as long fragment sequence mapping. Subsequently, a greedy algorithm [37] was used to reassemble contiguous sequences based on suffix array indexing, which is suitable for all base pair sequences. To the best of our knowledge, the previously proposed algorithms aim to solve only mapping problems. In the current work, we propose a novel modified MR-MEM method that gives due importance to coverage levels and to varying gap lengths in read sequences. This leads to better genome matching.