Apache Solr – Knowledge and References

Semantically Linked Media for Interactive User-Centric Services

Published in Hassnaa Moustafa, Sherali Zeadally, Media Networks: Architectures, Applications, and Standards, 2016

Violeta Damjanovic, Thomas Kurz, Georg Güntner, Sebastian Schaffert, Lyndon Nixon

The LMF holds RDF data in a triple store that implements a Sesame-Sails repository. In parallel, it manages a full-text index for metadata and textual content based on Apache SOLR. Apache SOLR is an open source enterprise search platform whose major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document handling, and geospatial search. This design allows complex graph-based queries to be issued via SPARQL and textual queries to be issued in SOLR search syntax, so that existing client libraries for both indexes (e.g., SOLR Client Libraries and SPARQL Clients) can be used natively. In addition, we developed a SOLR Shortcuts language that makes the SOLR search index dynamically adaptable and scalable.
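As an illustration of a textual query in SOLR search syntax that uses two of the features listed above (faceted search and hit highlighting), the following is a minimal sketch that only builds the request URL for Solr's standard `/select` handler. The host, the core name `media`, and the field names `title`, `genre`, and `description` are hypothetical and do not come from the LMF; the parameter names (`q`, `rows`, `wt`, `facet`, `facet.field`, `hl`, `hl.fl`) are standard Solr request parameters.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, core, text, facet_field, highlight_field, rows=10):
    """Build a URL for Solr's /select handler with faceting and highlighting.

    All argument values are supplied by the caller; nothing here is specific
    to the LMF. No request is sent, the function only assembles the URL.
    """
    params = [
        ("q", text),                    # query in Solr search syntax
        ("rows", rows),                 # number of hits to return
        ("wt", "json"),                 # response format
        ("facet", "true"),              # enable faceted search
        ("facet.field", facet_field),   # field to count facet values on
        ("hl", "true"),                 # enable hit highlighting
        ("hl.fl", highlight_field),     # field to highlight matches in
    ]
    return f"{base_url}/{core}/select?{urlencode(params)}"


# Hypothetical host, core, and field names, for illustration only.
url = build_solr_query(
    "http://localhost:8983/solr", "media", "title:concert", "genre", "description"
)
print(url)
```

Any HTTP client or SOLR client library could then issue the request; the sketch stops at URL construction so that it runs without a Solr server.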

Towards intelligent geospatial data discovery: a machine learning framework for search ranking

Published in International Journal of Digital Earth, 2018

Yongyao Jiang, Yun Li, Chaowei Yang, Fei Hu, Edward M. Armstrong, Thomas Huang, David Moroni, Lewis J. McGibbney, Christopher J. Finch

Some authors consider that the core search functionality of most existing geospatial data portals is powered by Apache Lucene, an open-source information retrieval library, or by products built upon Lucene such as Apache Solr or Elasticsearch (Li, Goodchild, and Raskin 2014). For example, NOAA’s OneStop project is based on Elasticsearch, and the search engine of PO.DAAC is developed using Solr. Lucene-based techniques use the Boolean model to find matching documents (e.g. data) and various similarity algorithms to calculate relevance (Gormley and Tong 2015). The practical scoring function is one of the most widely used of these similarity algorithms, and its formula is described in the Appendix. Relying solely on the practical scoring function is insufficient for discovering the most applicable dataset out of a vast range of available geospatial datasets, as it considers only text content and ignores domain knowledge (e.g. spatial resolution and processing level). Therefore, two questions need to be answered in order to address the ranking challenge of geospatial data discovery: (1) What features can represent users’ search preferences for geospatial data? (2) How can the ranking reach a balance of all these features?
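To make the two-stage idea above concrete (Boolean matching first, then a text-only relevance score), here is a minimal sketch of a TF-IDF style scorer. It is not the formula from the article's Appendix: it is a simplified variant in the spirit of Lucene's classic similarity (square-root term frequency, logarithmic inverse document frequency, and a length norm) that omits query normalization, the coordination factor, and boosts. The function names and the three sample documents are invented for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for a real analyzer)."""
    return text.lower().split()


def score_documents(query, docs):
    """Rank docs (a dict of id -> text) against a query using text only."""
    tokenized = {doc_id: tokenize(body) for doc_id, body in docs.items()}
    n_docs = len(tokenized)
    terms = tokenize(query)
    # Document frequency: in how many documents each query term occurs.
    df = {t: sum(1 for toks in tokenized.values() if t in toks) for t in terms}

    results = []
    for doc_id, toks in tokenized.items():
        counts = Counter(toks)
        matched = [t for t in terms if counts[t] > 0]
        if not matched:
            # Boolean model (OR semantics): skip documents with no query term.
            continue
        norm = 1.0 / math.sqrt(len(toks))  # shorter documents score higher
        score = 0.0
        for t in matched:
            tf = math.sqrt(counts[t])
            idf = 1.0 + math.log(n_docs / (df[t] + 1))
            score += tf * idf ** 2 * norm
        results.append((doc_id, score))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Invented sample metadata records, for illustration only.
docs = {
    "a": "sea surface temperature sea level",
    "b": "ocean wind speed",
    "c": "sea ice",
}
ranked = score_documents("sea temperature", docs)
print(ranked)
```

Note that every input to the score is derived from the text alone; attributes such as spatial resolution or processing level never enter the computation, which is the limitation the passage describes.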

A Review of Spatial Big Data Platforms, Opportunities, and Challenges

Published in IETE Journal of Education, 2020

Swapnil Shrivastava

DataStax is an enterprise-level implementation of Cassandra that uses the Well Known Text (WKT) markup language to represent Point, LineString, and Polygon type data. Cassandra Query Language (CQL) supports creation and alteration of geospatial data. However, it does not provide automatic support for spatial indexing and spatial search. Apache Solr and additional libraries like Java Topology Suite (JTS) are used for performing spatial queries such as intersects, is within, and contains [19].
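To illustrate what a WKT value looks like and what a "contains" predicate decides, the following is a toy sketch in plain Python. It is not how DataStax, Solr, or JTS implement spatial search; those systems use spatial indexes and robust geometry algorithms. The parser here handles only `POINT` and single-ring `POLYGON` values (no holes), the helper names are hypothetical, and the behavior for points lying exactly on the polygon boundary is not defined.

```python
import re

NUMBER = r"([-+0-9.eE]+)"


def parse_wkt_point(wkt):
    """Parse 'POINT (x y)' into an (x, y) tuple of floats."""
    m = re.fullmatch(rf"\s*POINT\s*\(\s*{NUMBER}\s+{NUMBER}\s*\)\s*", wkt, re.IGNORECASE)
    if not m:
        raise ValueError(f"not a WKT point: {wkt!r}")
    return float(m.group(1)), float(m.group(2))


def parse_wkt_polygon(wkt):
    """Parse 'POLYGON ((x y, x y, ...))' with one closed ring into a list of (x, y)."""
    m = re.fullmatch(r"\s*POLYGON\s*\(\(\s*(.*?)\s*\)\)\s*", wkt, re.IGNORECASE)
    if not m:
        raise ValueError(f"not a single-ring WKT polygon: {wkt!r}")
    return [tuple(float(v) for v in pair.split()) for pair in m.group(1).split(",")]


def polygon_contains_point(ring, point):
    """Ray-casting test: count crossings of a ray going right from the point."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Invented sample geometries, for illustration only.
square = parse_wkt_polygon("POLYGON ((0 0, 4 0, 4 4, 0 4, 0 0))")
inner = parse_wkt_point("POINT (2 2)")
outer = parse_wkt_point("POINT (5 2)")
print(polygon_contains_point(square, inner), polygon_contains_point(square, outer))
```

"Is within" is the converse relation: the point is within the polygon exactly when the polygon contains the point.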