spaCy

Natural Language Processing for Information Retrieval

Published in Anuradha D. Thakare, Shilpa Laddha, Ambika Pawar, Hybrid Intelligent Systems for Information Retrieval, 2023

Anuradha D. Thakare, Shilpa Laddha, Ambika Pawar

NLTK is a leading platform for building Python programs that perform natural language tasks. It provides a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning. spaCy is an open-source library for advanced NLP.

Data Science Skills and Graduate Certificates: A Quantitative Text Analysis

Published in Journal of Computer Information Systems, 2022

Haoqiang Jiang, Catherine Chen

This section describes the data-collection steps and data-processing methods. The data were processed and analyzed in Python 3.7.6 using the Jupyter development environment. spaCy is an open-source software library for natural language processing (NLP) in Python, designed for large-scale information extraction tasks. In this study, the case-insensitive matching in PhraseMatcher, one of spaCy's classes, was used to match phrases. To avoid selecting overlapping keywords, spaCy was used to parse phrases into tokens, and the position index of each token within the document was then used to detect and skip overlaps.
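
The matching step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes the spaCy v3 `PhraseMatcher.add` signature, the keyword list and the helper name `match_keywords` are hypothetical, and the rule of preferring longer matches when two matches overlap is an assumption the excerpt does not state.

```python
import spacy
from spacy.matcher import PhraseMatcher

# A blank English pipeline provides the tokenizer only; no trained model is needed.
nlp = spacy.blank("en")

# Hypothetical keyword list, for illustration only.
keywords = ["machine learning", "learning", "data mining", "Python"]

# attr="LOWER" compares lowercased token text, which makes matching case-insensitive.
matcher = PhraseMatcher(nlp.vocab, attr="LOWER")
matcher.add("KEYWORD", [nlp.make_doc(k) for k in keywords])


def match_keywords(text):
    """Return matched phrases in document order, skipping overlapping matches."""
    doc = nlp.make_doc(text)
    # Each match is (match_id, start, end), where start/end are token indices.
    # Assumption: longer matches are considered first, then earlier ones.
    matches = sorted(matcher(doc), key=lambda m: (-(m[2] - m[1]), m[1]))
    taken = set()  # token positions already claimed by a kept match
    kept = []
    for _, start, end in matches:
        positions = set(range(start, end))
        if positions & taken:
            continue  # shares at least one token with a kept match
        taken |= positions
        kept.append((start, end))
    return [doc[s:e].text for s, e in sorted(kept)]


result = match_keywords("Courses cover MACHINE LEARNING, data mining and python.")
print(result)
```

Here "learning" is dropped because its token position is already covered by "MACHINE LEARNING". spaCy also ships `spacy.util.filter_spans`, which applies a similar longest-span-first rule to a list of `Span` objects.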

AdaBLEU: A Modified BLEU Score for Morphologically Rich Languages

Published in IETE Journal of Research, 2021

Shweta Chauhan, Philemon Daniel, Archita Mishra, Abhay Kumar

To calculate the AdaBLEU metric, we need to extract the POS tags and dependency parsing (DP) tags of the sentences. For this we used the spaCy library [26]. spaCy is an open-source software library for natural language processing in Python.
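
As a rough sketch of what extracting these tags looks like in spaCy, the example below reads the `pos_` and `dep_` attributes of each token. It is not the authors' pipeline: the helper name `pos_and_dep_tags` is hypothetical, and so that the example runs without downloading a trained model, the `Doc` is annotated by hand using the spaCy v3 `Doc` constructor. In practice the annotations would come from a trained pipeline (for example one loaded with `spacy.load`), and the tags it predicts may differ from the hand-written ones here.

```python
import spacy
from spacy.tokens import Doc


def pos_and_dep_tags(doc):
    """Return (text, coarse POS tag, dependency label) for each token."""
    return [(token.text, token.pos_, token.dep_) for token in doc]


# Assumption: a hand-annotated Doc stands in for the output of a trained
# pipeline. heads are absolute token indices (spaCy v3 convention).
nlp = spacy.blank("en")
doc = Doc(
    nlp.vocab,
    words=["The", "cat", "sleeps"],
    pos=["DET", "NOUN", "VERB"],
    heads=[1, 2, 2],
    deps=["det", "nsubj", "ROOT"],
)

tags = pos_and_dep_tags(doc)
for text, pos, dep in tags:
    print(f"{text}\t{pos}\t{dep}")
```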
