
Programming language: Python
License: Apache License 2.0
Latest version: v0.9.1


README

textacy: NLP, before and after spaCy

textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. With the fundamentals (tokenization, part-of-speech tagging, dependency parsing, etc.) delegated to another library, textacy focuses primarily on the tasks that come before and follow after.


Features

  • Convenient entry points to working with one or many documents processed by spaCy, with functionality added via custom extensions and automatic language identification for applying the right spaCy pipeline
  • Variety of downloadable datasets with both text content and metadata, from Congressional speeches to historical literature to Reddit comments
  • Easy file I/O for streaming data to and from disk
  • Cleaning, normalization, and exploration of raw text — before processing
  • Flexible extraction of words, ngrams, noun chunks, entities, acronyms, key terms, and other elements of interest
  • Tokenization and vectorization of documents, with functionality for training, interpreting, and visualizing topic models
  • String, set, and document similarity comparison by a variety of metrics
  • Calculations for common text statistics, including Flesch-Kincaid Grade Level and multilingual Flesch Reading Ease

... and more!
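
To give a feel for the text statistics mentioned in the feature list, here is a minimal standalone sketch of the standard Flesch-Kincaid Grade Level and English Flesch Reading Ease formulas. It does not use textacy or spaCy, and it is not textacy's implementation: the function names `count_syllables` and `readability` are hypothetical, sentences and words are split with simple regular expressions, and syllables are estimated by counting vowel groups, which is only a rough approximation for English.

```python
import re


def count_syllables(word: str) -> int:
    """Rough English syllable estimate: count runs of consecutive vowels.

    Assumption for illustration only; real libraries typically rely on
    hyphenation dictionaries instead of this heuristic.
    """
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def readability(text: str) -> dict:
    """Compute basic counts and two readability scores for English text."""
    # Naive segmentation: sentences end at ., ! or ?; words are letter runs.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    return {
        "n_sentences": len(sentences),
        "n_words": len(words),
        "n_syllables": syllables,
        # Flesch-Kincaid Grade Level (approximate U.S. school grade).
        "flesch_kincaid_grade_level": (
            0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
        ),
        # Flesch Reading Ease, English coefficients (higher = easier).
        "flesch_reading_ease": (
            206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
        ),
    }


if __name__ == "__main__":
    scores = readability("The cat sat on the mat. The dog ran.")
    for name, value in scores.items():
        print(f"{name}: {value}")
```

In a spaCy-based pipeline the sentence and word counts would come from the parsed document rather than from regular expressions, and other languages use different Reading Ease coefficients, so treat the numbers from this sketch as illustrative only.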

Maintainer

Howdy, y'all. 👋