Linguistic packages

Showing projects tagged as Linguistic

  • Jieba

    9.8 3.1 L5 Python
    Chinese text segmentation.
  • gensim

    9.5 8.7 L3 Python
    Topic Modelling for Humans.
  • Pattern

    9.2 0.8 L2 Python
    A web mining module for Python.
  • TextBlob

    9.0 5.0 L3 Python
    Providing a consistent API for diving into common NLP tasks.
  • Stanza

    8.4 9.1 Python
    The Stanford NLP Group's official Python library, supporting 60+ languages.
  • coala

    8.3 1.7 L4 Python
    Language independent and easily extendable code analysis application.
  • sumy

    7.2 6.1 L5 Python
    A module for automatic summarization of text documents and HTML pages.
  • Lark

    6.6 9.3 Python
    A modern parsing library for Python, implementing Earley & LALR(1) with an easy interface.
  • polyglot

    6.4 1.5 Python
    Natural language pipeline supporting hundreds of languages.

    6.3 0.0 L3 Python
    Stand-alone language identification system.
  • TextDistance

    6.2 6.1 Python
    Compute distance between sequences. 30+ algorithms, pure Python implementation, common interface, optional use of external libraries.
  • aeneas

    6.2 0.8 L3 Python
    aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment).
  • textacy

    6.2 8.6 L3 Python
    Higher-level NLP built on spaCy.
  • awesome-embedding-models

    6.1 0.0 Jupyter Notebook
    A curated list of awesome embedding models tutorials, projects and communities.
  • quepy

    6.0 0.0 L5 Python
    A Python framework to transform natural language questions into queries in a database query language.
  • jellyfish

    5.5 6.4 Python
    A Python library for approximate and phonetic matching of strings.
  • pymorphy2

    4.7 5.5 Python
    Morphological analyzer / inflection engine for Russian and Ukrainian languages.
  • python-nameparser

    3.6 3.2 L2 Python
    Parsing human names into their individual components.

    2.2 0.0 L5 Python
    Spacing texts for CJK and alphanumerics.
  • Korean

    2.1 1.7 L4 Python
    A library for Korean morphology.
  • Python Left-Right Parser

    1.8 0.4 L4 Python
    Python Parser
  • trafilatura

    1.8 9.2 Python
    Web scraping library and command-line tool to download, extract (metadata, main text, comments), and convert the output
  • Charset Normalizer

    1.6 0.4 Python
    Like Chardet. Package for encoding & language detection. Charset detection.
  • htmldate

    1.0 8.0 Python
    Fast and robust date extraction from web pages, from the command-line or within Python