Linguistic packages

Showing projects tagged as Linguistic

  • Jieba

    9.8 0.0 L5 Python
    Jieba Chinese word segmentation
  • gensim

    9.5 9.0 L3 Python
    Topic Modelling for Humans
  • Pattern

    9.2 0.0 L2 Python
    Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
  • TextBlob

    8.9 4.4 L3 Python
    Simple, Pythonic text processing: sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
  • Stanza

    8.5 9.5 Python
    Official Stanford NLP Python Library for Many Human Languages
  • coala

    8.2 4.4 L4 Python
    coala provides a unified command-line interface for linting and fixing all your code, regardless of the programming languages you use.
  • sumy

    7.2 4.3 L5 Python
    Module for automatic summarization of text documents and HTML pages.
  • Lark

    7.0 9.4 Python
    Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
  • TextDistance

    6.7 5.7 Python
    Compute distance between sequences. 30+ algorithms, pure Python implementation, common interface, optional use of external libraries.
  • polyglot

    6.5 0.0 Python
    Multilingual text (NLP) processing toolkit
  • langid.py

    6.4 0.0 L3 Python
    Stand-alone language identification system
  • textacy

    6.3 9.1 L3 Python
    NLP, before and after spaCy
  • aeneas

    6.3 0.0 L3 Python
    aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
  • awesome-embedding-models

    6.0 0.0 Jupyter Notebook
    A curated list of awesome embedding models tutorials, projects and communities.
  • chardet

    5.9 6.3 L4 Python
    Python character encoding detector
  • quepy

    5.8 0.0 L5 Python
    A Python framework to transform natural language questions into queries in a database query language.
  • jellyfish

    5.7 7.2 Python
    🎐 A Python library for approximate and phonetic matching of strings.
  • pymorphy2

    4.6 0.0 Python
    Morphological analyzer / inflection engine for Russian and Ukrainian languages.
  • python-nameparser

    3.8 0.0 L2 Python
    A simple Python module for parsing human names into their individual components
  • trafilatura

    3.1 9.5 Python
    Web scraping library and command-line tool for text discovery and extraction (main content, metadata, comments)
  • Charset Normalizer

    2.4 8.8 Python
    🔎 Like Chardet. 🚀 Package for encoding & language detection. Charset detection.
  • pangu.py

    2.3 0.0 L5 Python
    Paranoid text spacing in Python
  • Project Fluent

    2.1 1.9 Python
    Python implementation of Project Fluent
  • Korean

    2.0 0.0 L4 Python
    A library for Korean morphology. NOT MAINTAINED: use https://github.com/what-studio/tossi instead.
  • Python Left-Right Parser

    1.7 2.7 L4 Python
    Python Parser
  • htmldate

    1.3 8.3 Python
    Fast and robust date extraction from web pages, with Python or on the command-line