Text Processing packages

Showing projects tagged as Text Processing

  • Jieba

    9.8 0.0 L5 Python
    Jieba Chinese word segmentation
  • httpie

    9.7 7.0 L3 Python
    🥧 HTTPie CLI — modern, user-friendly command-line HTTP client for the API era. JSON support, colors, sessions, downloads, plugins & more.
  • gensim

    9.5 7.5 L3 Python
    Topic Modelling for Humans
  • pydantic

    9.4 9.8 Python
    Data validation using Python type hints
  • MkDocs

    9.4 9.0 L5 Python
    Project documentation with Markdown.
  • Pattern

    9.0 0.0 L2 Python
    Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
  • Jinja2

    9.0 7.0 L3 Python
    A very fast and expressive template engine.
  • TextBlob

    8.8 6.1 L3 Python
    Simple, Pythonic text processing: sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
  • fuzzywuzzy

    8.8 0.0 L4 Python
    Fuzzy String Matching in Python
  • HTTP Prompt

    8.6 0.0 L4 Python
    An interactive command-line HTTP and API testing client built on top of HTTPie featuring autocomplete, syntax highlighting, and more. https://twitter.com/httpie
  • Sphinx

    8.6 9.8 L2 Python
    The Sphinx documentation generator
  • Stanza

    8.5 9.7 Python
    Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
  • PDFMiner

    8.3 0.0 L3 Python
    Python PDF Parser (Not actively maintained). Check out pdfminer.six.
  • WeasyPrint

    8.3 9.4 L1 Python
    The awesome document factory
  • xmltodict

    8.0 0.6 L4 Python
    Python module that makes working with XML feel like you are working with JSON
  • coala

    8.0 0.0 L4 Python
    coala provides a unified command-line interface for linting and fixing all your code, regardless of the programming languages you use.
  • pypinyin (Chinese character to Pinyin conversion tool, Python version)

    7.9 7.0 Python
    Converts Chinese characters to Pinyin (pypinyin)
  • Lark

    7.7 7.5 Python
    Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
  • Python-Markdown

    7.7 8.0 Python
    A Python implementation of John Gruber’s Markdown with Extension support.
  • sqlparse

    7.6 8.2 L4 Python
    A non-validating SQL parser module for Python
  • PyMuPDF

    7.5 9.8 Python
    PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
  • sumy

    7.4 6.7 L5 Python
    Module for automatic summarization of text documents and HTML pages.
  • Pygments

    7.3 -
    A generic syntax highlighter.
  • phonenumbers

    7.2 8.3 L4 Python
    Python port of Google's libphonenumber
  • ftfy

    7.1 5.7 L4 Python
    Fixes mojibake and other glitches in Unicode text, after the fact.
  • asciimatics

    7.1 7.6 L2 Python
    A cross-platform package for curses-like operations, plus higher-level APIs and widgets to create text UIs and ASCII art animations
  • TextDistance

    6.9 7.0 Python
    📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
  • PLY

    6.9 1.0 L2 Python
    Python Lex-Yacc
  • lxml

    6.9 9.5 L2 Python
    The lxml XML toolkit for Python
  • percol

    6.9 0.0 L4 Python
    Adds a flavor of interactive filtering to the traditional pipe concept of the UNIX shell