Text Processing packages

Showing projects tagged as Text Processing

  • httpie

    9.9 6.4 L3 Python
    A command-line HTTP client, a user-friendly cURL replacement.
  • HTTP Prompt

    8.9 2.0 L4 Python
    HTTPie + prompt_toolkit = an interactive command-line HTTP client featuring autocomplete and syntax highlighting.
  • TextBlob

    8.8 4.8 L3 Python
    Providing a consistent API for diving into common NLP tasks.
  • fuzzywuzzy

    8.7 5.4 L4 Python
    Fuzzy String Matching.
  • coala

    8.4 6.6 L4 Python
    Language independent and easily extendable code analysis application.
  • Pattern

    7.9 0.0 L2 Python
    A web mining module for Python.
  • WeasyPrint

    7.7 9.1 L1 Python
    WeasyPrint converts web documents (HTML with CSS, SVG, …) to PDF.
  • gensim

    7.4 8.3 L3 Python
    Topic Modelling for Humans.
  • Pygments

    7.3 -
    A generic syntax highlighter.
  • feedparser

    7.3 6.6 L3 Python
    Universal feed parser.
  • Python-Markdown

    7.0 7.0 HTML
    A Python implementation of John Gruber’s Markdown.
  • ftfy

    7.0 3.9 L4 Python
    Makes Unicode text less broken and more consistent automagically.
  • python-user-agents

    7.0 1.9 L4 Python
    Browser user agent parser.
  • Chinese character to pinyin conversion tool (Python version)

    7.0 7.5 Python
    A tool for converting Chinese characters to pinyin, Python version (pypinyin).
  • sqlparse

    6.8 7.4 L4 Python
    A non-validating SQL parser.
  • markdown2

    6.7 6.8 Python
    markdown2: A fast and complete implementation of Markdown in Python.
  • xhtml2pdf

    6.6 4.3 L1 Python
    HTML/CSS to PDF converter.
  • Levenshtein

    6.4 0.0 L1 C
    Fast computation of Levenshtein distance and string similarity.
  • sumy

    6.3 5.4 L5 Python
    A module for automatic summarization of text documents and HTML pages.
  • python-readability

    6.3 2.2 HTML
    Fast Python port of arc90's readability tool.
  • xmltodict

    6.2 3.6 L4 Python
    Makes working with XML feel like you are working with JSON.
  • quepy

    6.0 0.0 L5 Python
    A Python framework to transform natural language questions into queries in a database query language.
  • lxml

    6.0 8.6 L2 Python
    A very fast, easy-to-use and versatile library for handling HTML and XML.
  • Mistune

    6.0 4.8 L4 Python
    A fast, full-featured pure Python Markdown parser.
  • aeneas

    5.9 0.0 L3 Python
    aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment).
  • python-slugify

    5.9 6.5 L4 Python
    A Python slugify library that translates unicode to ASCII.
  • awesome-embedding-models

    5.5 4.8 Jupyter Notebook
    A curated list of awesome embedding-model tutorials, projects, and communities.
  • Scrapely

    5.5 1.9 HTML
    A pure-Python HTML screen-scraping library.
  • Jinja2

    5.5 8.5 L3 Python
    A modern and designer friendly templating language.
  • pdftabextract

    5.4 0.0 L3 Python
    A set of tools for extracting tables from PDF files, helping with data mining on (OCR-processed) scanned documents.