Text Processing packages

Showing projects tagged as Specific Formats Processing and Text Processing

  • WeasyPrint

    8.3 8.6 L1 Python
    The awesome document factory
  • PDFMiner

    8.3 0.0 L3 Python
    Python PDF Parser (Not actively maintained). Check out pdfminer.six.
  • Python-Markdown

    7.7 5.3 Python
    A Python implementation of John Gruber’s Markdown with Extension support.
  • PyMuPDF

    6.9 0.0 Python
    PyMuPDF is a high-performance Python library for data extraction, analysis, conversion, and manipulation of PDF (and other) documents.
  • markdown2

    6.8 7.4 Python
    A fast and complete implementation of Markdown in Python.
  • pdftabextract

    6.5 0.0 L3 Python
    A set of tools for extracting tables from PDF files, intended to help with data mining on (OCR-processed) scanned documents.
  • Mistune

    6.4 6.9 L4 Python
    A fast yet powerful Python Markdown parser with renderers and plugins.
  • pymorphy2

    4.8 0.0 Python
    Morphological analyzer / inflection engine for Russian and Ukrainian languages.
  • Construct

    4.5 5.3 Python
    Construct: Declarative data structures for Python that allow symmetric parsing and building.
  • mistletoe

    4.0 0.0 Python
    A fast, extensible and spec-compliant Markdown parser in pure Python.