HTML packages

Showing projects tagged as Text Processing, Utilities, XML, and HTML

  • xhtml2pdf

    6.7 7.0 L1 Python
    A library for converting HTML into PDFs using ReportLab.
  • trafilatura

    6.7 8.8 Python
    A Python package and command-line tool for gathering text from the Web: crawling, scraping, content extraction, and metadata. Outputs TXT, Markdown, CSV, and XML.
  • aeneas

    6.4 0.0 L3 Python
    aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment).
  • Data Extractor

    0.9 5.4 Python
    Combine XPath, CSS Selectors and JSONPath for Web data extraction.