34 Specific Formats Processing packages and projects

  • PyPDF2

    8.7 9.6 L2 Python
    A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
  • WeasyPrint

    8.4 9.5 L1 Python
    The awesome document factory
  • PyMuPDF

    8.3 9.7 Python
    PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
  • PDFMiner

    8.3 0.0 L3 Python
    DISCONTINUED. Python PDF Parser (Not actively maintained). Check out pdfminer.six.
  • csvkit

    8.1 8.1 L3 Python
    A suite of utilities for converting to and working with CSV, the king of tabular file formats.
  • python-docx

    8.1 7.6 L5 Python
    Create and modify Word documents with Python
  • tablib

    7.8 6.4 L4 Python
    Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.
  • Python-Markdown

    7.7 7.5 Python
    A Python implementation of John Gruber’s Markdown with Extension support.
  • XlsxWriter

    7.5 8.6 L3 Python
    A Python module for creating Excel XLSX files.
  • Kaitai Struct

    7.3 6.4 Shell
    Kaitai Struct: declarative language to generate binary data parsers in C++ / C# / Go / Java / JavaScript / Lua / Nim / Perl / PHP / Python / Ruby
  • xlwings

    7.1 8.9 L4 Python
    xlwings is a Python library that makes it easy to call Python from Excel and vice versa. It works with Excel on Windows and macOS as well as with Google Sheets and Excel on the web.
  • Camelot

    7.1 9.6 Python
    A Python library to extract tabular data from PDFs
  • python-pptx

    6.9 6.9 Python
    Create Open XML PowerPoint documents in Python
  • borb

    6.8 5.5 Python
    borb is a library for reading, creating and manipulating PDF files in Python.
  • markdown2

    6.7 8.4 Python
    markdown2: A fast and complete implementation of Markdown in Python
  • unoconv

    6.7 0.0 Python
    DISCONTINUED. Universal Office Converter - Convert between any document format supported by LibreOffice/OpenOffice.
  • Mistune

    6.5 7.5 L4 Python
    A fast yet powerful Python Markdown parser with renderers and plugins.
  • pdftabextract

    6.4 0.0 L3 Python
    A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
  • docxtpl

    6.4 7.4 Python
    Use a docx as a jinja2 template
  • Kreuzberg

    5.4 9.4 Python
    A text extraction library supporting PDFs, images, office documents and more
  • xlwt

    5.4 0.0 L3 Python
    DISCONTINUED. Writing and reading data and formatting information from Excel files.
  • pyexcel

    5.0 8.9 L5 Python
    Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files
  • pymorphy2

    4.9 0.0 Python
    Morphological analyzer / inflection engine for Russian and Ukrainian languages.
  • openpyxl

    4.4 -
    A library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files.
  • mistletoe

    4.4 5.8 Python
    A fast, extensible and spec-compliant Markdown parser in pure Python.
  • unp

    3.4 0.0 L5 Python
    Unpacks things.
  • ReportLab

    3.4 -
    Allows rapid creation of rich PDF documents.
  • Meltano Singer SDK

    2.5 9.7 Python
    Write 70% less code by using the SDK to build custom extractors and loaders that adhere to the Singer standard: https://sdk.meltano.com
  • vcspull

    2.5 9.1 L4 Python
    🔄 Synchronize projects via yaml/json manifest. Built using `libvcs`.
  • Marmir

    2.4 0.0 L4 Python
    Python powered spreadsheets
  • PyYAML

    2.3 -
    YAML implementations for Python.
  • libvcs

    1.5 9.3 L3 Python
    ⚙️ Lite, typed, pythonic utilities for git, svn, mercurial, etc.
  • Python Schema Matching by XGboost and Sentence-Transformers

    1.3 3.0 Python
    A Python tool that uses XGBoost and sentence-transformers to perform schema matching on tables.
  • relatorio

    -
    Templating OpenDocument files.
