33 Specific Formats Processing packages and projects

  • PyPDF2

    8.6 9.5 L2 Python
    A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files.
  • PDFMiner

    8.3 0.0 L3 Python
    Python PDF Parser (Not actively maintained). Check out pdfminer.six.
  • WeasyPrint

    8.3 9.3 L1 Python
    The awesome document factory
  • csvkit

    8.2 8.8 L3 Python
    A suite of utilities for converting to and working with CSV, the king of tabular file formats.
  • python-docx

    8.1 8.5 L5 Python
    Create and modify Word documents with Python
  • tablib

    7.9 7.1 L4 Python
    Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.
  • Python-Markdown

    7.7 8.0 Python
    A Python implementation of John Gruber’s Markdown with Extension support.
  • XlsxWriter

    7.5 8.1 L3 Python
    A Python module for creating Excel XLSX files.
  • PyMuPDF

    7.4 9.8 Python
    PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
  • Kaitai Struct

    7.2 6.7 Shell
    Kaitai Struct: declarative language to generate binary data parsers in C++ / C# / Go / Java / JavaScript / Lua / Nim / Perl / PHP / Python / Ruby
  • xlwings

    7.1 8.5 L4 Python
    xlwings is a Python library that makes it easy to call Python from Excel and vice versa. It works with Excel on Windows and macOS as well as with Google Sheets and Excel on the web.
  • borb

    6.8 5.4 Python
    borb is a library for reading, creating and manipulating PDF files in Python.
  • markdown2

    6.7 8.5 Python
    markdown2: A fast and complete implementation of Markdown in Python
  • unoconv

    6.7 0.0 Python
    Universal Office Converter - Convert between any document format supported by LibreOffice/OpenOffice.
  • Camelot

    6.7 7.4 Python
    A Python library to extract tabular data from PDFs
  • python-pptx

    6.5 6.6 Python
    Create Open XML PowerPoint documents in Python
  • pdftabextract

    6.5 0.0 L3 Python
    A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
  • Mistune

    6.4 6.0 L4 Python
    A fast yet powerful Python Markdown parser with renderers and plugins.
  • docxtpl

    6.2 3.7 Python
    Use a docx as a jinja2 template
  • xlwt

    5.4 0.0 L3 Python
    Writing and reading data and formatting information from Excel files.
  • pyexcel

    5.0 0.0 L5 Python
    Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files
  • pymorphy2

    4.8 0.0 Python
    Morphological analyzer / inflection engine for Russian and Ukrainian languages.
  • openpyxl

    4.4 -
    A library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files.
  • mistletoe

    4.2 7.8 Python
    A fast, extensible and spec-compliant Markdown parser in pure Python.
  • ReportLab

    3.4 -
    Allows rapid creation of rich PDF documents.
  • unp

    3.3 0.0 L5 Python
    Unpacks things.
  • Marmir

    2.4 0.0 L4 Python
    Python powered spreadsheets
  • vcspull

    2.4 9.4 L4 Python
    🔄 Synchronize projects via yaml/json manifest. Built using `libvcs`.
  • PyYAML

    2.3 -
    YAML implementations for Python.
  • Meltano Singer SDK

    2.2 9.6 Python
    Write 70% less code by using the SDK to build custom extractors and loaders that adhere to the Singer standard: https://sdk.meltano.com
  • libvcs

    1.4 9.4 L3 Python
    ⚙️ Lite, typed, pythonic utilities for git, svn, mercurial, etc.
  • Python Schema Matching by XGboost and Sentence-Transformers

    1.1 3.5 Python
    A Python tool using XGBoost and sentence-transformers to perform schema matching on tables.
  • relatorio

    -
    Templating OpenDocument files.
