
Changelog History

  • v1.6.3 Changes

    July 31, 2019

    fix the msg parser and update the Travis CI build

  • v1.6.2 Changes

    July 16, 2019

    update dependencies and make pocketsphinx optional

  • v1.6.1 Changes

    June 17, 2017

    documentation build fixes

  • v1.6.0 Changes

    April 03, 2017

    psv/tsv parsers, user-provided filename extensions, audio parsing with pocketsphinx, and several other bug fixes

  • v1.5.0 Changes

    November 15, 2016

    python 3 compatibility, improved docx extraction, improved image extraction, and more.

  • v1.4.0 Changes

    October 10, 2015

    ๐Ÿฑ pdf layout preservation, extensionless file support, and several ๐Ÿ› fixes

  • v1.3.0 Changes

    June 23, 2015

    Added .rtf and .msg support

  • v1.2.0 Changes

    January 31, 2015

    Includes support for tiff files and a new --option/-O command line option to pass in arbitrary keyword arguments to parsers, like the language for tesseract OCR

  • v1.1.0 Changes

    October 03, 2014

    support for a variety of formats, including audio (.wav, .mp3, .ogg), csv, scanned pdfs, and htm, plus various bug fixes and internal improvements.

  • v1.0.0 Changes

    August 25, 2014

    Bump in major release comes from a standardization of the byte-string output of textract. This also includes support for spreadsheets (.xls, .xlsx) and e-publications (.epub).
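
The v1.2.0 entry mentions an --option/-O flag that forwards arbitrary keyword arguments to parsers. As a rough illustration of the idea only, the sketch below turns KEYWORD=VALUE strings into a keyword-argument dict and passes it to a parser function. The KEYWORD=VALUE form, `parse_options`, and `fake_parser` are all assumptions made for this example; none of it is textract's actual implementation.

```python
def parse_options(pairs):
    """Convert strings like 'language=nor' into a dict of keyword arguments.

    Hypothetical helper: assumes each option is written as KEYWORD=VALUE.
    """
    kwargs = {}
    for pair in pairs:
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = pair.partition("=")
        key = key.strip()
        if not sep or not key:
            raise ValueError(f"expected KEYWORD=VALUE, got {pair!r}")
        kwargs[key] = value.strip()
    return kwargs


def fake_parser(path, **kwargs):
    """Hypothetical stand-in for a real parser; it only reports what it received."""
    return f"{path} parsed with {sorted(kwargs.items())}"


if __name__ == "__main__":
    # Imagine these strings were collected from repeated -O flags.
    options = parse_options(["language=nor"])
    print(fake_parser("scan.tiff", **options))
```

Forwarding options as plain keyword arguments means the command line does not need to know each parser's settings in advance; every parser simply picks out the keys it understands.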