Changelog History

  • v5.6 Changes

    August 07, 2019
    • ๐Ÿ‘ The unescape_html function now supports all the HTML5 entities that appear in html.entities.html5, including those with long names such as ˝.

    • Unescaping of numeric HTML entities now uses the standard library's html.unescape, making edge cases consistent.

    (The reason we don't run html.unescape on all text is that it's not always appropriate to apply, and can lead to false positive fixes. The text "This&NotThat" should not have "&Not" replaced by a symbol, as html.unescape would do.)

    • ๐Ÿ‘ On top of Python's support for HTML5 entities, ftfy will also convert HTML escapes of common Latin capital letters that are (nonstandardly) written in all caps, such as Ñ for ร‘.
  • v5.5.1 Changes

    September 14, 2018
    • Added Python 3.7 support.

    • โšก๏ธ Updated the data file of Unicode character categories to Unicode 11, as used in Python 3.7.0. (No matter what version of Python you're on, ftfy uses the same data.)

  • v5.5 Changes

    September 06, 2018
    • Recent versions have emphasized making a reasonable attempt to fix short, common mojibake sequences, such as Ã». In this version, we've expanded the heuristics to recognize these sequences in MacRoman as well as Windows-125x encodings.

    • ๐Ÿ A related rule for fixing isolated Windows-1252/UTF-8 mixups, even when they were inconsistent with the rest of the string, claimed to work on Latin-1/UTF-8 mixups as well, but in practice it didn't. We've made the rule more robust.

    • Fixed a failure when testing the CLI on Windows.

    • Removed the pytest-runner invocation from setup.py, as it created complex dependencies that would stop setup.py from working in some environments. The pytest command still works fine. pytest-runner is just too clever.
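
To make the MacRoman note concrete, here is a minimal sketch of how the same UTF-8 bytes turn into different mojibake depending on which single-byte codec misreads them. The helper names make_mojibake and undo_mojibake are hypothetical and are not part of ftfy. They show only the underlying byte-level mixup, not ftfy's heuristics for deciding when a repair is safe.

```python
def make_mojibake(text: str, wrong_encoding: str) -> str:
    """Encode as UTF-8, then decode the bytes with the wrong single-byte codec."""
    return text.encode("utf-8").decode(wrong_encoding)


def undo_mojibake(garbled: str, wrong_encoding: str) -> str:
    """Reverse the mixup: re-encode with the wrong codec, then decode as UTF-8."""
    return garbled.encode(wrong_encoding).decode("utf-8")


original = "\u00fb"  # LATIN SMALL LETTER U WITH CIRCUMFLEX

# The same two UTF-8 bytes read through two different legacy codecs.
as_windows_1252 = make_mojibake(original, "windows-1252")
as_mac_roman = make_mojibake(original, "mac_roman")

for codec, garbled in [("windows-1252", as_windows_1252), ("mac_roman", as_mac_roman)]:
    print(codec, repr(garbled), "->", repr(undo_mojibake(garbled, codec)))
```

The two garbled forms differ, so a heuristic tuned only to Windows-125x patterns would not recognize the MacRoman variant.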

  • v5.4.1 Changes

    June 14, 2018
    • Fixed a bug in the setup.py metadata.

    This bug was causing ftfy, a package that fixes encoding mismatches, to not install in some environments due to an encoding mismatch. (We were really putting the "meta" in "metadata" here.)

  • v5.4 Changes

    June 01, 2018
    • ftfy was still too conservative about fixing short mojibake sequences, such as "aoÃ»t" -> "août", when the broken version contained punctuation such as curly or angle quotation marks.

    The new heuristic observes in some cases that, even if quotation marks are expected to appear next to letters, it is strange to have an accented capital A before the quotation mark and more letters after the quotation mark.

    • Provides better metadata for the new PyPI.

    • Switched from nosetests to pytest.
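
The reason quotation marks show up in this kind of mojibake can be seen from the bytes. For several accented letters, the second UTF-8 byte is the Windows-1252 code for a curly or angle quotation mark. The sketch below illustrates this with the standard codecs only. The helper name windows_1252_mojibake is hypothetical, and the code does not reproduce ftfy's heuristic.

```python
import unicodedata


def windows_1252_mojibake(text: str) -> str:
    """Misread UTF-8 bytes as Windows-1252 (hypothetical helper, not ftfy's API)."""
    return text.encode("utf-8").decode("windows-1252")


# Each of these letters becomes a capital A with tilde followed by a quotation mark.
for letter in "\u00fb\u00eb\u00d3\u00d4":
    garbled = windows_1252_mojibake(letter)
    print(repr(letter), "->", [unicodedata.name(ch) for ch in garbled])

# The example from the note: the broken word contains an angle quotation mark.
garbled_word = windows_1252_mojibake("ao\u00fbt")
repaired_word = garbled_word.encode("windows-1252").decode("utf-8")
print(repr(garbled_word), "->", repr(repaired_word))
```
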

  • v5.3 Changes

    January 25, 2018
    • A heuristic has been too conservative since version 4.2, causing a regression compared to previous versions: ftfy would fail to fix mojibake of common characters such as á when seen in isolation. A new heuristic now makes it possible to fix more of these common cases with less evidence.
  • v5.2 Changes

    November 27, 2017
    • The command-line tool will not accept the same filename as its input and output. (Previously, this would write a zero-length file.)

    • The uncurl_quotes fixer, which replaces curly quotes with straight quotes, now also replaces MODIFIER LETTER APOSTROPHE.

    • Codepoints that contain two Latin characters crammed together for legacy encoding reasons are replaced by those two separate characters, even in NFC mode. We formerly did this just with ligatures such as ﬁ and Ĳ, but now this includes the Afrikaans digraph ŉ and Serbian/Croatian digraphs such as ǆ.
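
The phrase "even in NFC mode" matters because NFC on its own leaves these codepoints alone. The sketch below uses the standard unicodedata module to show that each one survives NFC unchanged but has a compatibility decomposition into separate characters. NFKC is used here only to reveal that decomposition. The changelog does not say ftfy applies full NFKC, so treat this as an illustration, not as ftfy's implementation.

```python
import unicodedata

# Codepoints mentioned in the note above, written as escapes for clarity:
# ligature fi, ligature IJ, n preceded by apostrophe, small letter dz with caron.
crammed = ["\ufb01", "\u0132", "\u0149", "\u01c6"]

results = {}
for ch in crammed:
    nfc = unicodedata.normalize("NFC", ch)
    nfkc = unicodedata.normalize("NFKC", ch)
    results[ch] = (nfc, nfkc)
    print(
        unicodedata.name(ch),
        "| unchanged by NFC:", nfc == ch,
        "| compatibility form:", [unicodedata.name(c) for c in nfkc],
    )
```

The compatibility form of ŉ begins with MODIFIER LETTER APOSTROPHE (U+02BC), the same character the uncurl_quotes note mentions.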

  • v5.1.1 Changes

    May 15, 2017

    These releases (v5.1.1 and v4.4.3) fix two unrelated problems with the tests, one in each version.

    • v5.1.1: fixed the CLI tests (which are new in v5) so that they pass on Windows, as long as the Python output encoding is UTF-8.

    • v4.4.3: added the # coding: utf-8 declaration to two files that were missing it, so that tests can run on Python 2.

  • v5.1 Changes

    April 07, 2017
    • Removed the dependency on html5lib by dropping support for Python 3.2.

    We previously used the dictionary html5lib.constants.entities to decode HTML entities. In Python 3.3 and later, that exact dictionary is now in the standard library as html.entities.html5.

    • Moved many test cases about how particular text should be fixed into test_cases.json, which may ease porting to other languages.

    The functionality of this version remains the same as 5.0.2 and 4.4.2.

  • v5.0.2 Changes

    March 21, 2017

    Added a MANIFEST.in that puts files such as the license file and this changelog inside the source distribution.