Changelog History

  • v3.3.1 Changes

    December 12, 2014

    This version restores compatibility with Python 2.6.

  • v3.3.0 Changes

    August 16, 2014

    Heuristic changes:

    • Certain symbols are marked as "ending punctuation" that may naturally occur after letters. When they follow an accented capital letter and look like mojibake, they will not be "fixed" without further evidence. An example is that "MARQUÉ…" will become "MARQUÉ...", and not "MARQUɅ".

    New features:

    • ftfy.explain_unicode is a diagnostic function that shows you what's going on in a Unicode string. It shows you a table with each code point in hexadecimal, its glyph, its name, and its Unicode category.

    • ftfy.fixes.decode_escapes adds a feature missing from the standard library: it lets you decode a Unicode string with backslashed escape sequences in it (such as "\u2014") the same way that Python itself would.

    • ftfy.streamtester is a release of the code that I use to test ftfy on an endless stream of real-world data from Twitter. With the new heuristics, the false positive rate of ftfy is about 1 per 6 million tweets. (See the "Accuracy" section of the documentation.)
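
As a rough illustration of what explain_unicode and decode_escapes do, here is a stdlib-only sketch. The names `explain_rows` and `decode_escapes_sketch` are hypothetical and are not part of ftfy's API; ftfy's own implementations and output format may differ.

```python
import codecs
import re
import unicodedata

# Backslash escapes that Python string literals understand.
ESCAPE_RE = re.compile(
    r"""
    ( \\U[0-9A-Fa-f]{8}    # 8-digit hex escapes
    | \\u[0-9A-Fa-f]{4}    # 4-digit hex escapes
    | \\x[0-9A-Fa-f]{2}    # 2-digit hex escapes
    | \\[0-7]{1,3}         # octal escapes
    | \\N\{[^}]+\}         # characters by Unicode name
    | \\[\\'"abfnrtv]      # single-character escapes
    )""",
    re.VERBOSE,
)


def decode_escapes_sketch(text):
    """Decode only the escape sequences; all other characters pass through."""
    def decode_match(match):
        return codecs.decode(match.group(0), "unicode_escape")

    return ESCAPE_RE.sub(decode_match, text)


def explain_rows(text):
    """Return one (code point, glyph, name, category) row per character."""
    rows = []
    for ch in text:
        rows.append((
            "U+{:04X}".format(ord(ch)),
            ch,
            unicodedata.name(ch, "<unknown>"),
            unicodedata.category(ch),
        ))
    return rows
```

Decoding one matched escape at a time is what keeps non-ASCII characters elsewhere in the string untouched.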

    Deprecations:

    • Python 2.6 is no longer supported.

    • remove_unsafe_private_use is no longer needed in any current version of Python. This fixer will disappear in a later version of ftfy.

  • v3.2.0 Changes

    June 27, 2014
    • fix_line_breaks fixes three additional characters that are considered line breaks in some environments, such as JavaScript and Python's "codecs" library. These are all now replaced with \n:

      • U+0085 <control>, with alias "NEXT LINE"
      • U+2028 LINE SEPARATOR
      • U+2029 PARAGRAPH SEPARATOR
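
The replacement of these three characters can be sketched with `str.translate`. The helper name `replace_extra_line_breaks` is hypothetical; this is not ftfy's implementation, and it covers only the three characters listed above.

```python
# Map each of the three extra line-break characters to "\n".
EXTRA_LINE_BREAKS = {
    0x0085: "\n",  # NEXT LINE
    0x2028: "\n",  # LINE SEPARATOR
    0x2029: "\n",  # PARAGRAPH SEPARATOR
}


def replace_extra_line_breaks(text):
    """Replace U+0085, U+2028, and U+2029 with a plain newline."""
    return text.translate(EXTRA_LINE_BREAKS)
```

Python's own `str.splitlines` also treats these three characters as line boundaries, which is one way they can split text unexpectedly if left in place.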

  • v3.1.3 Changes

    May 15, 2014
    • Fix utf-8-variants so it never outputs surrogate codepoints, even on Python 2 where that would otherwise be possible.
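
To show why surrogates come up at all: CESU-8 encodes a character outside the Basic Multilingual Plane as two 3-byte surrogates, and a decoder has to merge each pair back into one code point. The sketch below does this with the standard library on Python 3.4 or later. `decode_cesu8_sketch` is a hypothetical name, this is not how ftfy's utf-8-variants codec is implemented, and it does not handle Java's two-byte encoding of U+0000.

```python
def decode_cesu8_sketch(data):
    """Decode CESU-8 bytes without leaving surrogates in the result."""
    # "surrogatepass" lets the UTF-8 decoder accept 3-byte encoded surrogates.
    text = data.decode("utf-8", errors="surrogatepass")
    # Round-trip through UTF-16 so each high/low pair merges into a single
    # code point; unpaired surrogates are replaced with U+FFFD.
    utf16 = text.encode("utf-16-le", errors="surrogatepass")
    return utf16.decode("utf-16-le", errors="replace")
```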
  • v3.1.2 Changes

    January 29, 2014
    • Fix a bug in 3.1.1 where strings with backslashes in them could never be fixed.
  • v3.1.1 Changes

    January 29, 2014
    • Add the ftfy.bad_codecs package, which registers new codecs that can decode things that Python may otherwise refuse to decode:

      • utf-8-variants, which decodes CESU-8 and its Java lookalike
      • sloppy-windows-*, which decodes character-map encodings while treating unmapped characters as Latin-1
    • Simplify the code by using ftfy.bad_codecs.
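
The "sloppy" behavior can be sketched for Windows-1252 as follows: bytes the code page leaves unmapped fall back to the Latin-1 code point with the same number. `sloppy_decode_1252` is a hypothetical helper, not ftfy's codec; unlike ftfy.bad_codecs it registers nothing with the `codecs` module and decodes one byte at a time for clarity rather than speed.

```python
def sloppy_decode_1252(data):
    """Decode bytes as Windows-1252, treating unmapped bytes as Latin-1."""
    chars = []
    for byte in data:
        raw = bytes([byte])
        try:
            chars.append(raw.decode("windows-1252"))
        except UnicodeDecodeError:
            # Unmapped in Windows-1252 (for example 0x81): use Latin-1 instead.
            chars.append(raw.decode("latin-1"))
    return "".join(chars)
```

For example, `sloppy_decode_1252(b"\x93caf\xe9\x94\x81")` decodes the curly quotes and "é" normally and keeps the unmapped byte 0x81 as U+0081 instead of raising an error.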

  • v3.0.6 Changes

    November 05, 2013
    • fix_entities can now be True, False, or 'auto'. The new case is True, which will decode all entities, even in text that already contains angle brackets. This may also be faster, because it doesn't have to check.
    • build_data.py will refuse to run on Python < 3.3, to prevent building an inconsistent data file.
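
The three settings of fix_entities can be sketched with `html.unescape` from the standard library. `unescape_entities_sketch` is a hypothetical function, and its 'auto' rule (skip decoding when the text contains "<") is an assumption about what "contains angle brackets" means; ftfy's actual check may differ.

```python
import html


def unescape_entities_sketch(text, fix_entities="auto"):
    """Decode HTML entities according to a True / False / 'auto' setting."""
    if fix_entities == "auto":
        # Assumption: a "<" suggests real HTML, so entities are left alone.
        fix_entities = "<" not in text
    if fix_entities:
        return html.unescape(text)
    return text
```

With `fix_entities=True` the check is skipped entirely, which is why that setting can be faster.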
  • v3.0.5 Changes

    November 01, 2013
    • Fix the arguments to fix_file, because they were totally wrong.
  • v3.0.4 Changes

    October 01, 2013
    • Restore compatibility with Python 2.6.
  • v3.0.3 Changes

    September 09, 2013
    • Fixed an ugly regular expression bug that prevented ftfy from importing on a narrow build of Python.