ftfy v3.3.0 Release Notes
Release Date: 2014-08-16
- Certain symbols are marked as "ending punctuation" that may naturally occur after letters. When they follow an accented capital letter and look like mojibake, they will not be "fixed" without further evidence. An example is that "MARQUÉ…" will become "MARQUÉ...", and not "MARQUɅ".
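The ambiguity behind this heuristic can be reproduced with the standard library alone. The sketch below does not call ftfy. It only shows why "MARQUÉ…" looks like mojibake (its Windows-1252 bytes happen to be valid UTF-8) and, on the assumption that the three-dot form in the expected output comes from NFKC normalization, how that form arises.

```python
import unicodedata

text = "MARQU\u00c9\u2026"  # "MARQUÉ…": accented capital followed by an ellipsis

# In Windows-1252, "É" is byte 0xC9 and "…" is byte 0x85. Read as UTF-8,
# the pair C9 85 is a valid encoding of U+0245, so a naive fixer would
# "repair" the string into something the author never wrote.
reinterpreted = text.encode("cp1252").decode("utf-8")
print(reinterpreted)  # MARQUɅ

# Assumption: the "..." in the expected result comes from NFKC normalization,
# which expands U+2026 HORIZONTAL ELLIPSIS into three full stops.
normalized = unicodedata.normalize("NFKC", text)
print(normalized)  # MARQUÉ...
```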
New features:
- ftfy.explain_unicode is a diagnostic function that shows you what's going on in a Unicode string. It shows you a table with each code point in hexadecimal, its glyph, its name, and its Unicode category.
- ftfy.fixes.decode_escapes adds a feature missing from the standard library: it lets you decode a Unicode string with backslashed escape sequences in it (such as "\u2014") the same way that Python itself would.
- ftfy.streamtester is a release of the code that I use to test ftfy on an endless stream of real-world data from Twitter. With the new heuristics, the false positive rate of ftfy is about 1 per 6 million tweets. (See the "Accuracy" section of the documentation.)
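To make the first two features concrete, here is a rough standard-library sketch of the ideas behind them. It is not ftfy's implementation: the names explain_sketch, decode_escapes_sketch, and ESCAPE_RE are hypothetical, and the output layout is only an approximation of the table described above. With ftfy installed, the real entry points are ftfy.explain_unicode and ftfy.fixes.decode_escapes.

```python
import codecs
import re
import unicodedata


def explain_sketch(text):
    """Return one line per code point: hex value, glyph, category, name."""
    lines = []
    for char in text:
        name = unicodedata.name(char, "<unknown>")
        category = unicodedata.category(char)
        glyph = char if char.isprintable() else "?"
        lines.append("U+{:04X}  {}  [{}] {}".format(ord(char), glyph, category, name))
    return lines


# Hypothetical pattern covering the escape forms Python string literals use.
ESCAPE_RE = re.compile(r"""
    ( \\U[0-9a-fA-F]{8}   # 8-digit hex escapes
    | \\u[0-9a-fA-F]{4}   # 4-digit hex escapes
    | \\x[0-9a-fA-F]{2}   # 2-digit hex escapes
    | \\[0-7]{1,3}        # octal escapes
    | \\N\{[^}]+\}        # characters by name
    | \\[\\'"abfnrtv]     # single-character escapes
    )""", re.VERBOSE)


def decode_escapes_sketch(text):
    """Decode only the escape sequences, leaving other characters untouched."""
    def decode_match(match):
        # Each match is pure ASCII, so the codec cannot corrupt anything.
        return codecs.decode(match.group(0), "unicode_escape")
    return ESCAPE_RE.sub(decode_match, text)


sample = "caf\u00e9 \\u2014 ok"  # a real "é" plus a literal backslash escape

# The codec alone treats every byte as Latin-1, so "é" turns into "Ã©".
naive = sample.encode("utf-8").decode("unicode_escape")

decoded = decode_escapes_sketch(sample)  # "é" survives, the escape is decoded

for line in explain_sketch(decoded):
    print(line)
```

Decoding one matched escape at a time is the design choice that matters here: the non-ASCII text around the escapes never passes through the unicode_escape codec, so it cannot be mangled.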
Deprecations:
- Python 2.6 is no longer supported.
- remove_unsafe_private_use is no longer needed in any current version of Python. This fixer will disappear in a later version of ftfy.