ftfy v4.0.0 Release Notes
Release Date: 2015-04-10
Breaking changes:
- The default normalization form is now NFC, not NFKC. NFKC replaces a large number of characters with 'equivalent' characters, and some of these replacements are useful, but some are not desirable to do by default.
- The fix_text function has some new options that perform more targeted operations that are part of NFKC normalization, such as fix_character_width, without hitting all your text with the huge mallet that is NFKC.
- If you were already using NFC normalization, or in general if you want to preserve the spacing of CJK text, you should be sure to set
- The remove_unsafe_private_use parameter has been removed entirely, after two versions of deprecation. The function name fix_bad_encoding is also gone.
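
To make the NFC versus NFKC distinction above concrete, here is a minimal sketch using only Python's standard `unicodedata` module. It is not ftfy's own code, and the helper name `normalize_both` and the sample strings are made up for illustration. It shows that both forms compose a decomposed accent, while only NFKC rewrites fullwidth letters, the ideographic space, ligatures, and superscripts.

```python
import unicodedata

# Illustrative samples (chosen for this sketch, not taken from ftfy).
samples = {
    "decomposed e-acute": "e\u0301",            # 'e' + combining acute accent
    "fullwidth letters": "\uff21\uff22\uff23",  # fullwidth A, B, C
    "ideographic space": "\u3000",              # wide space used in CJK text
    "fi ligature": "\ufb01",                    # single-character 'fi' ligature
    "superscript two": "x\u00b2",               # x followed by superscript 2
}


def normalize_both(text):
    """Return the (NFC, NFKC) normalizations of text."""
    return (
        unicodedata.normalize("NFC", text),
        unicodedata.normalize("NFKC", text),
    )


for label, text in samples.items():
    nfc, nfkc = normalize_both(text)
    print(f"{label}: NFC={nfc!r} NFKC={nfkc!r}")
```

Only the first sample changes under NFC. The other four change only under NFKC, which is the kind of replacement that may or may not be wanted by default.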
New features:
- Fixers for strange new forms of mojibake, including particularly clear cases of mixed UTF-8 and Windows-1252.
- New heuristics, so that ftfy can fix more stuff, while maintaining approximately zero false positives.
- The command-line tool trusts you to know what encoding your input is in, and assumes UTF-8 by default. You can still tell it to guess with the
- The command-line tool can be configured with options, and can be used as a pipe.
- Recognizes characters that are new in Unicode 7.0, as well as emoji from Unicode 8.0+ that may already be in use on iOS.
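
For readers unfamiliar with the mojibake mentioned above, the following sketch shows how the UTF-8 / Windows-1252 kind arises and why mixed text is harder to repair. It uses only standard codecs and is not ftfy's algorithm. The function names `make_mojibake` and `undo_mojibake` are hypothetical helpers for this illustration.

```python
def make_mojibake(text):
    """Encode as UTF-8, then wrongly decode the bytes as Windows-1252."""
    return text.encode("utf-8").decode("windows-1252")


def undo_mojibake(garbled):
    """Naively reverse the mistake: re-encode as Windows-1252, decode as UTF-8."""
    return garbled.encode("windows-1252").decode("utf-8")


original = "caf\u00e9"               # 'café'
garbled = make_mojibake(original)    # the two UTF-8 bytes of 'é' become 'Ã©'
restored = undo_mojibake(garbled)    # the round trip recovers the original

# Mixed text: one part is garbled, the other part was decoded correctly.
mixed = garbled + " d\u00e9j\u00e0"  # 'cafÃ© déjà'
try:
    undo_mojibake(mixed)
    naive_fix_worked = True
except UnicodeDecodeError:
    # The correctly decoded 'é' re-encodes to a lone byte that is not valid UTF-8.
    naive_fix_worked = False

print(garbled, restored, naive_fix_worked)
```

The naive round trip fails on the mixed string, which suggests why a fixer for mixed cases has to work on parts of the text rather than on the whole string at once.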
Deprecations:
- fix_text_encoding is being renamed again, for conciseness and consistency. It's now simply called fix_encoding. The name fix_text_encoding is still available but emits a warning.
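
The rename described above follows a common pattern: keep the old name as a thin wrapper that warns and then delegates to the new one. The sketch below shows that pattern with the standard `warnings` module. The function bodies, the warning category, and the message are assumptions for illustration, not ftfy's actual implementation.

```python
import warnings


def fix_encoding(text):
    """Placeholder standing in for the real function; returns text unchanged."""
    return text


def fix_text_encoding(text):
    """Deprecated alias: emit a warning, then delegate to fix_encoding."""
    warnings.warn(
        "fix_text_encoding is now known as fix_encoding",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return fix_encoding(text)
```

Callers using the old name keep working and see the warning (when such warnings are enabled), which gives them time to switch to fix_encoding.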
Pending deprecations:
- Python 2.6 support is largely coincidental.
- Python 2.7 support is on notice. If you use Python 2, be sure to pin a version of ftfy less than 5.0 in your requirements.