ftfy v6.0 Release Notes
Release Date: 2021-04-02
New function: ftfy.fix_and_explain() can describe all the transformations that happen when fixing a string. This is similar to what ftfy.fixes.fix_encoding_and_explain() did in previous versions, but it can fix more than the encoding.
fix_and_explain() and fix_encoding_and_explain() are now in the top-level ftfy module.
Changed the heuristic entirely. ftfy no longer needs to categorize every Unicode character, but only characters that are expected to appear in mojibake.
Because of the new heuristic, ftfy will no longer have to release a new version for every new version of Unicode. It should also run faster and use less RAM when imported.
The heuristic ftfy.badness.is_bad(text) can be used to determine whether there appears to be mojibake in a string. Some users were already using the old function sequence_weirdness() for that, but this one is actually designed for that purpose.
Instead of a pile of named keyword arguments, ftfy functions now take in a TextFixerConfig object. The keyword arguments still work, and become settings that override the defaults in TextFixerConfig.
Added support for UTF-8 mixups with Windows-1253 and Windows-1254.
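To make the kind of mixup concrete, the sketch below reproduces it with Python's standard codecs only; ftfy itself is not called. The sample words are assumptions chosen so that every byte is defined in the respective code page, and the manual round trip is only an illustration of the general idea, not ftfy's actual implementation.

```python
# The same text, encoded as UTF-8 and then wrongly decoded with a
# single-byte Windows code page, turns into mojibake.
greek = "αλφα"      # sample Greek word (illustrative)
turkish = "ışık"    # sample Turkish word (illustrative)

greek_garbled = greek.encode("utf-8").decode("cp1253")      # Windows-1253
turkish_garbled = turkish.encode("utf-8").decode("cp1254")  # Windows-1254

print(greek_garbled)
print(turkish_garbled)

# Reversing the wrong decoding step recovers the original text.
greek_restored = greek_garbled.encode("cp1253").decode("utf-8")
turkish_restored = turkish_garbled.encode("cp1254").decode("utf-8")

print(greek_restored)
print(turkish_restored)
```

In real data the hard part is deciding whether a string is mojibake at all and which code page was involved, which is the job of the heuristic described earlier in these notes.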
Overhauled the documentation: https://ftfy.readthedocs.org