Changelog History
-
v6.1.2 Changes
February 17, 2022
Added type information for guess_bytes.
-
v6.1.1 Changes
February 09, 2022
Updated the heuristic to fix the letter ß in UTF-8/MacRoman mojibake, which had regressed since version 5.6.
Packaging fixes to pyproject.toml.
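For readers unfamiliar with UTF-8/MacRoman mojibake, here is a minimal sketch that uses only Python's standard codecs, not ftfy itself. It assumes ß as the affected letter, and the word "Straße" is an arbitrary example chosen for illustration.

```python
# UTF-8 text that is wrongly decoded as MacRoman turns each non-ASCII
# character into two unrelated symbols.
original = "Straße"  # example word, chosen for illustration

# Encode correctly as UTF-8, then decode with the wrong codec.
mojibake = original.encode("utf-8").decode("mac_roman")
print(mojibake)

# Reversing the two steps recovers the original text. Deciding *when*
# such a reversal is appropriate is the job of ftfy's heuristic.
repaired = mojibake.encode("mac_roman").decode("utf-8")
print(repaired)
```

The round trip itself is mechanical; the hard part, and the part this release adjusted, is recognizing that a string is mojibake in the first place.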
-
v6.1 Changes
February 09, 2022
Updated the heuristic to fix the letter Ñ with more confidence.
Fixed type annotations and added py.typed.
ftfy is packaged using Poetry now, and wheels are created and uploaded to PyPI.
-
v6.0.3 Changes
May 14, 2021
Allow the keyword argument fix_entities as a deprecated alias for unescape_html, raising a warning.
ftfy.formatting functions now disregard ANSI terminal escapes when calculating text width.
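The deprecated-alias pattern mentioned in this release can be sketched in plain Python. This is not ftfy's actual implementation: the function `fix`, its placeholder behavior, and the warning text are all hypothetical, and only the option names fix_entities and unescape_html come from the changelog.

```python
import warnings


def fix(text, unescape_html=True, **kwargs):
    """Hypothetical stand-in for a text-fixing function with a renamed option."""
    if "fix_entities" in kwargs:
        # Accept the old name, but tell the caller it is deprecated.
        warnings.warn(
            "fix_entities is deprecated; use unescape_html instead",
            DeprecationWarning,
            stacklevel=2,
        )
        unescape_html = kwargs.pop("fix_entities")
    if kwargs:
        raise TypeError(f"unexpected keyword arguments: {sorted(kwargs)}")
    # Placeholder behavior for the demo: unescape a single entity.
    return text.replace("&amp;", "&") if unescape_html else text


# Record warnings so the deprecation notice can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = fix("AT&amp;T", fix_entities=False)

print(result)
print([w.category.__name__ for w in caught])
```

Forwarding the old name to the new one keeps existing callers working while nudging them to migrate.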
-
v6.0.2 Changes
May 04, 2021
This version is purely a cosmetic change, updating the maintainer's e-mail address and the project's canonical location on GitHub.
-
v6.0.1 Changes
April 12, 2021
The remove_terminal_escapes step was accidentally not being used. This version restores it.
Specified in setup.py that ftfy 6 requires Python 3.6 or later.
Use a lighter link color when the docs are viewed in dark mode.
-
v6.0 Changes
April 02, 2021
New function: ftfy.fix_and_explain() can describe all the transformations that happen when fixing a string. This is similar to what ftfy.fixes.fix_encoding_and_explain() did in previous versions, but it can fix more than the encoding.
fix_and_explain() and fix_encoding_and_explain() are now in the top-level ftfy module.
Changed the heuristic entirely. ftfy no longer needs to categorize every Unicode character, but only characters that are expected to appear in mojibake.
Because of the new heuristic, ftfy will no longer have to release a new version for every new version of Unicode. It should also run faster and use less RAM when imported.
The heuristic ftfy.badness.is_bad(text) can be used to determine whether there appears to be mojibake in a string. Some users were already using the old function sequence_weirdness() for that, but this one is actually designed for that purpose.
Instead of a pile of named keyword arguments, ftfy functions now take in a TextFixerConfig object. The keyword arguments still work, and become settings that override the defaults in TextFixerConfig.
Added support for UTF-8 mixups with Windows-1253 and Windows-1254.
Overhauled the documentation: https://ftfy.readthedocs.org
-
v5.9 Changes
February 10, 2021
This version is brought to you by the letter à and the number 0xC3.
Tweaked the heuristic to decode, for example, "Ã " (with a non-breaking space) as the letter "à" more often.
This combines with the non-breaking-space fixer to decode "Ã " (with an ordinary space) as "à" as well. However, in many cases, the text " Ã " was intended to be " à ", preserving the space -- the underlying mojibake had two spaces after it, but the Web coalesced them into one. We detect this case based on common French and Portuguese words, and preserve the space when it appears intended.
Thanks to @zehavoc for bringing to my attention how common this case is.
Updated the data file of Unicode character categories to Unicode 13, as used in Python 3.9. (No matter what version of Python you're on, ftfy uses the same data.)
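The space-preservation problem described in this release can be reproduced with the standard library alone. The sketch below is a simplified model, not ftfy's code: the phrase is an arbitrary example, and the regular expression only approximates how a web page might coalesce whitespace.

```python
import re

original = "voilà le travail"  # example phrase, chosen for illustration

# UTF-8 bytes read as Latin-1: "à" (bytes C3 A0) becomes "Ã" followed by
# a non-breaking space, and the real space after it is still there.
mojibake = original.encode("utf-8").decode("latin-1")

# Simplified model of whitespace coalescing: the run
# "non-breaking space + space" collapses into one ordinary space.
collapsed = re.sub("[\u00a0 ]+", " ", mojibake)

# Naive repair: assume the space after "Ã" was the lost non-breaking
# space, restore it, and reverse the mis-decoding.
naive = (
    collapsed.replace("\u00c3 ", "\u00c3\u00a0")
    .encode("latin-1")
    .decode("utf-8")
)

print(repr(mojibake))
print(repr(collapsed))
print(repr(naive))
```

The naive repair recovers the "à" but swallows the space between the two words, which is the situation the word-based check in this release is meant to catch.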
-
v5.8 Changes
July 17, 2020
Improved detection of UTF-8 mojibake of Greek, Cyrillic, Hebrew, and Arabic scripts.
Fixed the undeclared dependency on setuptools by removing the use of pkg_resources.
-
v5.7 Changes
February 18, 2020
Updated the data file of Unicode character categories to Unicode 12.1, as used in Python 3.8. (No matter what version of Python you're on, ftfy uses the same data.)
Corrected an omission where short sequences involving the ACUTE ACCENT character were not being fixed.