ftfy v4.3.0 Release Notes
Release Date: 2016-12-29
ftfy has gotten by for four years without dependencies on other Python libraries, but now we can spare ourselves some code and some maintenance burden by delegating certain tasks to other libraries that already solve them well. This version now depends on the html5lib and wcwidth libraries.
Feature changes:
- The remove_control_chars fixer will now remove some non-ASCII control characters as well, such as deprecated Arabic control characters and byte-order marks. Bidirectional controls are still left as is.
  This should have no impact on well-formed text, while cleaning up many characters that the Unicode Consortium deems "not suitable for markup" (see Unicode Technical Report #20).
- The unescape_html fixer uses a more thorough list of HTML entities, which it imports from html5lib.
- ftfy.formatting now uses wcwidth to compute the width that a string will occupy in a text console.
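Two of the changes above can be sketched with the standard library alone. This is not ftfy's code. The character set in REMOVABLE is an illustrative assumption (the byte-order mark plus the deprecated format characters U+206A to U+206F), the function names are hypothetical, and the width estimate is a rough stand-in for what wcwidth computes, based on unicodedata rather than wcwidth's own tables.

```python
import unicodedata

# Illustrative set only (an assumption, not ftfy's actual table):
# the byte-order mark plus the deprecated format characters U+206A..U+206F,
# which include the Arabic form-shaping and digit-shape controls.
REMOVABLE = {0xFEFF: None}
REMOVABLE.update({cp: None for cp in range(0x206A, 0x2070)})


def strip_control_chars_sketch(text):
    """Drop the characters in REMOVABLE; leave everything else untouched.

    Bidirectional controls such as U+200F are not in the set, so they
    pass through unchanged.
    """
    return text.translate(REMOVABLE)


def display_width_sketch(text):
    """Estimate console columns for a string.

    Combining marks count as 0 columns, wide or fullwidth East Asian
    characters as 2, and everything else as 1.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width
```

Well-formed text passes through strip_control_chars_sketch unchanged, which matches the expectation stated for remove_control_chars. Code that needs accurate console widths should call wcwidth directly rather than rely on this approximation.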
Heuristic changes:
- Updated the data file of Unicode character categories to Unicode 9, as used in Python 3.6.0. (No matter what version of Python you're on, ftfy uses the same data.)
Pending deprecations:
- The remove_bom option will become deprecated in 5.0, because it has been superseded by remove_control_chars.
- ftfy 5.0 will remove the previously deprecated name fix_text_encoding. It was renamed to fix_encoding in 4.0.
- ftfy 5.0 will require Python 3.2 or later, as planned. Python 2 users, please specify ftfy < 5 in your dependencies if you haven't already.
- The