Charset Normalizer v2.0.0 Release Notes
Release Date: 2021-07-02 // almost 2 years ago-
π Changed
- π 4x to 5 times faster than the previous 1.4.0 release. At least 2x faster than Chardet.
- Accent has been made on UTF-8 detection, should perform rather instantaneous.
- The backward compatibility with Chardet has been greatly improved. The legacy detect function returns an identical charset name whenever possible.
- The detection mechanism has been slightly improved, now Turkish content is detected correctly (most of the time)
- The program has been rewritten to ease the readability and maintainability. (+Using static typing)+
- utf_7 detection has been reinstated.
β Removed
- π¦ This package no longer require anything when used with Python 3.5 (Dropped cached_property)
- β Removed support for these languages: Catalan, Esperanto, Kazakh, Baque, VolapΓΌk, Azeri, Galician, Nynorsk, Macedonian, and Serbocroatian.
- π The exception hook on UnicodeDecodeError has been removed.
π Deprecated
- Methods coherence_non_latin, w_counter, chaos_secondary_pass of the class CharsetMatch are now deprecated and scheduled for removal in v3.0
π Fixed
- The CLI output used the relative path of the file(s). Should be absolute.