Charset Normalizer v2.0.0 Release NotesRelease Date: 2021-07-02 // almost 2 years ago
- 🚀 4x to 5 times faster than the previous 1.4.0 release. At least 2x faster than Chardet.
- Accent has been made on UTF-8 detection, should perform rather instantaneous.
- The backward compatibility with Chardet has been greatly improved. The legacy detect function returns an identical charset name whenever possible.
- The detection mechanism has been slightly improved, now Turkish content is detected correctly (most of the time)
- The program has been rewritten to ease the readability and maintainability. (+Using static typing)+
- utf_7 detection has been reinstated.
- 📦 This package no longer require anything when used with Python 3.5 (Dropped cached_property)
- ✂ Removed support for these languages: Catalan, Esperanto, Kazakh, Baque, Volapük, Azeri, Galician, Nynorsk, Macedonian, and Serbocroatian.
- 🚚 The exception hook on UnicodeDecodeError has been removed.
- Methods coherence_non_latin, w_counter, chaos_secondary_pass of the class CharsetMatch are now deprecated and scheduled for removal in v3.0
- The CLI output used the relative path of the file(s). Should be absolute.