Changelog History
v0.5.0 Changes
- faster, more efficient code
- dropped support for Python 3.5
v0.4.0 Changes
- new languages: Armenian, Greek, Macedonian, Norwegian (Bokmål), and Polish
- language data reviewed for: Dutch, Finnish, German, Hungarian, Latin, Russian, and Swedish
- Urdu removed from the language list due to issues with the data
- added support for Python 3.10 and dropped support for Python 3.4
- improved decomposition and tokenization algorithms
v0.3.0 Changes
- improved models and disambiguation
- improved tokenization
- extended rules for German
v0.2.2 Changes
- Work on decomposition rules
- Reviewed language data
- Cleaner code
v0.2.1 Changes
- Better decomposition into subwords by greedy algorithm
- First benchmarks and data-based corrections: German, French, English, Spanish
v0.2.0 Changes
- Languages added: Danish, Dutch, Finnish, Georgian, Indonesian, Latin, Latvian, Lithuanian, Luxembourgish, Turkish, Urdu
- Improved word pair coverage
- Tokenization functions added
- Limit greediness and range of potential candidates
v0.1.0 Changes
- First release on PyPI