spaCy v2.1.5 release notes (2019-07-12)

Release Date: 2019-07-12

🍱 ✨ New features and improvements
- 🆕 NEW: Base language data for Marathi and Korean (via mecab-ko, mecab-ko-dic and natto-py).
- 👌 Improve language data for Lithuanian, Spanish, Kannada, French, Norwegian and Hindi.
- ➕ Add evaluation metrics per entity type.
- ➕ Add resume logic to spacy pretrain.
- ➕ Add optional id property to EntityRuler patterns.
- 👍 Better introspection and IDE autocomplete for custom extension attributes.
- 📄 Make Doc.is_sentenced always return True for single-token docs.
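
To illustrate what "evaluation metrics per entity type" means, here is a minimal sketch in plain Python. It is not spaCy's Scorer implementation: the function name `per_type_scores`, the `(start, end, label)` span tuples and the exact-match criterion are all assumptions made for this example.

```python
from collections import Counter


def per_type_scores(gold_spans, pred_spans):
    """Return precision, recall and F-score for each entity label.

    Hypothetical helper: spans are (start, end, label) tuples, and a
    prediction counts as correct only if all three fields match a gold span.
    """
    gold, pred = set(gold_spans), set(pred_spans)
    tp, fp, fn = Counter(), Counter(), Counter()

    # Each predicted span is a true positive or a false positive for its label.
    for span in pred:
        (tp if span in gold else fp)[span[2]] += 1
    # Each gold span that was not predicted is a false negative for its label.
    for span in gold - pred:
        fn[span[2]] += 1

    scores = {}
    for label in sorted(set(tp) | set(fp) | set(fn)):
        n_pred = tp[label] + fp[label]
        n_gold = tp[label] + fn[label]
        p = tp[label] / n_pred if n_pred else 0.0
        r = tp[label] / n_gold if n_gold else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        scores[label] = {"p": p, "r": r, "f": f}
    return scores


# Made-up example data: the GPE span is mislabelled as ORG.
gold = [(0, 1, "ORG"), (3, 4, "GPE"), (6, 8, "PERSON")]
pred = [(0, 1, "ORG"), (3, 4, "ORG"), (6, 8, "PERSON")]
scores = per_type_scores(gold, pred)
for label, result in scores.items():
    print(label, result)
```

Breaking the scores down by label shows which entity types a model handles poorly, which a single aggregate score hides. In this made-up example the mislabelled span costs ORG precision and GPE recall, while PERSON is unaffected.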
🍱 🔴 Bug fixes
- 🛠 Fix issue #3490: Add evaluation metrics per entity type to Scorer.
- 🛠 Fix issue #3526: Serialize EntityRuler settings correctly.
- 🛠 Fix issue #3558: Improve E024 error message for incorrect GoldParse.
- 🛠 Fix issue #3611: Fix bug when setting ngram parameter in text classifier.
- 🛠 Fix issue #3625: Improve default punctuation rules for Hindi.
- 🛠 Fix issue #3707: Improve introspection of custom attributes.
- 🛠 Fix issue #3737: Check if component is callable in Language.replace_pipe.
- 🛠 Fix issue #3743: Fix documentation of lex_id.
- 🛠 Fix issue #3749: Change vector training script to work with latest Gensim.
- 🛠 Fix issues #3762 and #3934: Make Doc.is_sentenced default to True for single-token Docs.
- 🛠 Fix issue #3802: Fix typo in docs example.
- 🛠 Fix issue #3811: Fix type of --seed option in spacy pretrain.
- 🛠 Fix issue #3822: Allow passing PhraseMatcher arguments to EntityRuler.
- 🛠 Fix issue #3839: Ensure the Matcher returns correct match IDs when used with operators.
- 🛠 Fix issue #3840: Improve error messages in spacy pretrain.
- 🛠 Fix issue #3853: Rename vectors if multiple models are loaded to prevent clashes.
- 🛠 Fix issue #3859: Update pretrain to prevent unintended overwriting of weight files.
- 🛠 Fix issue #3862: Fix matcher callback example.
- 🛠 Fix issue #3868: Add "v.s." to English tokenizer exceptions.
- 🛠 Fix issue #3869: Make Doc.count_by work as expected.
- 🛠 Fix issue #3880: Fix unflatten padding in Thinc when last element is empty.
- 🛠 Fix issue #3882: Exclude user_data when copying doc in displaCy.
- 🛠 Fix issue #3892: Update Tokenizer initialization docs.
- 🛠 Fix issue #3912: Make text classifier raise more friendly errors.
📚 📖 Documentation and examples
- Add documentation for Scorer, Language.evaluate and gold.docs_to_json.
- 🛠 Fix various typos and inconsistencies.
👥 Contributors

Thanks to @BreakBB, @ujwal-narayan, @estr4ng7d, @maknotavailable, @ramananbalakrishnan, @nipunsadvilkar, @NirantK, @munozbravo, @intrafindBreno, @Azagh3l, @jarib, @tokestermw, @polm, @skrcode, @kabirkhan, @demongolem, @elbaulp, @clarus, @BramVanroy, @rokasramas, @askhogan, @khellan, @kognate, @cedar101 and @yash1994 for the pull requests and contributions.
