spaCy v2.1.7 Release Notes

Release Date: 2019-08-01
  • New features and improvements

    • Add Token.tensor and Span.tensor attributes.
    • Support the simple training format of (text, annotations), instead of only (doc, gold), for nlp.evaluate.
    • Add support for the "lang_factory" setting in the model's meta.json (see #4031).
    • Also support "requirements" in meta.json to define packages for setup's install_requires.
    • Improve Pipe base class methods and make them less presumptuous.
    • Improve Danish and Korean tokenization.
    • Improve error messages when deserializing a model fails.
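
As a rough illustration of the "requirements" setting listed above, the sketch below reads that list from a meta.json string so a packaging script could pass it on as install_requires. The helper name load_requirements and the sample meta values are hypothetical and are not taken from spaCy's own packaging template.

```python
import json


def load_requirements(meta_text):
    """Return the "requirements" list from a meta.json string (hypothetical helper)."""
    meta = json.loads(meta_text)
    # Assumption: a missing "requirements" key means "no extra packages".
    return list(meta.get("requirements", []))


# Sample meta.json content; every value here is made up for illustration.
SAMPLE_META = """
{
    "lang": "en",
    "name": "example_model",
    "version": "0.0.0",
    "requirements": ["example-package>=1.0.0"]
}
"""

# The resulting list is what a setup script would hand to install_requires.
install_requires = load_requirements(SAMPLE_META)
print(install_requires)
```

Treating a missing key as an empty list keeps older meta.json files, which have no "requirements" entry, usable without changes.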

  • Bug fixes

    • Fix issues #3669 and #3962: Fix dependency copy in Span.as_doc that could cause a segfault.
    • Fix issue #3968: Fix bug in per-entity scores.
    • Fix issue #4000: Improve entity linking API.
    • Fix issue #4022: Fix error when Korean text contains special characters.
    • Fix issue #4030: Handle edge case when calling TextCategorizer.predict with an empty Doc.
    • Fix issue #4045: Correct Span.sent docs.
    • Fix issue #4048: Fix init-model command if there's no vocab.
    • Fix issue #4052: Improve per-type scoring of NER.
    • Fix issue #4054: Ensure the lang of nlp and nlp.vocab stay consistent.
    • Fix bugs in Token.similarity and Span.similarity when called via hook.
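
Two of the fixes above concern per-entity-type NER scores. To make the term concrete, here is a minimal sketch of exact-match precision, recall and F-score computed separately for each label. It is not spaCy's Scorer implementation. The function name per_type_scores, the (start, end, label) tuple format and the sample spans are assumptions made for illustration.

```python
def per_type_scores(gold, pred):
    """Compute precision, recall and F-score for each entity label.

    gold and pred are sets of (start, end, label) tuples. A prediction
    counts as correct only if it matches a gold span exactly.
    """
    labels = {label for _, _, label in gold | pred}
    scores = {}
    for label in sorted(labels):
        gold_l = {ent for ent in gold if ent[2] == label}
        pred_l = {ent for ent in pred if ent[2] == label}
        tp = len(gold_l & pred_l)  # true positives for this label
        p = tp / len(pred_l) if pred_l else 0.0
        r = tp / len(gold_l) if gold_l else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        scores[label] = {"p": p, "r": r, "f": f}
    return scores


# Made-up spans: one PERSON found correctly, one of two ORG spans missed.
gold = {(0, 2, "PERSON"), (5, 6, "ORG"), (8, 9, "ORG")}
pred = {(0, 2, "PERSON"), (5, 6, "ORG"), (10, 11, "ORG")}
scores = per_type_scores(gold, pred)
print(scores)
```

Scoring each label on its own shows that a weak label (ORG here) can hide behind a strong one when only the overall score is reported.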

  • Documentation and examples

  • Contributors

    Thanks to @sorenlind, @pmbaumgartner, @svlandeg, @FallakAsad, @BreakBB, @adrianeboyd, @polm, @b1uec0in, @mdaudali and @ejarkm for the pull requests and contributions.