spaCy v2.1.7 Release Notes
Release Date: 2019-08-01
New features and improvements
- Add Token.tensor and Span.tensor attributes.
- Support simple training format of (text, annotations) instead of only (doc, gold) for nlp.evaluate.
- Add support for "lang_factory" setting in model meta.json (see #4031).
- Also support "requirements" in meta.json to define packages for setup's install_requires.
- Improve Pipe base class methods and make them less presumptuous.
- Improve Danish and Korean tokenization.
- Improve error messages when deserializing model fails.
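As a rough illustration of the simple (text, annotations) format that nlp.evaluate now accepts, the sketch below builds a small evaluation set and checks that the character offsets select the intended entities. The sentences, the labels and the helper names check_offsets and evaluate_simple are assumptions made for this example. evaluate_simple needs a loaded spaCy v2.1.7+ pipeline, so it is defined but not called here.

```python
# Evaluation data in the simple (text, annotations) format.
# The sentences and labels below are made-up examples.
EVAL_DATA = [
    ("I like London", {"entities": [(7, 13, "LOC")]}),
    ("Who is Shaka Khan?", {"entities": [(7, 17, "PERSON")]}),
]


def check_offsets(data):
    """Return the (entity text, label) pairs selected by the character offsets."""
    found = []
    for text, annotations in data:
        for start, end, label in annotations.get("entities", []):
            found.append((text[start:end], label))
    return found


def evaluate_simple(nlp, data):
    """Pass (text, annotations) tuples straight to nlp.evaluate.

    Assumes `nlp` is a loaded spaCy v2.1.7+ pipeline. In spaCy v2 the
    call returns a Scorer whose `scores` attribute holds the metrics.
    """
    scorer = nlp.evaluate(data)
    return scorer.scores


print(check_offsets(EVAL_DATA))
```

Checking the offsets before evaluation is a cheap way to catch misaligned annotations, which would otherwise show up only as unexpectedly low scores.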
Bug fixes
- Fix issue #3669, #3962: Fix dependency copy in Span.as_doc that could cause segfault.
- Fix issue #3968: Fix bug in per-entity scores.
- Fix issue #4000: Improve entity linking API.
- Fix issue #4022: Fix error when Korean text contains special characters.
- Fix issue #4030: Handle edge case when calling TextCategorizer.predict with empty Doc.
- Fix issue #4045: Correct Span.sent docs.
- Fix issue #4048: Fix init-model command if there's no vocab.
- Fix issue #4052: Improve per-type scoring of NER.
- Fix issue #4054: Ensure the lang of nlp and nlp.vocab stay consistent.
- Fix bugs in Token.similarity and Span.similarity when called via hook.
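To make the idea of per-type NER scores concrete, here is a minimal sketch that computes precision, recall and F-score separately for each entity label. It is not spaCy's Scorer implementation: the function name per_type_scores, the exact-match criterion and the sample spans are all assumptions made for illustration.

```python
from collections import defaultdict


def per_type_scores(gold_spans, pred_spans):
    """Compute precision, recall and F-score per entity label.

    Both arguments are iterables of (start, end, label) tuples.
    A predicted span counts as correct only on an exact match.
    """
    gold, pred = set(gold_spans), set(pred_spans)
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for span in pred:
        # Exact matches are true positives, everything else a false positive.
        counts[span[2]]["tp" if span in gold else "fp"] += 1
    for span in gold - pred:
        # Gold spans the model missed are false negatives.
        counts[span[2]]["fn"] += 1

    scores = {}
    for label, c in counts.items():
        p = c["tp"] / (c["tp"] + c["fp"]) if c["tp"] + c["fp"] else 0.0
        r = c["tp"] / (c["tp"] + c["fn"]) if c["tp"] + c["fn"] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        scores[label] = {"p": p, "r": r, "f": f}
    return scores


# Hypothetical gold and predicted spans for a single document.
gold = [(0, 6, "ORG"), (10, 16, "LOC"), (20, 25, "LOC")]
pred = [(0, 6, "ORG"), (10, 16, "LOC"), (30, 34, "ORG")]
print(per_type_scores(gold, pred))
```

Keeping separate counts per label means a weak label cannot hide behind a strong one in the aggregate score.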
Documentation and examples
- Add documentation for gold.align helper.
- Add more explicit section on processing text.
- Improve documentation on disabling pipeline components.
- Fix various typos and inconsistencies.
Contributors
Thanks to @sorenlind, @pmbaumgartner, @svlandeg, @FallakAsad, @BreakBB, @adrianeboyd, @polm, @b1uec0in, @mdaudali and @ejarkm for the pull requests and contributions.