spaCy v2.1.4 Release Notes

Release Date: 2019-05-11
  • ๐Ÿฑ โœจ New features and improvements

    • NEW: util.filter_spans helper to filter duplicates and overlaps from a list of Span objects.
    • Improve language data for Thai, Japanese, Indonesian and Dutch.
    • Add --n-save-every to spacy pretrain and rename --nr-iter to --n-iter for consistency.
    • Add --return-scores flag to spacy evaluate to return the scores as a dict.
    • Add --n-early-stopping option to spacy train to set the maximum number of iterations without dev accuracy improvement.
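
The util.filter_spans helper listed above removes duplicates and overlaps from a list of Span objects. As a rough, library-free illustration of that kind of filtering, the sketch below works on plain (start, end) token offsets instead of Span objects. It assumes that longer spans win over shorter overlapping ones, which is how the spaCy documentation describes the helper. The name filter_spans_sketch is hypothetical, and this is not spaCy's source code.

```python
def filter_spans_sketch(spans):
    """Drop duplicate and overlapping (start, end) spans, preferring longer ones.

    Illustrative only: spans are half-open token offsets, not spaCy Span objects.
    """
    # Longest spans first; among equal lengths, the earliest start first.
    ordered = sorted(spans, key=lambda s: (s[1] - s[0], -s[0]), reverse=True)
    kept = []
    seen_tokens = set()
    for start, end in ordered:
        tokens = range(start, end)
        # Keep a span only if none of its tokens is already covered.
        if not any(i in seen_tokens for i in tokens):
            kept.append((start, end))
            seen_tokens.update(tokens)
    # Return the survivors in document order.
    return sorted(kept)


# Hypothetical input: one duplicate pair and one overlap.
candidates = [(0, 2), (1, 4), (0, 2), (5, 6)]
filtered = filter_spans_sketch(candidates)
print(filtered)
```

With spaCy installed, the real helper is called as spacy.util.filter_spans(spans) on a list of Span objects.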

    ๐Ÿฑ ๐Ÿ”ด Bug fixes

    • Fix issue #3307: Fix symlink creation to show error on Windows.
    • Fix issue #3473: Fix GPU training for text classification.
    • Fix issue #3475: Change favicon.
    • Fix issue #3482: Add Estonian base support to documentation.
    • Fix issue #3484: Ensure lemmatization is always consistent between sessions.
    • Fix issue #3521: Add variations of contractions to English stop words.
    • Fix issue #3523: Make spacy convert correctly default to json.
    • Fix issues #3525, #3551, #3572: Fix a problem that caused lemmas not to be lowercase.
    • Fix issue #3531: Don't make "settings" or "title" required in displaCy data.
    • Fix issue #3533: Remove non-existent example from docs.
    • Fix issue #3546: Make sure path in GoldParse.__del__ is a string.
    • Fix issue #3549: Ensure match pattern error isn't raised on empty errors list.
    • Fix issue #3561: Fix DependencyParser.predict docs.
    • Fix issue #3598: Allow jupyter=False to override Jupyter mode in displacy.
    • Fix issue #3620: Fix bug in .iob converter.
    • Fix issue #3628: Relax jsonschema pin.
    • Fix issue #3667: Fix offset bug in loading pre-trained word2vec.
    • Fix issue #3679: Update glossary to include missing labels in spacy.explain.
    • Fix issue #3680: Re-add missing universe README.
    • Fix issue #3681: Rewrite information extraction example to use Doc.retokenize.
    • Fix issue #3692: Fix return value in Language.update docs.
    • Fix issue #3694: Make "text" in spacy pretrain optional when "tokens" is provided.
    • Fix issue #3701: Improve Token.prob and Lexeme.prob docs.
    • Fix issue #3708: Fix error in regex matcher examples.
    • Fix issue #3713: Call rmtree and copytree with strings in spacy train.
    • Fix issue #3720: Add version tag to --base-model argument in spacy train docs.

    Documentation and examples

    Contributors

    Thanks to @svlandeg, @wannaphongcom, @Bharat123rox, @DuyguA, @SamuelLKane, @graus, @HiromuHota, @jeannefukumaru, @ivigamberdiev, @socool, @yvespeirsman, @lemontheme, @Dobita21, @w4nderlust, @pierremonico, @bryant1410, @celikomer, @xssChauhan, @kowaalczyk, @BreakBB, @fizban99, @tokestermw, @bjascob, @pickfire, @yaph, @amitness, @henry860916, @d5555, @BramVanroy, @F0rge1cE, @richardpaulhudson, @ldorigo, @aaronkub and @devforfu for the pull requests and contributions.