textacy v0.6.2 Release Notes
Release Date: 2018-07-19
Changes:
- Add a spacier.util module, and add / reorganize relevant functionality
  - move (most) spacy_util functions here, and add a deprecation warning to
    the spacy_util module
  - rename normalized_str() => get_normalized_text(), for consistency and clarity
  - add a function to split long texts up into chunks but combine them into
    a single Doc. This is a workaround for a current limitation of spaCy's
    neural models, whose RAM usage scales with the length of input text.
- Add experimental support for reading and writing spaCy docs in binary format,
  where multiple docs are contained in a single file. This functionality was
  supported by spaCy v1, but is not in spaCy v2; I've implemented a workaround
  that should work well in most situations, but YMMV.
- Package documentation is now "officially" hosted on GitHub pages. The docs
  are automatically built on and deployed from Travis via doctr, so they
  stay up-to-date with the master branch on GitHub. Maybe someday I'll get
  ReadTheDocs to successfully build textacy once again...
- Minor improvements/updates to documentation
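
The chunking workaround mentioned above can be illustrated with a minimal, library-free sketch. It is not textacy's implementation: the names iter_text_chunks and process_in_chunks are hypothetical, the default chunk size is arbitrary, and a plain callable stands in for the spaCy pipeline. How textacy actually merges the per-chunk results into a single Doc is not described in these notes, so the merge step here simply concatenates lists.

```python
from typing import Callable, Iterator, List


def iter_text_chunks(text: str, chunk_size: int = 100_000) -> Iterator[str]:
    """Yield consecutive pieces of ``text``, each at most ``chunk_size`` chars.

    A piece ends right after the last whitespace character in its window
    when there is one, so words are not cut in half. Joining the pieces
    reproduces the original text exactly.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        if end < len(text):
            # Walk backwards to the last whitespace inside the window;
            # if none exists, fall back to a hard cut at chunk_size.
            for i in range(end, start, -1):
                if text[i - 1].isspace():
                    end = i
                    break
        yield text[start:end]
        start = end


def process_in_chunks(
    text: str,
    process: Callable[[str], List[str]],
    chunk_size: int = 100_000,
) -> List[str]:
    """Apply ``process`` to each chunk and merge the results in order.

    ``process`` is a stand-in for an expensive per-text pipeline whose
    memory use grows with input length; only one chunk is handled at a time.
    """
    merged: List[str] = []
    for chunk in iter_text_chunks(text, chunk_size):
        merged.extend(process(chunk))
    return merged
```

For instance, process_in_chunks(long_text, str.split, chunk_size=50_000) tokenizes on whitespace one chunk at a time. One caveat of any chunking approach is that a chunk boundary can fall in the middle of a sentence, so results near boundaries may differ from processing the text whole.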
Bugfixes:
- Add missing return statement in deprecated text_stats.flesch_readability_ease()
  function (Issue #191)
- Catch an empty graph error in bestcoverage-style keyterm ranking (Issue #196)
- Fix mishandling when specifying a single named entity type to in/exclude in
  extract.named_entities (Issue #202)
- Make networkx usage in keyterms module compatible with v1.11+ (Issue #199)
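
The missing-return fix (Issue #191) is an easy class of bug to reproduce: a deprecated wrapper that warns and calls its replacement, but forgets to pass the result back, silently returns None. The sketch below shows the corrected pattern. It is illustrative only: the replacement name flesch_reading_ease and both signatures are assumptions, not taken from these notes; the formula is the standard Flesch Reading Ease score.

```python
import warnings


def flesch_reading_ease(n_syllables: int, n_words: int, n_sents: int) -> float:
    """Standard Flesch Reading Ease score (assumed replacement function)."""
    return 206.835 - 1.015 * (n_words / n_sents) - 84.6 * (n_syllables / n_words)


def flesch_readability_ease(n_syllables: int, n_words: int, n_sents: int) -> float:
    """Deprecated alias kept for backwards compatibility (hypothetical signature)."""
    warnings.warn(
        "flesch_readability_ease() is deprecated; use flesch_reading_ease()",
        DeprecationWarning,
        stacklevel=2,
    )
    # Without this `return`, callers of the deprecated name would get None.
    return flesch_reading_ease(n_syllables, n_words, n_sents)
```

A quick regression check for this kind of wrapper is to assert that the deprecated and current names return the same value for the same inputs.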