textacy v0.4.2 Release Notes

Release Date: 2017-11-29
  • Changes:

    • Added a CLI for downloading textacy-related data, inspired by the spaCy
      equivalent. It's temporarily undocumented, but to see the available commands
      and options, just pass the usual flag: $ python -m textacy --help. Expect more
      functionality (and docs!) to be added soonish. (#144)
      • Note: The existing Dataset.download() methods work as before, and in fact,
        they are being called under the hood from the command line.
    • Made usage of networkx v2.0-compatible, and therefore dropped the <2.0
      version requirement on that dependency. Upgrade as you please! (#131)
    • Improved the regex for identifying phone numbers so that it's easier to view
      and interpret its matches. (#128)
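
The note about Dataset.download() being called under the hood describes a common
pattern: the command line is a thin wrapper around the same method a library user
would call directly. The sketch below illustrates that pattern only. It is not
textacy's CLI: the program name toycli, the ToyDataset class, the download
subcommand, and the --data-dir option are all hypothetical, chosen for illustration.

```python
import argparse


class ToyDataset:
    """Hypothetical stand-in for a dataset class that has a download() method."""

    def __init__(self, name):
        self.name = name

    def download(self, data_dir):
        # A real implementation would fetch files; this sketch only reports the target.
        return "{} -> {}".format(self.name, data_dir)


def build_parser():
    # Hypothetical command layout, not the interface of any real package.
    parser = argparse.ArgumentParser(prog="toycli", description="Download toy data.")
    subparsers = parser.add_subparsers(dest="command")
    download = subparsers.add_parser("download", help="download a dataset")
    download.add_argument("dataset", help="name of the dataset to fetch")
    download.add_argument("--data-dir", default="data", help="target directory")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    if args.command == "download":
        # The CLI delegates to the same method that library code would call.
        return ToyDataset(args.dataset).download(args.data_dir)
    return None


result = main(["download", "capitol_words", "--data-dir", "/tmp/toy"])
```

Because the command only delegates, calling the method from Python and invoking the
command from a shell stay in sync by construction.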

    Bugfixes:

    • Fixed caching of counts on textacy.Doc to make it instance-specific, rather than
      shared by all instances of the class. Oops.
    • Fixed currency symbols regex, so as not to replace all instances of the letter "z"
      when a custom string is passed into replace_currency_symbols(). (#137)
    • Fixed README usage example, which skipped downloading of dataset data. Btw,
      see above for another way! (#124)
    • Fixed typo in the API reference, which included the SupremeCourt dataset twice
      and omitted the RedditComments dataset. (#129)
    • Fixed typo in RedditComments.download() that prevented it from downloading
      any data. (#143)
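
The first bugfix above, a cache shared by all instances of a class, is a classic
Python pitfall. The sketch below reproduces the general pattern with two
hypothetical toy classes. It is not textacy's actual Doc implementation, and the
class names, the _counts attribute, and the count() method are assumptions made
for illustration.

```python
class SharedCacheDoc:
    """Buggy pattern: the cache is a class attribute, so all instances share it."""

    _counts = {}  # created once, when the class is defined

    def __init__(self, text):
        self.text = text

    def count(self, word):
        if word not in self._counts:
            self._counts[word] = self.text.split().count(word)
        return self._counts[word]


class PerInstanceCacheDoc:
    """Fixed pattern: each instance creates its own cache in __init__."""

    def __init__(self, text):
        self.text = text
        self._counts = {}  # a fresh dict for every instance

    def count(self, word):
        if word not in self._counts:
            self._counts[word] = self.text.split().count(word)
        return self._counts[word]


a, b = SharedCacheDoc("spam spam eggs"), SharedCacheDoc("eggs only")
stale = (a.count("spam"), b.count("spam"))  # b reuses the value cached by a

c, d = PerInstanceCacheDoc("spam spam eggs"), PerInstanceCacheDoc("eggs only")
fresh = (c.count("spam"), d.count("spam"))  # each document counts its own text
```

With the shared cache, the second document reports the first document's count for
"spam" even though its own text contains none. Moving the dict into __init__
gives each instance its own cache.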

    Contributors:

    Many thanks to @asifm, @harryhoch, and @mdlynch37 for submitting PRs!