newspaper v0.1.1 Release Notes

Release Date: 2014-12-27
  • Full Changelog

    Closed issues:

    • UnicodeDecodeError: 'utf8' codec can't decode byte 0xcc #99
    • TypeError: Can't convert 'bytes' object to str implicitly #98
    • [Parse lxml ERR] Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration. #78
    • UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 11: ordinal not in range(128) #77
    • article.text and keywords error #47
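
Most of the issues above come from mixing bytes and str. As a rough illustration only (not code from newspaper itself), the sketch below reproduces the 'ascii' decode failure and shows one way to decode bytes explicitly at the boundary. The helper name `to_text` and the choice of `errors="replace"` are assumptions made for this example.

```python
def to_text(data, encoding="utf-8"):
    """Hypothetical helper: return str for either bytes or str input."""
    if isinstance(data, bytes):
        # errors="replace" substitutes U+FFFD for undecodable bytes
        # instead of raising UnicodeDecodeError.
        return data.decode(encoding, errors="replace")
    return data


# UTF-8 bytes for "it’s"; the right single quote starts with byte 0xe2.
raw = "it\u2019s".encode("utf-8")

try:
    raw.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    # The message names the offending byte, as in issue #77.
    ascii_error = str(exc)

text = to_text(raw)
```

Decoding once, as early as possible, keeps the rest of the code working with str only, which also avoids implicit bytes-to-str conversions like the one reported in #98.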

    Merged pull requests:

    • Huge bugfix to aid lxml DOM parsing + remove unhelpful and excess exception messages and added tracebacks to exception logging #102 (codelucas)
    • Decode bytestring returned from lxml's toString early on before sending it out to outer code #101 (codelucas)
    • Fixed #78: Remove encoding tag because lxml won't accept it for unicode #97 (mhall1)
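
The fix for #78 is described as removing the encoding tag before handing unicode input to lxml. A minimal sketch of that idea, using only the standard library, might look like the following. The regular expression and the function name `strip_xml_declaration` are assumptions for illustration and are not taken from the newspaper source.

```python
import re

# Matches a leading XML declaration such as
# <?xml version="1.0" encoding="utf-8"?>
_XML_DECL = re.compile(r"^\s*<\?xml[^>]*\?>", re.IGNORECASE)


def strip_xml_declaration(markup):
    """Hypothetical helper: return str markup without a leading XML declaration."""
    if isinstance(markup, bytes):
        # Decode early so callers only ever see str (assumes UTF-8 input).
        markup = markup.decode("utf-8", errors="replace")
    # Remove at most one declaration, and only at the start of the document.
    return _XML_DECL.sub("", markup, count=1)


doc = '<?xml version="1.0" encoding="utf-8"?><html><body>hi</body></html>'
cleaned = strip_xml_declaration(doc)
```

Once the declaration is gone, the str no longer claims an encoding of its own, which is the condition the lxml error message in #78 asks for. The alternative that message suggests, passing bytes instead of str, avoids the rewrite entirely.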