Goose3/CHANGELOG and Goose3 Releases

Latest Version: 3.1.12

Avg Release Cycle: 37 days

Changelog History

v3.1.12 Changes
- 🛠 Fix for Korean stop words; see PR #138. Thanks galaxytemple
- 👍 Allow for extra dependencies; see issue #141
- 🛠 Fix leading and trailing charset characters; see issue #139. Thanks @nnick14
- ➕ Added basic logging and typing
v3.1.11 Changes
- 👀 Replace md5 with a pure Python fnv_1a non-cryptographic hash; see issue #133. Thanks @openbrian
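
For readers unfamiliar with FNV-1a, the following is a minimal pure-Python sketch of the algorithm named in the 3.1.11 entry. The 64-bit width, the function names `fnv1a_64` and `hash_key`, and the hex formatting are assumptions made for illustration; the changelog does not say which variant or output format the library uses.

```python
# Standard FNV-1a 64-bit parameters.
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK_64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """Return the 64-bit FNV-1a hash of data (non-cryptographic)."""
    value = FNV64_OFFSET_BASIS
    for byte in data:
        value ^= byte                           # FNV-1a: XOR first...
        value = (value * FNV64_PRIME) & MASK_64  # ...then multiply, wrapping at 64 bits
    return value


def hash_key(text: str) -> str:
    """Hypothetical helper: hash a string and format it as 16 hex digits."""
    return format(fnv1a_64(text.encode("utf-8")), "016x")
```

Such a hash is suitable for generating short identifiers or cache keys, not for anything security-sensitive.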
v3.1.10 Changes
- 🛠 Fix for float-based timezones; see issue #128. Thanks @Vasniktel!
- ➕ Add langdetect dependency to help resolve edge cases where missing language information causes text not to be extracted; see issue #106
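
The float-based timezone fix in this release concerns offsets that are not whole hours. The changelog does not describe how the library handles them, so the sketch below only illustrates the general idea with the standard library; the helper name `tz_from_hours` is hypothetical.

```python
from datetime import timedelta, timezone


def tz_from_hours(offset_hours: float) -> timezone:
    """Build a fixed-offset timezone from a possibly fractional hour offset.

    Assumption: the offset is given in hours (e.g. 5.5 for +05:30).
    Rounding to whole minutes avoids floating-point noise.
    """
    return timezone(timedelta(minutes=round(offset_hours * 60)))
```

For example, `tz_from_hours(5.5)` yields a timezone with an offset of five hours and thirty minutes.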
v3.1.9 Changes
- 🛠 Fix for removing site name from title when it is part of the title; see issue #123
- 🛠 Fix parsing of the encoding string when encoding information is capitalized; see issue #109
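
The capitalized-encoding fix can be illustrated with a small sketch of encoding-name normalization. This is not the library's code; `normalize_encoding` is a hypothetical helper that relies on the standard `codecs` registry to map a declared charset such as `UTF-8` to a canonical codec name.

```python
import codecs
from typing import Optional


def normalize_encoding(name: str) -> Optional[str]:
    """Return the canonical codec name for a declared charset, or None.

    Strips surrounding whitespace and quotes and lowercases the value
    before looking it up, so that 'UTF-8' and 'utf-8' are treated alike.
    """
    cleaned = name.strip().strip("\"'").lower()
    try:
        return codecs.lookup(cleaned).name
    except LookupError:
        # Unknown or malformed charset declaration.
        return None
```

Returning `None` for unknown names lets the caller fall back to a default encoding instead of raising.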
v3.1.8 Changes
- 🛠 Fixed title being an empty string when the title is the same as the site name; see PR #117. Thanks @Pradhvan
- ➕ Add optional removal of footnotes; see issue #105
v3.1.7 Changes
- 🛠 Fixed author configuration; see PR #96
- 👌 Improve parent node scoring to get more of the correct data; see PR #102. Thanks @skruse
v3.1.6 Changes
October 20, 2018
- 👌 Improved handling of page encoding; see PR #92
- 👌 Improved author and published date extraction; see PR #93. Thanks @timoilya!
- ➕ Added additional schema extractors for schema.org parser; see PR #89
- 👍 Allow for pulling more than the first og:type data for Opengraph; see PR #90
v3.1.5 Changes
September 11, 2018
- ➕ Added additional date parsing; see PR #71. Thanks @dlrobertson!
- Added a datetime representation of the publish date, publish_datetime_utc; see issue #72
- 🛠 Fixed mismatched encoding error; see issue #74
- 🛠 Fixed og_type with NoneType error; see issue #81. Thanks dust0x!
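
To show what a UTC datetime representation of a publish date involves, here is a minimal sketch using only the standard library. It is an illustration, not the library's implementation: the helper `to_utc` is hypothetical, it accepts only ISO 8601 strings, and treating values without an offset as UTC is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def to_utc(publish_date: str) -> Optional[datetime]:
    """Parse an ISO 8601 date string and return it as an aware UTC datetime."""
    try:
        parsed = datetime.fromisoformat(publish_date)
    except ValueError:
        # Not an ISO 8601 string; a real extractor would try other formats.
        return None
    if parsed.tzinfo is None:
        # Assumption: values without an offset are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

A datetime in UTC is easier to compare and sort across articles than the raw publish date string.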
v3.1.4 Changes
August 19, 2018
- 🛠 Fix IndexError when the title contains only a title splitter or is the site name; see issue #59. Thanks @dlrobertson!
- Retry the calculate_top_node function with the root node if the first pass fails to find an article. This can happen when one or more known article patterns are found but none contain content; see PR #66. Thanks @dlrobertson!
- ➕ Add parsing of schema.org's ReportageNewsArticle tags; see PR #67. Thanks @dlrobertson!
- ➕ Add additional parsing of opengraph tags; see PR #64. Thanks @dlrobertson!
v3.1.3 Changes
July 07, 2018
- 📜 Parse headers and include them in cleaned_text
- ➕ Additional Configuration options:
  - Parse Headers: parse_headers
  - Parse Lists: parse_lists
  - Pretty Lists: pretty_lists
- 👀 Catch a mismatch between the encoding meta tag and the document encoding; see pull request #53. Thanks @jeffquach!

Goose3 changelog

A Python 3 compatible version of goose http://goose3.readthedocs.io/en/latest/index.html
