Changelog History
v3.1.12 Changes
- Fix for Korean stop words; see PR #138. Thanks galaxytemple
- Allow for extra dependencies; see issue #141
- Fix leading and trailing charset characters; see issue #139. Thanks @nnick14
- Added basic logging and typing

v3.1.11 Changes
- Replace `md5` with a pure Python `fnv_1a` non-cryptographic hash; see issue #133. Thanks @openbrian

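For readers unfamiliar with the hash named in v3.1.11, the following is a minimal sketch of FNV-1a in pure Python. It is not taken from the library's source: the function name `fnv_1a`, its `bits` parameter, and the choice of 32-bit and 64-bit variants are assumptions made for illustration. The offset basis and prime constants are the published FNV parameters.

```python
# Illustrative sketch only; the library's actual implementation may differ.
FNV_PARAMS = {
    # bits: (offset basis, prime), the standard published FNV constants
    32: (0x811C9DC5, 0x01000193),
    64: (0xCBF29CE484222325, 0x100000001B3),
}


def fnv_1a(data: bytes, bits: int = 64) -> int:
    """Return the FNV-1a hash of `data` as an unsigned integer."""
    if bits not in FNV_PARAMS:
        raise ValueError("bits must be 32 or 64")
    value, prime = FNV_PARAMS[bits]
    mask = (1 << bits) - 1
    for byte in data:
        value ^= byte                    # FNV-1a: XOR the byte in first...
        value = (value * prime) & mask   # ...then multiply, truncating to `bits`
    return value


if __name__ == "__main__":
    # Hash a URL-like string; the input here is a made-up example.
    print(hex(fnv_1a(b"http://example.com/article", bits=64)))
```

FNV-1a is fast and dependency-free but not collision-resistant against an adversary, so it suits cache keys and file names rather than anything security-sensitive.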
v3.1.10 Changes
- Fix for float-based timezones; see issue #128. Thanks @Vasniktel!
- Add `langdetect` dependency to help resolve some edge cases where missing language information causes text not to be pulled; see issue #106

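As an illustration of what a float-based timezone looks like in practice, here is a small sketch that turns a fractional UTC offset (for example 5.5 hours) into a `tzinfo` and normalizes a timestamp to UTC. The helper names `tz_from_float_offset` and `to_utc` are hypothetical and this is not the fix applied in issue #128; it only shows the standard-library pieces involved.

```python
from datetime import datetime, timedelta, timezone


def tz_from_float_offset(hours: float) -> timezone:
    """Build a fixed-offset timezone from a possibly fractional hour offset."""
    # timedelta accepts float hours, so 5.5 becomes 5 hours 30 minutes.
    return timezone(timedelta(hours=hours))


def to_utc(naive_local: datetime, offset_hours: float) -> datetime:
    """Attach the given offset to a naive datetime and convert it to UTC."""
    aware = naive_local.replace(tzinfo=tz_from_float_offset(offset_hours))
    return aware.astimezone(timezone.utc)


if __name__ == "__main__":
    published = datetime(2018, 9, 11, 12, 0)
    print(to_utc(published, 5.5).isoformat())
```

Casting such an offset to `int` before building the timezone would silently drop the half-hour, which is the kind of edge case fractional offsets introduce.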
v3.1.9 Changes
- Fix for removing site name from title when it is part of the title; see issue #123
- Fix parsing encoding string when encoding information is capitalized; see issue #109

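The capitalized-encoding fix can be pictured with a short sketch of case-insensitive encoding lookup using the standard library. The function `normalize_encoding` and its `default` parameter are hypothetical names chosen for this example; the library's own handling in issue #109 is not shown in this changelog and may work differently.

```python
import codecs


def normalize_encoding(name: str, default: str = "utf-8") -> str:
    """Map an encoding label such as 'UTF-8' or 'Utf-8' to Python's codec name.

    Falls back to `default` when the label is not a known codec.
    """
    try:
        # codecs.lookup is case-insensitive and returns the canonical name.
        return codecs.lookup(name.strip()).name
    except LookupError:
        return default


if __name__ == "__main__":
    for label in ("UTF-8", "utf-8", "no-such-codec"):
        print(label, "->", normalize_encoding(label))
```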
v3.1.8 Changes
- Fixed title being an empty string when the title is the same as the site name; see PR #117. Thanks @Pradhvan
- Add optional removal of footnotes; see issue #105

v3.1.7 Changes
- Fixed author configuration; see PR #96
- Improve parent node scoring to get more of the correct data; see PR #102. Thanks @skruse

v3.1.6 Changes
October 20, 2018
- Improved handling of page encoding; see PR #92
- Improved author and published date extraction; see PR #93. Thanks @timoilya!
- Added additional schema extractors for schema.org parser; see PR #89
- Allow for pulling more than the first og:type data for Opengraph; see PR #90

v3.1.5 Changes
September 11, 2018
- Added additional date parsing; see PR #71. Thanks @dlrobertson!
- Added datetime representation of the publish date `publish_datetime_utc`; see issue #72
- Fixed mismatch encoding error; see issue #74
- Fixed og_type with NoneType error; see issue #81. Thanks dust0x!

v3.1.4 Changes
August 19, 2018
- Fix IndexError when title has only a title splitter or is the site name; see issue #59. Thanks @dlrobertson!
- Retry the `calculate_top_node` function with the root node if the first pass fails to find an article, which may occur when one or more known article patterns are found but none contain content; see PR #66. Thanks @dlrobertson!
- Add parsing of schema.org's ReportageNewsArticle tags; see PR #67. Thanks @dlrobertson!
- Add additional parsing of opengraph tags; see PR #64. Thanks @dlrobertson!

v3.1.3 Changes
July 07, 2018
- Parse headers and include in `cleaned_text`
- Additional configuration options:
  - Parse Headers: `parse_headers`
  - Parse Lists: `parse_lists`
  - Pretty Lists: `pretty_lists`
- Catch mismatch encoding meta tag and document encoding; see pull request #53. Thanks @jeffquach!