Find the original and updated publication dates of any web page. Usable from the command line or from within Python, it covers every step from web page download to HTML parsing, scraping, and text analysis.
Programming language: Python
License: GNU General Public License v3.0 only
Tags: Date And Time, Text Processing, HTTP, Web Content Extracting, HTML, Scientific, Engineering, Information Analysis, Internet, WWW, Markup, Linguistic, Web Scraping, Scraping, Content Extraction, Metadata
Latest version: v1.3.2
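To make the description above more concrete, here is a minimal sketch of the general idea of pulling an original and an updated date out of a page's markup. It uses only the Python standard library and is not htmldate's actual implementation: the names `DateMetaParser`, `normalize`, and `guess_dates`, as well as the sets of meta keys, are hypothetical and chosen for illustration. It only looks at `<meta>` tags, whereas a real tool would presumably also inspect other markup and the page text.

```python
from datetime import datetime
from html.parser import HTMLParser

# Assumption: a small, illustrative set of meta keys that often carry dates.
PUBLISHED_KEYS = {"article:published_time", "datepublished", "date"}
MODIFIED_KEYS = {"article:modified_time", "datemodified", "last-modified"}


class DateMetaParser(HTMLParser):
    """Collect the first published and modified dates found in <meta> tags."""

    def __init__(self):
        super().__init__()
        self.published = None
        self.modified = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # The key may live in "property", "name", or "itemprop".
        key = attr.get("property") or attr.get("name") or attr.get("itemprop") or ""
        key = key.lower()
        value = attr.get("content")
        if not value:
            return
        if key in PUBLISHED_KEYS and self.published is None:
            self.published = value
        elif key in MODIFIED_KEYS and self.modified is None:
            self.modified = value


def normalize(value):
    """Return YYYY-MM-DD from an ISO 8601 style string, or None if unparseable."""
    if not value:
        return None
    try:
        # Only the leading date part is kept; time and offset are ignored.
        return datetime.fromisoformat(value[:10]).date().isoformat()
    except ValueError:
        return None


def guess_dates(html):
    """Return a dict with the 'original' and 'updated' dates found in html."""
    parser = DateMetaParser()
    parser.feed(html)
    return {
        "original": normalize(parser.published),
        "updated": normalize(parser.modified),
    }


if __name__ == "__main__":
    sample = (
        '<html><head>'
        '<meta property="article:published_time" content="2016-12-23T10:00:00+01:00">'
        '<meta property="article:modified_time" content="2017-01-05">'
        '</head><body></body></html>'
    )
    print(guess_dates(sample))
```

The sketch works on an HTML string that has already been downloaded, so the download step is left out. The package itself is documented as exposing a `find_date` function for the full pipeline; see its own documentation for the exact parameters.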
htmldate alternatives and similar packages
Based on the "Web Content Extracting" category.
TWINT (9.4, 0.0): An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
newspaper (9.3, 0.0, L3): News, full-text, and article metadata extraction in Python 3. Advanced docs:
python-goose (8.0, 0.0): Html Content / Article Extractor, web scrapping lib in Python
textract (7.5, 0.0): extract text from any document. no muss. no fuss.
sumy (7.3, 6.2, L5): Module for automatic summarization of text documents and HTML pages.
toapi (7.1, 0.0): Every web site provides APIs.
python-readability (6.6, 3.9): fast python port of arc90's readability tool, updated to match latest readability.js!
html2text (5.4, 0.0, L1): Convert HTML to Markdown-formatted text.
trafilatura (4.4, 7.4): Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Goose3 (3.9, 0.0): A Python 3 compatible version of goose http://goose3.readthedocs.io/en/latest/index.html
micawber (3.9, 0.0, L5): a small library for extracting rich content from urls
lassie (3.7, 0.0, L4): Web Content Retrieval for Humans™
opengraph (2.9, 0.0, L5): A python module to parse the Open Graph Protocol
Haul (2.4, 0.0, L5): An Extensible Image Crawler
inscriptis (2.2, 8.3): A python based HTML to text conversion library, command line client and Web service.
sanitize (1.4, 0.0, L4): Bringing sanity to world of messed-up data
JSONPATH (1.0, 2.1): A query expression for extracting data from JSON.
Data Extractor (0.9, 2.1): Combine XPath, CSS Selectors and JSONPath for Web data extracting.
* Code Quality Rankings and insights are calculated and provided by Lumnify.
They vary from L1 to L5 with "L5" being the highest.