Text Processing packages
Showing projects tagged as HTML and Text Processing
- Pattern
  9.0 0.0 L2 Python
  Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
- Python-Markdown
  7.7 7.0 Python
  A Python implementation of John Gruber’s Markdown with Extension support.
- trafilatura
  6.9 9.0 Python
  Python & command-line tool to gather text and metadata on the Web: crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML.
- aeneas
  6.4 0.0 L3 Python
  aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment).
- html5lib
  5.2 4.1 L2 Python
  Standards-compliant library for parsing and serializing HTML documents and fragments in Python.
- selectolax
  4.5 6.7 Cython
  Python binding to Modest and Lexbor engines (fast HTML5 parser with CSS selectors).
- htmldate
  2.1 7.1 Python
  Fast and robust date extraction from web pages, with Python or on the command-line.
* Code Quality Rankings and insights are calculated and provided by Lumnify.
They vary from L1 to L5 with "L5" being the highest.