HTML packages
Showing projects tagged as Text Processing and HTML
- Pattern
  9.1 0.0 L2 Python
  Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
- Python-Markdown
  7.7 7.1 Python
  A Python implementation of John Gruber’s Markdown with Extension support.
- aeneas
  6.4 0.0 L3 Python
  aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment).
- html5lib
  5.1 0.0 L2 Python
  Standards-compliant library for parsing and serializing HTML documents and fragments in Python.
- trafilatura
  4.6 9.2 Python
  Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments.
- selectolax
  4.2 5.8 Cython
  Python binding to Modest and Lexbor engines (fast HTML5 parser with CSS selectors).
- htmldate
  1.7 7.1 Python
  Fast and robust date extraction from web pages, with Python or on the command-line.
* Code Quality Rankings and insights are calculated and provided by Lumnify.
They vary from L1 to L5 with "L5" being the highest.