Markup packages
Showing projects tagged as Text Processing, HTML, and Markup
- Pattern
  9.1 0.0 L2 Python
  Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
- Python-Markdown
  7.6 7.9 Python
  A Python implementation of John Gruber’s Markdown with Extension support.
- aeneas
  6.3 0.0 L3 Python
  aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment).
- html5lib
  5.1 0.0 L2 Python
  Standards-compliant library for parsing and serializing HTML documents and fragments in Python.
- trafilatura
  4.2 8.2 Python
  Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments.
- selectolax
  4.0 6.9 Cython
  Python binding to Modest and Lexbor engines (fast HTML5 parser with CSS selectors).
- htmldate
  1.6 8.4 Python
  Fast and robust date extraction from web pages, with Python or on the command-line.
* Code Quality Rankings and insights are calculated and provided by Lumnify.
Rankings range from L1 to L5, with L5 being the highest.