toapi alternatives and similar packages
Based on the "Web Content Extracting" category.
- TWINT: DISCONTINUED. An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
- newspaper: newspaper3k is a news, full-text, and article metadata extraction in Python 3.
- python-readability: fast python port of arc90's readability tool, updated to match latest readability.js!
- trafilatura: Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
- inscriptis: A python based HTML to text conversion library, command line client and Web service.
README
Toapi
Overview
Toapi gives you the ability to make any website provide an API.
Version v2.0.0 is a complete rewrite: more elegant and more Pythonic.
- v1.0.0 Documentation: http://www.toapi.org
- Awesome: https://github.com/toapi/awesome-toapi
- Organization: https://github.com/toapi
Features
- Automatically converts an HTML website into an API service.
- Automatically caches every page of the source site.
- Automatically caches every request.
- Supports merging multiple websites into one API service.
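The README does not describe how the caching works internally. As a rough illustration of the idea only, the sketch below keeps fetched pages in memory keyed by URL and reuses them until they expire. The `PageCache` class, its `ttl_seconds` parameter, and the `fake_fetch` helper are all hypothetical names invented for this example; none of them are part of Toapi.

```python
import time


class PageCache:
    """Tiny in-memory cache keyed by URL (illustrative only; not Toapi's code)."""

    def __init__(self, fetch, ttl_seconds=60.0):
        self._fetch = fetch        # callable: url -> page content
        self._ttl = ttl_seconds    # how long an entry stays fresh
        self._entries = {}         # url -> (expiry timestamp, content)

    def get(self, url):
        now = time.monotonic()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]        # still fresh: skip the fetch
        content = self._fetch(url)
        self._entries[url] = (now + self._ttl, content)
        return content


# Hypothetical stand-in for a real HTTP request; it records each call.
calls = []


def fake_fetch(url):
    calls.append(url)
    return '<html>page for %s</html>' % url


cache = PageCache(fake_fetch, ttl_seconds=60.0)
first = cache.get('https://news.ycombinator.com/news?p=1')
second = cache.get('https://news.ycombinator.com/news?p=1')
```

Here the second `get` is served from memory, so `fake_fetch` runs only once for that URL.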
Get Started
Installation
$ pip install toapi
$ toapi -v
toapi, version 2.0.0
Usage
Create app.py and copy in the following code:
from flask import request
from htmlparsing import Attr, Text
from toapi import Api, Item

api = Api()


# Each '.athing' element on the source page becomes one Post in a list.
@api.site('https://news.ycombinator.com')
@api.list('.athing')
@api.route('/posts?page={page}', '/news?p={page}')
@api.route('/posts', '/news?p=1')
class Post(Item):
    url = Attr('.storylink', 'href')
    title = Text('.storylink')


# Page is a single item that carries the link to the next page.
@api.site('https://news.ycombinator.com')
@api.route('/posts?page={page}', '/news?p={page}')
@api.route('/posts', '/news?p=1')
class Page(Item):
    next_page = Attr('.morelink', 'href')

    def clean_next_page(self, value):
        # Rewrite the source site's link into this API's own URL.
        return api.convert_string('/' + value, '/news?p={page}', request.host_url.strip('/') + '/posts?page={page}')


api.run(debug=True, host='0.0.0.0', port=5000)
Run python app.py,
then open your browser and visit http://127.0.0.1:5000/posts?page=1
You will get a result like:
{
  "Page": {
    "next_page": "http://127.0.0.1:5000/posts?page=2"
  },
  "Post": [
    {
      "title": "Mathematicians Crack the Cursed Curve",
      "url": "https://www.quantamagazine.org/mathematicians-crack-the-cursed-curve-20171207/"
    },
    {
      "title": "Stuffing a Tesla Drivetrain into a 1981 Honda Accord",
      "url": "https://jalopnik.com/this-glorious-madman-stuffed-a-p85-tesla-drivetrain-int-1823461909"
    }
  ]
}
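The next_page value in the result comes from clean_next_page, which takes a link in the source site's format (/news?p={page}) and rewrites it in the API's own format (/posts?page={page}). The sketch below shows one way such a template-to-template translation could work, using only the standard library. The function name convert_string_sketch is hypothetical, and this is an illustration of the idea under that assumption, not Toapi's actual implementation of api.convert_string.

```python
import re


def convert_string_sketch(source_string, source_format, target_format):
    """Re-express source_string, which follows source_format, in target_format.

    Hypothetical stand-in for api.convert_string; not Toapi's actual code.
    """
    # Build a regex from the template: literal text is escaped, and each
    # '{name}' placeholder becomes a named group.
    pattern = ''
    position = 0
    for match in re.finditer(r'\{(\w+)\}', source_format):
        pattern += re.escape(source_format[position:match.start()])
        pattern += '(?P<%s>[^/&?#]+)' % match.group(1)
        position = match.end()
    pattern += re.escape(source_format[position:])

    found = re.fullmatch(pattern, source_string)
    if found is None:
        return None  # the string does not follow source_format
    # Fill the captured values into the target template.
    return target_format.format(**found.groupdict())


next_page = convert_string_sketch(
    '/' + 'news?p=2',
    '/news?p={page}',
    'http://127.0.0.1:5000' + '/posts?page={page}',
)
print(next_page)  # prints http://127.0.0.1:5000/posts?page=2
```

A string that does not match the source template, such as /item?id=1, yields None in this sketch instead of a rewritten URL.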
Todo
- Visualization: create a Toapi project in a web page by drag and drop.
Contributing
Write code, add tests, and open a pull request.
*Note that all licence references and agreements mentioned in the toapi README section above
are relevant to that project's source code only.