Stars 74

Watchers 5

Forks 5

Last Commit 12 months ago

Description

Python Testing Crawler is a crawler for automated functional testing of a web application.

Crawling a server-side-rendered web application is a low cost way to get low quality test coverage of your JavaScript-light web application.

If you have only partial test coverage of your routes, but still want to protect against silly mistakes, then this is for you.

Features:

* Selectively spider pages and resources, or just request them
* Submit forms, and control what values to send
* Ignore links by source using CSS selectors
* Fail fast or collect many errors
* Configurable using straightforward rules

Works with the test clients for Flask (inc Flask-WebTest), Django and WebTest.

Programming language: Python

License: Mozilla Public License 2.0

Tags: Django Flask Web Crawling Testing Web Utilities Dynamic Content

Python Testing Crawler alternatives and similar packages

Based on the "Testing" category.

Selenium

9.8 9.9 L2 Python Testing Crawler VS Selenium

A browser automation framework and ecosystem.
locust

9.6 9.8 L3 Python Testing Crawler VS locust

Write scalable load tests in plain Python 🚗💨

faker

9.4 9.4 L4 Python Testing Crawler VS faker

Faker is a Python package that generates fake data for you.
pytest

9.2 9.8 L4 Python Testing Crawler VS pytest

The pytest framework makes it easy to write small tests, yet scales to support complex functional testing
Robot Framework

9.1 9.7 L4 Python Testing Crawler VS Robot Framework

Generic automation framework for acceptance testing and RPA
PyAutoGUI

8.9 2.6 L4 Python Testing Crawler VS PyAutoGUI

A cross-platform GUI automation Python module for human beings. Used to programmatically control the mouse & keyboard.
Moto

8.7 9.9 Python Testing Crawler VS Moto

A library that allows you to easily mock out tests based on AWS infrastructure.
hypothesis

8.4 9.9 L3 Python Testing Crawler VS hypothesis

Hypothesis is a powerful, flexible, and easy to use library for property-based testing.
FuckIt.py

7.8 0.0 L4 Python Testing Crawler VS FuckIt.py

The Python error steamroller.
responses

7.5 8.0 L4 Python Testing Crawler VS responses

A utility for mocking out the Python Requests library.
Mimesis

7.5 9.1 L4 Python Testing Crawler VS Mimesis

Mimesis is a powerful Python library that empowers developers to generate massive amounts of synthetic data efficiently.
Behave

7.4 7.1 Python Testing Crawler VS Behave

BDD, Python style.
tox

7.3 8.9 Python Testing Crawler VS tox

Command line driven CI frontend and development task automation tool.
freezegun

7.3 6.7 L4 Python Testing Crawler VS freezegun

Let your Python tests travel through time
factory_boy

7.1 6.9 L4 Python Testing Crawler VS factory_boy

A test fixtures replacement for Python
splinter

6.9 8.6 L5 Python Testing Crawler VS splinter

splinter - python test framework for web applications
VCR.py

6.6 9.0 L4 Python Testing Crawler VS VCR.py

Automatically mock your HTTP interactions to simplify and speed up testing
httpretty

6.2 0.0 L4 Python Testing Crawler VS httpretty

Intercept HTTP requests at the Python socket level. Fakes the whole socket module
fake2db

6.1 0.0 L2 Python Testing Crawler VS fake2db

create custom test databases that are populated with fake data
Schemathesis

5.9 9.7 Python Testing Crawler VS Schemathesis

Automate your API Testing: catch crashes, validate specs, and save time
sixpack

5.8 0.0 L4 Python Testing Crawler VS sixpack

Sixpack is a language-agnostic a/b-testing framework
nose

5.8 0.0 L3 Python Testing Crawler VS nose

nose is nicer testing for python
Selenium Wire

5.7 0.0 Python Testing Crawler VS Selenium Wire

DISCONTINUED. Extends Selenium's Python bindings to give you the ability to inspect requests made by the browser.
PyRestTest

5.5 0.0 L3 Python Testing Crawler VS PyRestTest

Python Rest Testing
coverage

5.3 - Python Testing Crawler VS coverage

Code coverage measurement.
model_mommy

4.8 0.0 L5 Python Testing Crawler VS model_mommy

DISCONTINUED. Creating random fixtures for testing in Django.
nose2

4.3 7.4 L5 Python Testing Crawler VS nose2

The successor to nose, based on unittest2
mixer

4.3 0.0 L5 Python Testing Crawler VS mixer

Mixer -- Is a fixtures replacement. Supported Django, Flask, SqlAlchemy and custom python objects.
green

4.2 7.8 L2 Python Testing Crawler VS green

Green is a clean, colorful, fast python test runner.
mock

3.8 5.6 L2 Python Testing Crawler VS mock

The Python mock library
betamax

3.7 5.5 Python Testing Crawler VS betamax

A VCR imitation designed only for python-requests.
time-machine

3.6 8.6 Python Testing Crawler VS time-machine

Travel through time in your tests.
mamba

3.6 4.4 L5 Python Testing Crawler VS mamba

The definitive testing tool for Python. Born under the banner of Behavior Driven Development (BDD).
httmock

3.4 0.0 L4 Python Testing Crawler VS httmock

A mocking library for requests
Mocket

2.9 7.6 Python Testing Crawler VS Mocket

a socket mock framework - for all kinds of socket animals, web-clients included
sentry-telegram

2.8 5.3 L5 Python Testing Crawler VS sentry-telegram

Plugin for Sentry which allows sending notification via Telegram messenger.
fakeredis

2.7 9.6 Python Testing Crawler VS fakeredis

Implementation of Redis in python without having a Redis server running. Fully compatible with using redis-py.
Cornell

2.2 4.3 Python Testing Crawler VS Cornell

Cornell - record & replay mock server
Slash

2.1 5.3 Python Testing Crawler VS Slash

The Slash testing infrastructure
picka

1.8 0.0 L3 Python Testing Crawler VS picka

pip install picka - Picka is a python based data generation and randomization module which aims to increase coverage by increasing the amount of tests you _dont_ have to write by hand.
callee

1.7 0.0 L5 Python Testing Crawler VS callee

Argument matchers for unittest.mock
python-libfaketime

1.7 0.0 Python Testing Crawler VS python-libfaketime

A fast time mocking alternative to freezegun that wraps libfaketime.
FauxFactory

1.6 2.6 L5 Python Testing Crawler VS FauxFactory

Generates random data for your tests.
aiounittest

1.5 2.6 Python Testing Crawler VS aiounittest

Test python asyncio-based code with ease.
Vedro

1.2 7.6 Python Testing Crawler VS Vedro

Pragmatic Testing Framework
Mock Generator

1.0 0.0 Python Testing Crawler VS Mock Generator

A tool to auto generate the basic mocks and asserts for faster unit testing
Pytest Mock Generator

0.9 0.0 Python Testing Crawler VS Pytest Mock Generator

A pytest fixture wrapper for https://pypi.org/project/mock-generator
doublex

0.9 - Python Testing Crawler VS doublex

Powerful test doubles framework for Python.
pymox

0.9 9.2 Python Testing Crawler VS pymox

Pymox - Python mocking on steroids
RedExpect

0.8 0.0 Python Testing Crawler VS RedExpect

Automate SSH in python easily!

* Code Quality Rankings and insights are calculated and provided by Lumnify.
They vary from L1 to L5 with "L5" being the highest.

README

Python Testing Crawler :snake: :stethoscope: :spider:

Why should I use this?

Here's an example: Flaskr, the Flask tutorial application, has 166 lines of test code to achieve 100% test coverage.

Using Python Testing Crawler in a similar way to the Usage example below, we can hit 73% coverage with very little effort. Disclaimer: of course, this is not the same quality or utility of testing. But it is better than no tests, a complement to hand-written unit or functional tests, and a useful stopgap.

Installation

$ pip install python-testing-crawler

Usage

Create a crawler using your framework's existing test client, tell it where to start and what rules to obey, then set it off:

from python_testing_crawler import Crawler
from python_testing_crawler import Rule, Request, Ignore, Allow

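def test_crawl_all():
    client = ...  # placeholder: your framework's existing test client
    # ... any setup, e.g. logging in or creating fixture data ...
    crawler = Crawler(
        client=client,
        initial_paths=['/'],
        rules=[
            # follow every anchor link whose target path begins with "/"
            Rule("a", '/.*', "GET", Request()),
        ]
    )
    crawler.crawl()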

This will crawl all anchor links to relative addresses beginning "/". Any exceptions encountered will be collected and presented at the end of the crawl. For more power see the Rules section below.

If you need to authorise the client's session, e.g. log in, then you should do that before creating the Crawler.

It is also a good idea to create enough data, via fixtures or otherwise, to expose enough endpoints.

How do I set up a test client?

It depends on your framework:

Crawler Options

Param	Description
`initial_paths`	list of paths/URLs to start from
`rules`	list of Rules to control the crawler; see below
`path_attrs`	list of attribute names to extract paths/URLs from; defaults to "href" -- include "src" if you want to check e.g. `<link>`, `<script>` or even `<img>`
`ignore_css_selectors`	any elements matching this list of CSS selectors will be ignored when extracting links
`ignore_form_fields`	list of form input names to ignore when determining the identity/uniqueness of a form. Include CSRF token field names here.
`max_requests`	Crawler will raise an exception if this limit is exceeded
`capture_exceptions`	upon encountering an exception, keep going and fail at the end of the crawl instead of during (default `True`)
`output_summary`	print summary statistics and any captured exceptions and tracebacks at the end of the crawl (default `True`)
`should_process_handlers`	list of "should process" handlers; see Handlers section
`check_response_handlers`	list of "check response" handlers; see Handlers section
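
As a rough sketch of how several of these options might be combined, the snippet below collects them in a plain dict that could be passed to Crawler as keyword arguments. The parameter names come from the table above; every value (the CSS selector, the form field name, the request limit) is an illustrative assumption for a hypothetical application, not a recommended default.

```python
# Keyword arguments for Crawler, using only parameter names from the options table.
# All values are illustrative assumptions for a hypothetical application.
crawler_options = {
    "initial_paths": ["/"],
    # also extract URLs from "src" so that <script> and <img> targets are checked
    "path_attrs": ["href", "src"],
    # hypothetical selector: skip any link inside a logout menu
    "ignore_css_selectors": ["nav .logout"],
    # hypothetical CSRF field name; its value changes per page, so it is
    # excluded when deciding whether two forms are the same form
    "ignore_form_fields": ["csrf_token"],
    "max_requests": 500,
    "capture_exceptions": True,
    "output_summary": True,
}

# Intended use, given an existing test client and a list of rules:
# crawler = Crawler(client=client, rules=rules, **crawler_options)
```

Keeping the options in one place like this makes it easy to share them between several crawl tests that differ only in their rules.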

Rules

The crawler has to be told what URLs to follow, what forms to post and what to ignore, using Rules.

Rules are made of four parameters:

Rule(<source element regex>, <target URL/path regex>, <HTTP method>, <action to take>)

These are matched against every HTML element that the crawler encounters, with the last matching rule winning.

Actions must be one of the following objects:

Request(only=False, params=None) -- follow a link or submit a form
- only=True will retrieve a page/resource but not spider its links.
- the dict params allows you to specify overrides for a form's default values
Ignore() -- do nothing / skip
Allow(status_codes) -- allow any HTTP status code in the supplied list, i.e. do not consider it an error.
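
To make "the last matching rule wins" concrete, here is a minimal sketch of that selection logic. It is not the library's implementation: SketchRule and pick_action are hypothetical names, actions are represented as plain strings, and the use of re.fullmatch for both regexes is an assumption about how matching might be anchored.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SketchRule:
    """Hypothetical stand-in for Rule; not the library's class."""
    source: str  # regex for the source element name, e.g. "a" or "form"
    target: str  # regex for the target URL/path
    method: str  # HTTP method, e.g. "GET"
    action: str  # stand-in for Request(), Ignore() or Allow(...)


def pick_action(rules: Sequence[SketchRule], element: str,
                path: str, method: str) -> Optional[str]:
    """Return the action of the last rule that matches, or None if none match."""
    chosen = None
    for rule in rules:
        if (re.fullmatch(rule.source, element)
                and re.fullmatch(rule.target, path)
                and rule.method == method):
            chosen = rule.action  # a later match overrides an earlier one
    return chosen


rules = [
    SketchRule("a", "/.*", "GET", "request"),     # follow all local links...
    SketchRule("a", "/logout", "GET", "ignore"),  # ...except the logout link
]

print(pick_action(rules, "a", "/about", "GET"))   # request
print(pick_action(rules, "a", "/logout", "GET"))  # ignore
print(pick_action(rules, "form", "/", "POST"))    # None
```

The practical consequence is ordering: put broad rules first and narrower exceptions after them.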

Example Rules

Follow all local/relative links

HYPERLINKS_ONLY_RULE_SET = [
    Rule('a', '/.*', 'GET', Request()),
    Rule('area', '/.*', 'GET', Request()),
]

Request but do not spider all links

REQUEST_ONLY_EXTERNAL_RULE_SET = [
    Rule('a', '.*', 'GET', Request(only=True)),
    Rule('area', '.*', 'GET', Request(only=True)),
]

This is useful for finding broken links. You can also check <link> tags from the <head> if you include the following rule plus set the Crawler's path_attrs to ("HREF", "SRC").

Rule('link', '.*', 'GET', Request())

Submit forms with GET or POST

SUBMIT_GET_FORMS_RULE_SET = [
    Rule('form', '.*', 'GET', Request())
]

SUBMIT_POST_FORMS_RULE_SET = [
    Rule('form', '.*', 'POST', Request())
]

Forms are submitted with their default values, unless overridden using Request(params={...}) for a specific form target. Fields can also be excluded globally using the ignore_form_fields parameter to Crawler (necessary for e.g. CSRF token fields).
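
The two ideas in the paragraph above can be sketched in plain Python. This is an illustration under assumptions, not the library's code: build_submission and form_identity are hypothetical helpers, and treating a form's identity as its action, method and non-ignored field values is a guess at what "identity/uniqueness" means here.

```python
def build_submission(defaults, overrides=None):
    """Start from the form's default values and apply any overrides on top."""
    merged = dict(defaults)
    merged.update(overrides or {})
    return merged


def form_identity(action, method, values, ignore_fields=()):
    """Build a hashable key for a form, leaving out ignored fields."""
    kept = tuple(sorted(
        (name, value) for name, value in values.items()
        if name not in ignore_fields
    ))
    return (action, method.upper(), kept)


# Overrides replace defaults; untouched fields keep their default values.
defaults = {"username": "", "remember": "on", "csrf_token": "abc123"}
submission = build_submission(defaults, {"username": "alice"})
print(submission["username"])  # alice

# The same form rendered twice with different CSRF tokens is one form
# once the token field is ignored, so it would only need submitting once.
first = form_identity("/login", "post", defaults, ["csrf_token"])
second = form_identity("/login", "post",
                       {**defaults, "csrf_token": "zzz999"}, ["csrf_token"])
print(first == second)  # True
```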

Allow some routes to fail

PERMISSIVE_RULE_SET = [
    Rule('.*', '.*', 'GET', Allow([*range(400, 600)])),
    Rule('.*', '.*', 'POST', Allow([*range(400, 600)]))
]

If any HTTP error (400-599) is encountered for any request, allow it; do not error.

Crawl Graph

The crawler builds up a graph of your web application. It can be interrogated via crawler.graph when the crawl is finished.

See [the graph module](python_testing_crawler/graph.py) for the definition of Node objects.

Handlers

Two hook points are provided. These operate on Node objects (see above).

Whether to process a Node

Using should_process_handlers, you can register functions that take a Node and return a bool of whether the Crawler should "process" -- follow a link or submit a form -- or not.
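
A "should process" handler might look like the sketch below. FakeNode is a hypothetical stand-in used only so the example runs on its own; in particular, the path attribute is an assumption, so check the graph module for the attributes a real Node exposes.

```python
from dataclasses import dataclass


@dataclass
class FakeNode:
    """Hypothetical stand-in for the crawler's Node; real attributes may differ."""
    path: str
    method: str = "GET"


def skip_logout(node) -> bool:
    """Return False for paths that would end the authenticated session."""
    return not node.path.startswith("/logout")


print(skip_logout(FakeNode("/posts/1")))  # True
print(skip_logout(FakeNode("/logout")))   # False

# Intended use: Crawler(..., should_process_handlers=[skip_logout])
```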

Whether a response is acceptable

Using check_response_handlers, you can register functions that take a Node and response object (specific to your test client) and return a bool of whether the response should constitute an error.

If your function returns True, the Crawler will throw an exception.
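
As an illustration, the following "check response" handler flags any page whose body contains a Python traceback, even when the status code is 200. FakeResponse is a hypothetical stand-in; the data attribute mirrors what Flask's test client returns, and other test clients name the body differently (Django's uses content), so adapt the attribute to your client.

```python
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Hypothetical stand-in for a test client response."""
    status_code: int
    data: bytes  # assumption: body exposed as bytes under the name "data"


def contains_traceback(node, response) -> bool:
    """Return True (meaning: treat as an error) if a traceback leaked into the page."""
    return b"Traceback (most recent call last)" in response.data


ok_page = FakeResponse(200, b"<h1>Welcome</h1>")
leaky_page = FakeResponse(200, b"<pre>Traceback (most recent call last): ...</pre>")

print(contains_traceback(None, ok_page))     # False
print(contains_traceback(None, leaky_page))  # True

# Intended use: Crawler(..., check_response_handlers=[contains_traceback])
```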

Examples

There are currently Flask and Django examples in [the tests](tests/).

See https://github.com/python-testing-crawler/flaskr for an example of integrating into an existing application, using Flaskr, the Flask tutorial application.
