Changelog History

  • v0.6.41 Changes

    June 24, 2018

    Changed

    • Restrict pycurl version to <7.43.0.1 (see #354)
  • v0.6.40 Changes

    May 13, 2018

    Fixed

    • Fix #346: spider does not process initial_urls
    • Fix #344: raise GrabInvalidUrl for pycurl error 3

  • v0.6.39 Changes

    May 09, 2018

    Fixed

    • Fix bug: task generator works incorrectly
    • Fix bug: PyPI package is missing the HTTP API HTML file
    • Fix bug: dictionary changed size during iteration in stat logging
    • Fix bug: multiple errors in urllib3 transport and threaded network service
    • Fix short names of errors in stat logging
    • Improve error handling in urllib3 transport
    • Fix #299: multi-added errors
    • Fix #285: pyquery extension parses HTML incorrectly
    • Fix #267: normalize handling of the "too many redirects" error
    • Fix #268: fix processing of UTF cookies
    • Fix #241: form_fields() fails on some HTML forms
    • Fix normalize_unicode issue in debug post method
    • Fix #323: urllib3 transport fails with UnicodeError on some invalid URLs
    • Fix #31: support for multivalue form inputs
    • Fix #328, fix #67: remove hard link between document and grab
    • Fix #284: option headers affects content of common_headers
    • Fix #293: processing non-Latin chars in Location header
    • Fix #324: refactor response header processing

    Changed

    • Refactor Spider into a set of async services
    • Add certifi dependency into grab[full] setup target
    • Fix #315: use psycopg2-binary package for postgres cache
    • Related to #206: do not use connection_reuse=False for proxy connections in spider

    Removed

    • Remove cache timeout option
    • Remove structured extension
  • v0.6.38 Changes

    May 17, 2017

    Fixed

    • Fix "error:None" in spider rps logging
    • Fix race condition bug in task generator

    Added

    • Add original_exc attribute to GrabNetworkError (and subclasses) that points to the original exception

    Changed

    • Remove IOError from the ancestors of GrabNetworkError
    • Add default values to --spider-transport and --grab-transport options of crawl script
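
The two exception changes in this release (the new original_exc attribute, and IOError no longer being an ancestor of GrabNetworkError) can be illustrated with a small stand-alone sketch. The classes below are local stand-ins defined for the example and are not imported from Grab. The constructor signature, the GrabTimeoutError subclass shown here, and the fetch helper are assumptions made for illustration only.

```python
import socket


class GrabNetworkError(Exception):
    """Local stand-in, NOT Grab's class. Note: derives from Exception, not IOError."""

    def __init__(self, message, original_exc=None):
        # Assumed constructor: the changelog only says the attribute exists.
        super().__init__(message)
        self.original_exc = original_exc


class GrabTimeoutError(GrabNetworkError):
    """Hypothetical subclass; it inherits original_exc from its parent."""


def fetch(url):
    """Hypothetical transport call that always fails with a low-level timeout."""
    try:
        raise socket.timeout("timed out")
    except socket.timeout as exc:
        # Wrap the low-level error but keep a pointer to it.
        raise GrabTimeoutError("Timeout while fetching %s" % url, exc)


try:
    fetch("http://example.com/")
except GrabNetworkError as err:
    caught = err

# The wrapped low-level exception is still reachable for inspection.
print(type(caught.original_exc).__name__)
# Code that used "except IOError" would no longer catch this error.
print(issubclass(GrabNetworkError, IOError))  # False
```

One practical consequence, under these assumptions: handlers written as `except IOError` need to be changed to `except GrabNetworkError` to keep catching network failures.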
  • v0.6.37 Changes

    May 13, 2017

    Added

    • Add --spider-transport and --grab-transport options to crawl script
    • Add SOCKS5 proxy support in urllib3 transport

    Fixed

    • Fix #237: urllib3 transport fails without pycurl installed
    • Fix bug: incorrect spider request logging when cache is enabled
    • Fix bug: crawl script fails while trying to process a lock key
    • Fix bug: urllib3 transport fails while trying to throw GrabConnectionError exception
    • Fix bug: Spider add_task method fails while trying to log invalid URL error

    Removed

    • Remove obsolete hammer_mode and hammer_timeout config options
  • v0.6.36 Changes

    February 12, 2017

    Added

    • Add pylint to default test set

    Fixed

    • Fix #229: using deprecated response object inside Grab

    Removed

    • Remove spider project template and start_project script
  • v0.6.35 Changes

    February 06, 2017

    Fixed

    • Fix bug in deprecated grab.choose_form method
    • Add default project template files to the distribution, by @rushter
    • Fix #222: debug_post option fails with big post data
    • Fix #148: pycurl ignores SIGINT signal
  • v0.6.34 Changes

    February 04, 2017

    Added

    • Start running Grab tests in OS X environment on Travis CI

    Changed

    • Use defusedxml library to parse HTML and XML, by @kevinlondon
    • Put selection, lxml and pycurl libs back to required dependencies in setup.py
    • Update installation documentation
  • v0.6.33 Changes

    January 28, 2017

    Added

    • Add API documentation for a few grab modules, by @rushter
    • Start running Grab tests in Windows environment on AppVeyor CI
    • New spider transport based on threads that allows using Spider with any Grab network backend, e.g. urllib3
    • Add remove_from_post option to grab.doc.submit method
    • Add random option to grab.change_proxy method
    • Support for deprecated attributes Spider.items and Spider.counters
    • If a Spider handler raises a ResponseNotValid exception, the task goes back to the task queue until task.task_try_count reaches spider.task_try_limit

    Changed

    • Refactor management of internal threads, fix random test failures related to cache sub-module
    • Disable default logging to files when a spider is run with the crawl command
    • Multiple improvements in urllib3 transport
    • Set default spider network and try limits to 3 (was 10)

    Fixed

    • Various bugs in urllib3 transport
    • Various other bugs

    Removed

    • Remove grab.use_next_proxy method
    • Remove grab.dump method
    • Remove deprecated Spider methods and attributes
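
The ResponseNotValid retry rule can be sketched in plain Python without importing Grab. Everything below is a simplified stand-in: the Task class, the run_queue function, and the handlers are hypothetical names, and only ResponseNotValid, task_try_count and task_try_limit come from the changelog entry. The sketch assumes the counter is incremented on each dispatch and that a task is dropped once the counter reaches the limit. The limit of 3 is borrowed from this release's note on default try limits, and whether that default applies to task_try_limit specifically is also an assumption.

```python
from collections import deque


class ResponseNotValid(Exception):
    """Local stand-in for the exception named in the changelog."""


class Task:
    """Hypothetical minimal task: a URL plus a try counter."""

    def __init__(self, url):
        self.url = url
        self.task_try_count = 0


def run_queue(tasks, handler, task_try_limit=3):
    """Run handler on each task, requeueing tasks whose response is rejected."""
    queue = deque(tasks)
    done, dropped = [], []
    while queue:
        task = queue.popleft()
        task.task_try_count += 1  # assumed: counted once per dispatch
        try:
            handler(task)
        except ResponseNotValid:
            if task.task_try_count < task_try_limit:
                queue.append(task)  # back to the queue for another attempt
            else:
                dropped.append(task)  # limit reached, give up on this task
        else:
            done.append(task)
    return done, dropped


def flaky_handler(task):
    # Hypothetical handler that rejects the first two attempts of any task.
    if task.task_try_count < 3:
        raise ResponseNotValid(task.url)


def always_bad_handler(task):
    # Hypothetical handler that never accepts the response.
    raise ResponseNotValid(task.url)


done, dropped = run_queue([Task("http://example.com/a")], flaky_handler)
print(len(done), len(dropped), done[0].task_try_count)  # 1 0 3

done2, dropped2 = run_queue([Task("http://example.com/b")], always_bad_handler)
print(len(done2), len(dropped2), dropped2[0].task_try_count)  # 0 1 3
```

Requeueing at the back of the queue, rather than retrying immediately, is a design choice of this sketch; the changelog does not say where in the queue the task is placed.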
  • v0.6.32 Changes

    December 31, 2016

    Fixed

    • Fix setup.py