pyspider v0.3.10 Release Notes

Release Date: 2018-04-18
  • New features:

    • Add phantomjs proxy support #692 @volvofixthis
    • Support redis 3.x in cluster mode for message queue @hackty

  • Fix several bugs:

    • Improve the performance of counter.to_dict
    • Fix counter being changed during read
    • Fix tornado version dependency in setup.py

Previous changes from v0.3.9

  • New features:

    • Support for Python 3.6.
    • Auto Pause: the project is paused for scheduler.PAUSE_TIME (default: 5min) when the last scheduler.FAIL_PAUSE_NUM (default: 10) tasks have failed. After scheduler.PAUSE_TIME, scheduler.UNPAUSE_CHECK_NUM (default: 3) tasks are dispatched, and the project resumes if any one of them succeeds.
    • Each callback now has a default 30s process time limit. (Platform support required) @beader
    • New JavaScript render engine: Splash support, enabled by the fetcher argument --splash-endpoint=http://splash:8050/execute
    • Python3 webdav support.
    • Python3 support for "from projects import project".
    • A link to the corresponding task is added to the webui debug page when debugging an existing task in webui.
    • New user_agent parameter in self.crawl; you can still set the user agent via headers, though.
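
The Auto Pause rule above can be read as a small state machine. The following is a minimal sketch of that reading, not pyspider's scheduler code: the class AutoPause and its methods may_dispatch and record are hypothetical names, and only the three constants and their defaults come from the notes. It also assumes that the project is paused again when all check tasks fail, and that every result arriving while paused belongs to a check task; the notes do not state either.

```python
from collections import deque

# Defaults quoted in the release notes.
PAUSE_TIME = 5 * 60        # seconds
FAIL_PAUSE_NUM = 10
UNPAUSE_CHECK_NUM = 3


class AutoPause:
    """Hypothetical model of the Auto Pause rule for one project."""

    def __init__(self):
        # Outcomes of the most recent tasks, True meaning success.
        self.results = deque(maxlen=FAIL_PAUSE_NUM)
        self.paused_at = None   # timestamp of the current pause, if any
        self.checks_sent = 0    # check tasks dispatched after the pause
        self.checks_failed = 0  # check tasks that have failed so far

    def _pause(self, now):
        self.paused_at = now
        self.checks_sent = 0
        self.checks_failed = 0

    def may_dispatch(self, now):
        """Return True if a task may be dispatched at time `now`."""
        if self.paused_at is None:
            return True
        if now - self.paused_at < PAUSE_TIME:
            return False
        # Pause window is over: allow a limited number of check tasks.
        if self.checks_sent < UNPAUSE_CHECK_NUM:
            self.checks_sent += 1
            return True
        return False

    def record(self, ok, now):
        """Record the outcome of a finished task."""
        if self.paused_at is None:
            self.results.append(ok)
            all_failed = not any(self.results)
            if len(self.results) == FAIL_PAUSE_NUM and all_failed:
                self._pause(now)
            return
        if ok:
            # Any successful check task resumes the project.
            self.paused_at = None
            self.results.clear()
        else:
            self.checks_failed += 1
            if self.checks_failed >= UNPAUSE_CHECK_NUM:
                # Assumption: pause again when every check task failed.
                self._pause(now)
```

With the defaults, ten consecutive failures recorded at time 0 block dispatching until time 300, after which at most three check tasks go out.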

  • Fix several bugs:

    • New webui dashboard frontend framework, vue.js, which improves performance when there is a large number of tasks (e.g. http://demo.pyspider.org/)
    • Fix crawl_config not working in webui while debugging a script.
    • Fix CSS Selector Helper not working. @ackalker
    • Fix connection_timeout not working.
    • Fix need_auth option not being applied to webdav.
    • Fix "fix can't dump counter to file: scheduler.all" error.
    • Some other fixes