pyspider v0.3.10 Release Notes

Release Date: 2018-04-18
  • 🆕 New features:

    🛠 Fix several bugs:

    • 👌 Improve the performance of counter.to_dict
    • 🛠 Fixed issue of counter changed during read
    • 🛠 Fix tornado version dependency in
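
A "counter changed during read" problem typically appears when one thread iterates over a counter's internal dict while another thread updates it. The following is a minimal sketch of one common remedy, copying the data under a lock before handing it out. It is not pyspider's actual fix; the class SnapshotCounter and its methods are hypothetical names used only for illustration.

```python
import threading


class SnapshotCounter:
    """Hypothetical counter that returns a consistent snapshot on read."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def incr(self, key, value=1):
        # Writers take the lock, so a reader never sees a half-applied update.
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) + value

    def to_dict(self):
        # Copy while holding the lock; callers iterate over the copy,
        # so later writes cannot change the dict they are reading.
        with self._lock:
            return dict(self._counts)
```

Returning a copy keeps the read path short (one shallow copy) and means the caller can serialize or iterate over the result without holding any lock.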

Previous changes from v0.3.9

  • 🆕 New features:

    • 👌 Support for Python 3.6.
    • ⏱ Auto Pause: the project is paused for scheduler.PAUSE_TIME (default: 5min) when the last scheduler.FAIL_PAUSE_NUM (default: 10) tasks have all failed. After scheduler.PAUSE_TIME, scheduler.UNPAUSE_CHECK_NUM (default: 3) tasks are dispatched as a check. The project resumes if any one of those scheduler.UNPAUSE_CHECK_NUM tasks succeeds.
    • 0️⃣ Each callback now has a default 30s process time limit. (Platform support required) @beader
    • 🆕 New JavaScript render engine - Splash support: enabled by the fetch argument --splash-endpoint=http://splash:8050/execute
    • 👍 Python3 webdav support.
    • 👍 Python3 from projects import project support.
    • A link to the corresponding task is added to the webui debug page when debugging an existing task in webui.
    • 🆕 New user_agent parameter in self.crawl; you can still set the user-agent through headers.
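
The Auto Pause rule described above can be read as a small state machine: running, paused, and a checking phase after the pause expires. The sketch below models that reading under stated assumptions. It is a toy model, not pyspider's scheduler code: the class AutoPauseSketch, its methods, and the explicit `now` argument are hypothetical, and re-pausing when every check task fails is an assumption that the release note does not spell out.

```python
from collections import deque


class AutoPauseSketch:
    """Toy model of the Auto Pause rule (hypothetical, for illustration)."""

    def __init__(self, pause_time=5 * 60, fail_pause_num=10, unpause_check_num=3):
        self.pause_time = pause_time
        self.fail_pause_num = fail_pause_num
        self.unpause_check_num = unpause_check_num
        self.recent = deque(maxlen=fail_pause_num)  # True means success
        self.paused_at = None      # time the current pause began, or None
        self.probes_left = 0       # check tasks that may still be dispatched
        self.probe_failures = 0    # failed check tasks seen so far

    def state(self, now):
        if self.paused_at is None:
            return "running"
        if now - self.paused_at < self.pause_time:
            return "paused"
        return "checking"

    def may_dispatch(self, now):
        """Return True if one more task may be sent out at time `now`."""
        current = self.state(now)
        if current == "running":
            return True
        if current == "checking" and self.probes_left > 0:
            self.probes_left -= 1
            return True
        return False

    def record(self, success, now):
        """Feed one finished task's outcome into the model."""
        current = self.state(now)
        if current == "running":
            self.recent.append(success)
            window_full = len(self.recent) == self.fail_pause_num
            if window_full and not any(self.recent):
                self._pause(now)
        elif current == "checking":
            if success:
                self._resume()
            else:
                self.probe_failures += 1
                # Assumption: if every check task fails, pause again.
                if self.probe_failures >= self.unpause_check_num:
                    self._pause(now)
        # Results arriving while "paused" are ignored in this sketch.

    def _pause(self, now):
        self.paused_at = now
        self.probes_left = self.unpause_check_num
        self.probe_failures = 0
        self.recent.clear()

    def _resume(self):
        self.paused_at = None
        self.probe_failures = 0
        self.recent.clear()
```

With the defaults, ten consecutive failures pause the model for 300 seconds; after that, at most three check tasks are let through, and the first successful one puts it back into the running state.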

    🛠 Fix several bugs:

    • 🆕 New webui dashboard frontend framework - vue.js, which improves performance when there is a large number of tasks (e.g.
    • 🛠 Fix crawl_config not working in webui while debugging a script.
    • 🛠 Fix CSS Selector Helper not working. @ackalker
    • 🛠 Fix connection_timeout not working.
    • 🛠 Fix need_auth option not being applied to webdav.
    • 🛠 Fix "fix can't dump counter to file: scheduler.all" error.
    • 🛠 Some other fixes