trafilatura v0.7.0 Release Notes

    • 🔧 customizable configuration file to parametrize extraction and downloads
    • 👍 better handling of feeds and sitemaps
    • ➕ additional CLI options: cryptographic hash for file name, use Internet Archive as backup
    • more precise extraction
    • faster downloads: requests replaced with bare urllib3 and custom decoding
    • 🛠 consolidation: bug fixes and improvements, many thanks to the issue reporters!
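
To illustrate the idea behind the hash-based file names mentioned above, here is a minimal sketch using only the Python standard library. It is not trafilatura's actual implementation: the function name `hash_filename`, the choice of SHA-256, the truncation length, and the `.txt` extension are all assumptions made for this example.

```python
import hashlib


def hash_filename(content: str, extension: str = ".txt", length: int = 16) -> str:
    """Derive a stable file name from the content itself (hypothetical helper).

    Assumption: SHA-256 truncated to `length` hex characters; the release
    notes do not say which hash or encoding the CLI option uses.
    """
    # Hash the UTF-8 bytes so the same text always maps to the same name.
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return digest[:length] + extension


if __name__ == "__main__":
    # Identical content yields identical names; different content differs.
    print(hash_filename("Some extracted text"))
    print(hash_filename("Some other extracted text"))
```

Naming output files after a content hash means that re-processing the same document produces the same name, which makes duplicates easy to spot and avoids relying on URLs or titles that may contain characters unsuitable for file systems.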