AWS Data Wrangler v1.7.0 Release Notes

Release Date: 2020-07-30
💥 Breaking Changes

  • Partitioned Parquet reading now uses a different approach for pushdown filters (see the sketch below). For details, check the tutorial.
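
A minimal sketch of the new pushdown-filter style, assuming the callable-based partition_filter parameter described in the tutorial; the bucket, path, and partition names below are hypothetical:

```python
import awswrangler as wr

# The filter callable receives each partition's values as a dict of strings
# and returns True for the partitions that should be read.
df = wr.s3.read_parquet(
    path="s3://my-bucket/my-dataset/",   # hypothetical dataset location
    dataset=True,                        # required so partitions are discovered
    partition_filter=lambda x: x["year"] == "2020" and x["month"] in ("01", "02"),
)
```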

🆕 New Functionalities

✨ Enhancements

  • 👌 Support for PyArrow 1.0.0 #337
  • 👌 Support for Pandas 1.1.0
  • 👌 Support writing an encrypted Redshift COPY manifest to S3 #327
  • wr.athena.read_sql_*() now accepts empty results #299
  • 👍 Allow connect_args to be passed when creating an SQL engine from a Glue connection #309
  • Add skip_header_line_count argument to wr.catalog.create_csv_table() #338 (see the sketch after this list)
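
A minimal sketch of the new skip_header_line_count argument; the database, table, path, and column names are hypothetical, and the surrounding arguments assume the usual create_csv_table signature:

```python
import awswrangler as wr

# Register a CSV table in the Glue Catalog whose files start with one header
# row; skip_header_line_count tells the table definition to skip that row.
wr.catalog.create_csv_table(
    database="my_database",              # hypothetical Glue database
    table="my_table",
    path="s3://my-bucket/csv/",
    columns_types={"id": "bigint", "name": "string"},
    skip_header_line_count=1,
)
```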

    ๐Ÿ› Bug Fix

    • โž• Add missing type annotations and fix types in docstrings. #321
    • KeyError: 'StatementType' with Athena using max_cache_seconds #323
    • wr.s3.read_csv() slow with chunksize #324
    • wr.s3.read_csv() with "chunksize" does not forward pandas_kwargs "encoding" #330
    • Ensure DataFrame mutability for wr.athane.read_sql_*() w/ ctas_approach=True #335
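
A minimal sketch of the chunked CSV read covered by #324/#330, assuming pandas keyword arguments such as encoding are forwarded to every chunk; the path below is hypothetical:

```python
import awswrangler as wr

# With chunksize set, read_csv returns an iterator of DataFrames instead of a
# single frame; the encoding keyword is passed through to pandas per chunk.
chunks = wr.s3.read_csv(
    path="s3://my-bucket/csv/big-file.csv",  # hypothetical object
    chunksize=100_000,
    encoding="latin-1",
)
for df in chunks:
    print(df.shape)
```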

📄 Docs

  • ⚡️ Several small updates.

Thanks

🚀 We thank the following contributors/users for their work on this release:

@kylepierce, @davidszotten, @meganburger, @erikcw, @JPFrancoia, @zacharycarter, @DavideBossoli88, @c-line, @anand086, @jasadams, @mrtns, @schot, @koiker, @flaviomax, @bryanyang0528, @igorborgest.


_P.S._ The Lambda Layer's zip file and Glue's wheel/egg are available below. Just upload them and run!

_P.P.S._ AWS Data Wrangler relies on compiled dependencies (C/C++), so there is no support for Glue PySpark for now (only Glue Python Shell).