AWS Data Wrangler v1.7.0 Release Notes
Release Date: 2020-07-30
Breaking changes
- Partitioned Parquet reading now uses a different approach for pushdown filters. For details, check the tutorial.
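A minimal sketch of the new filter style: instead of the old tuple-based syntax, a callable receives each partition's column values (as strings) and returns whether to keep it. The column name and S3 path below are placeholder assumptions, not from the release notes.

```python
def keep_recent(partition: dict) -> bool:
    # Keep only partitions from 2020 onwards.
    # "year" is an assumed partition column; values arrive as strings.
    return partition["year"] >= "2020"

# Assumed usage against a partitioned dataset (path is a placeholder):
# df = wr.s3.read_parquet(
#     path="s3://my-bucket/dataset/",
#     dataset=True,
#     partition_filter=keep_recent,
# )

print(keep_recent({"year": "2021"}))
print(keep_recent({"year": "2019"}))
```

Because the filter is an ordinary Python callable, arbitrary predicates over partition values are possible, which is what makes the new approach more flexible than the previous one.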
New Functionalities
- Global configuration module - TUTORIAL
- Concurrent partitions write - TUTORIAL
- Flexible Partitions Filter (PUSH-DOWN) - TUTORIAL
- Add Athena query metadata to Pandas DataFrames returned by wr.athena.read_sql_*() - TUTORIAL #331
- wr.athena.describe_table() #329
- wr.athena.show_create_table() #334
- Add path_ignore_suffix argument to all read functions #326
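The new global configuration module lets defaults be set once instead of passed to every call. As a hedged sketch: configuration can reportedly also be picked up from environment variables; the `WR_DATABASE` variable name, the `max_cache_seconds` attribute, and the database name below are assumptions for illustration.

```python
import os

# Assumed environment-variable form of the global configuration
# (the WR_ prefix and variable name are assumptions):
os.environ["WR_DATABASE"] = "my_database"

# Assumed attribute form, set once per process:
# import awswrangler as wr
# wr.config.max_cache_seconds = 900  # hypothetical: reuse recent Athena results
# df = wr.athena.read_sql_query("SELECT 1")  # database taken from config

print(os.environ["WR_DATABASE"])
```

Either way, per-call arguments would still override the global defaults, so existing code keeps working unchanged.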
Enhancements
- Support for PyArrow 1.0.0 #337
- Support for Pandas 1.1.0
- Support writing encrypted Redshift COPY manifest to S3 #327
- wr.athena.read_sql_*() now accepts empty results #299
- Allow connect_args to be passed when creating an SQL engine from a Glue connection #309
- Add skip_header_line_count argument to wr.catalog.create_csv_table() #338
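A sketch of the new skip_header_line_count argument in use. Only the argument name comes from the release notes; the bucket, database, table, and column names are placeholders, and the rest of the call shape is an assumption.

```python
# Assumed call shape for registering a CSV table whose files carry a
# header row (all names below are placeholders):
table_kwargs = dict(
    database="my_db",
    table="my_table",
    path="s3://my-bucket/csv/",
    columns_types={"id": "int", "name": "string"},
    skip_header_line_count=1,  # skip the CSV header row when querying
)
# wr.catalog.create_csv_table(**table_kwargs)

print(table_kwargs["skip_header_line_count"])
```

Without this argument, Athena would read the header line of each file as a data row, so exposing it on table creation avoids a common cleanup step.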
Bug Fix
- Add missing type annotations and fix types in docstrings #321
- KeyError: 'StatementType' with Athena using max_cache_seconds #323
- wr.s3.read_csv() slow with chunksize #324
- wr.s3.read_csv() with "chunksize" does not forward pandas_kwargs "encoding" #330
- Ensure DataFrame mutability for wr.athena.read_sql_*() w/ ctas_approach=True #335
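A sketch of the chunked-read pattern affected by #324/#330: pandas keyword arguments such as "encoding" should now be honored alongside chunksize. The S3 path, chunk size, and process() helper below are hypothetical placeholders.

```python
# Assumed usage after the fix: pandas kwargs are forwarded per chunk.
read_kwargs = {
    "chunksize": 100_000,      # yield DataFrames of up to 100k rows
    "encoding": "latin-1",     # previously dropped when chunksize was set (#330)
}
# for chunk_df in wr.s3.read_csv("s3://my-bucket/file.csv", **read_kwargs):
#     process(chunk_df)  # process() is a hypothetical per-chunk handler

print(sorted(read_kwargs))
```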
Docs
- Several small updates.
Thanks
We thank the following contributors/users for their work on this release:
@kylepierce, @davidszotten, @meganburger, @erikcw, @JPFrancoia, @zacharycarter, @DavideBossoli88, @c-line, @anand086, @jasadams, @mrtns, @schot, @koiker, @flaviomax, @bryanyang0528, @igorborgest.
*P.S.* The Lambda Layer's zip file and Glue's wheel/egg are available below. Just upload them and run!
*P.P.S.* AWS Data Wrangler relies on compiled dependencies (C/C++), so Glue PySpark is not supported for now (only Glue Python Shell).