AWS Data Wrangler v1.7.0 Release Notes
Release Date: 2020-07-30
Breaking changes
- Partitioned Parquet reading now uses a different approach for pushdown filters. For details, check the tutorial.
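A minimal sketch of the new filter style: instead of the old tuple-based syntax, a callable receives each partition's column values (as strings) and returns whether to keep it. The column name and S3 path below are placeholder assumptions, not from the release notes.

```python
def keep_recent(partition: dict) -> bool:
    # Keep only partitions from 2020 onwards.
    # "year" is an assumed partition column; values arrive as strings.
    return partition["year"] >= "2020"

# Assumed usage against a partitioned dataset (path is a placeholder):
# df = wr.s3.read_parquet(
#     path="s3://my-bucket/dataset/",
#     dataset=True,
#     partition_filter=keep_recent,
# )

print(keep_recent({"year": "2021"}))
print(keep_recent({"year": "2019"}))
```

Because the filter is an ordinary Python callable, arbitrary predicates over partition values are possible, which is what makes the new approach more flexible than the previous one.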
New Functionalities
- Global configuration module - TUTORIAL
- Concurrent partitions write - TUTORIAL
- Flexible Partitions Filter (PUSH-DOWN) - TUTORIAL
- Add Athena query metadata to Pandas DataFrames returned by wr.athena.read_sql_*() - TUTORIAL #331
- wr.athena.describe_table() #329
- wr.athena.show_create_table() #334
- Add path_ignore_suffix argument to all read functions #326
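The new global configuration module lets defaults be set once instead of passed to every call. As a hedged sketch: configuration can reportedly also be picked up from environment variables; the `WR_DATABASE` variable name, the `max_cache_seconds` attribute, and the database name below are assumptions for illustration.

```python
import os

# Assumed environment-variable form of the global configuration
# (the WR_ prefix and variable name are assumptions):
os.environ["WR_DATABASE"] = "my_database"

# Assumed attribute form, set once per process:
# import awswrangler as wr
# wr.config.max_cache_seconds = 900  # hypothetical: reuse recent Athena results
# df = wr.athena.read_sql_query("SELECT 1")  # database taken from config

print(os.environ["WR_DATABASE"])
```

Either way, per-call arguments would still override the global defaults, so existing code keeps working unchanged.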
Enhancements
- Support for PyArrow 1.0.0 #337
- Support for Pandas 1.1.0
- Support writing encrypted Redshift COPY manifest to S3 #327
- wr.athena.read_sql_*() now accepts empty results #299
- Allow connect_args to be passed when creating an SQL engine from a Glue connection #309
- Add skip_header_line_count argument to wr.catalog.create_csv_table() #338
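A sketch of the new skip_header_line_count argument in use. Only the argument name comes from the release notes; the bucket, database, table, and column names are placeholders, and the rest of the call shape is an assumption.

```python
# Assumed call shape for registering a CSV table whose files carry a
# header row (all names below are placeholders):
table_kwargs = dict(
    database="my_db",
    table="my_table",
    path="s3://my-bucket/csv/",
    columns_types={"id": "int", "name": "string"},
    skip_header_line_count=1,  # skip the CSV header row when querying
)
# wr.catalog.create_csv_table(**table_kwargs)

print(table_kwargs["skip_header_line_count"])
```

Without this argument, Athena would read the header line of each file as a data row, so exposing it on table creation avoids a common cleanup step.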
Bug Fix
- Add missing type annotations and fix types in docstrings #321
- KeyError: 'StatementType' with Athena using max_cache_seconds #323
- wr.s3.read_csv() slow with chunksize #324
- wr.s3.read_csv() with "chunksize" does not forward pandas_kwargs "encoding" #330
- Ensure DataFrame mutability for wr.athena.read_sql_*() w/ ctas_approach=True #335
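A sketch of the chunked-read pattern affected by #324/#330: pandas keyword arguments such as "encoding" should now be honored alongside chunksize. The S3 path, chunk size, and process() helper below are hypothetical placeholders.

```python
# Assumed usage after the fix: pandas kwargs are forwarded per chunk.
read_kwargs = {
    "chunksize": 100_000,      # yield DataFrames of up to 100k rows
    "encoding": "latin-1",     # previously dropped when chunksize was set (#330)
}
# for chunk_df in wr.s3.read_csv("s3://my-bucket/file.csv", **read_kwargs):
#     process(chunk_df)  # process() is a hypothetical per-chunk handler

print(sorted(read_kwargs))
```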
Docs
- Several small updates.
Thanks
We thank the following contributors/users for their work on this release:
@kylepierce, @davidszotten, @meganburger, @erikcw, @JPFrancoia, @zacharycarter, @DavideBossoli88, @c-line, @anand086, @jasadams, @mrtns, @schot, @koiker, @flaviomax, @bryanyang0528, @igorborgest.
*P.S.* The Lambda Layer's zip file and Glue's wheel/egg are available below. Just upload them and run!
*P.P.S.* AWS Data Wrangler relies on compiled dependencies (C/C++), so Glue PySpark is not supported for now (only Glue Python Shell).