AWS Data Wrangler/CHANGELOG and AWS Data Wrangler Releases

Latest version: 2.0.0

Average release cycle: 9 days

Latest release: 1236 days ago

v2.0.0 Changes
December 07, 2020
💥 Breaking changes
- sqlalchemy and psycopg2 dependencies replaced by redshift_connector and pg8000
- All wr.db.* functions were distributed into wr.redshift.*, wr.postgresql.* and wr.mysql.* (Tutorial)
- 🔨 Redshift COPY and UNLOAD functions were refactored into wr.redshift.* (Tutorial)
- ✅ wr.catalog.get_engine() was replaced by wr.redshift.connect(), wr.postgresql.connect(), wr.mysql.connect() (Tutorial)
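
The breaking changes above mean that code built around wr.catalog.get_engine() and wr.db.* needs to open a connection through the engine-specific module instead. Below is a minimal sketch of what a migrated Redshift read might look like, assuming awswrangler 2.0.0 or later and an existing Glue Catalog connection. The helper name read_from_redshift and the optional wr parameter (which lets a stand-in module be injected for testing) are hypothetical and not part of the library.

```python
from typing import Any, Optional


def read_from_redshift(sql: str, glue_connection: str, wr: Optional[Any] = None) -> Any:
    """Run a query on Redshift using the 2.0.0-style API.

    `glue_connection` is the name of a Glue Catalog connection (assumed to exist).
    `wr` is a hypothetical injection point; by default awswrangler is imported lazily.
    """
    if wr is None:
        import awswrangler as wr  # requires awswrangler >= 2.0.0

    # Replaces the pre-2.0.0 pattern of wr.catalog.get_engine(...) + wr.db.read_sql_query(...)
    con = wr.redshift.connect(connection=glue_connection)
    try:
        return wr.redshift.read_sql_query(sql=sql, con=con)
    finally:
        # The connection is a plain DB-API style object, so close it explicitly.
        con.close()
```

The same shape should apply to wr.postgresql.connect() and wr.mysql.connect(); check the linked tutorials for the exact arguments each one accepts.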
🆕 New Functionalities
- ✅ Amazon Timestream support (Tutorial)
✨ Enhancements
- 🐎 Improved general S3 I/O performance by removing eventual consistency guardrails (Reference)
- ➕ Add retry with decorrelated jitter for Athena and Glue Catalog calls to overcome throttling in high concurrency scenarios.
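
For readers unfamiliar with the term, "decorrelated jitter" is a backoff strategy in which each wait time is drawn at random between a base delay and three times the previous delay, capped at a maximum. The sketch below illustrates the general technique only; it is not the library's internal implementation, and the names decorrelated_jitter and call_with_retry, the default values, and the broad `except Exception` are all assumptions made for illustration.

```python
import random
import time
from typing import Callable, Iterator, Optional, TypeVar

T = TypeVar("T")


def decorrelated_jitter(
    base: float = 0.5, cap: float = 10.0, rng: Optional[random.Random] = None
) -> Iterator[float]:
    """Yield an endless series of delays: min(cap, uniform(base, previous * 3))."""
    rng = rng or random.Random()
    delay = base
    while True:
        delay = min(cap, rng.uniform(base, delay * 3))
        yield delay


def call_with_retry(
    func: Callable[[], T],
    max_attempts: int = 5,
    base: float = 0.5,
    cap: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> T:
    """Call `func`, sleeping with decorrelated jitter between failed attempts."""
    delays = decorrelated_jitter(base, cap, rng)
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:  # illustrative; real code would catch only throttling errors
            if attempt == max_attempts:
                raise
            sleep(next(delays))
    raise RuntimeError("max_attempts must be at least 1")
```

Because the delays are randomized rather than strictly exponential, concurrent clients that were throttled at the same moment tend to spread their retries out instead of colliding again.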
📄 Docs
- ⚡️ Updates regarding all new functionalities
- ➕ Add Amazon Timestream tutorial
- ➕ Add Amazon Timestream tutorial 2
AWS re:Invent related news
Thanks

🚀 We thank the following contributors/users for their work on this release:

@Brooke-white, @danielwo, @sapientderek, @pmleveque, @igorborgest.

P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload them and run!
v1.10.1 Changes
November 26, 2020
🆕 New Functionalities
- catalog.add_column() #451
- catalog.delete_column() #451
✨ Enhancements
- Deterministic result for s3.read_parquet_metadata() #449
- 📦 ~30% faster package import time #460
🐛 Bug Fix
- 🛠 Fix Athena read with ctas_approach=False and chunksize=True #458
- 🛠 Fix overwriting of non-enforced configs #450
📄 Docs
- 🛠 Small fixes #462 #458 #446
Thanks

🚀 We thank the following contributors/users for their work on this release:

@tuannguyen0901, @bryanyang0528, @czagoni, @jesusch, @danielwo, @DonghanYang, @eric-valente, @igorborgest.

v1.10.0 Changes
October 31, 2020
🆕 New Functionalities
- ➕ Add configurable Endpoint URL for AWS services #418
- ➕ Add global environment configuration for Athena workgroups #437
✨ Enhancements
- 👌 Support for Apache Arrow 2.0.0 #436
- Allow Decimal to float casting for wr.db.read_sql_query() #431
- Allow unsafe conversions for wr.db.read_sql_query() #427
🐛 Bug Fix
- QuickSight functions now allow usernames with "/" #434
- 🛠 Fix duplicated carriage return for wr.s3.to_csv() running on Windows platform.
Thanks

🚀 We thank the following contributors/users for their work on this release:

@martinSpears-ECS, @imanebosch, @Eric-He-98, @brombach, @Thomas-Hirsch, @vuchetichbalint, @igorborgest.

v1.9.6 Changes
October 10, 2020
✨ Enhancements
- ➕ Add encrypted glue connection management #413
🐛 Bug Fix
- Double carriage return when using \r\n as line terminator (s3.to_csv()) #415
- s3.read_parquet failing with some timezone aware columns #417
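
The double carriage return issue is a common class of newline bug: text that already ends in \r\n is written through a layer that translates every \n into \r\n again, producing \r\r\n. The sketch below reproduces that effect with the standard library only. It is an illustration of the general mechanism under that assumption, not a reconstruction of the actual cause inside s3.to_csv(), and the helper name encode_with_newline is hypothetical.

```python
import io


def encode_with_newline(text: str, newline: str) -> bytes:
    """Write `text` through a text layer using the given newline mode; return the bytes."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # keep `raw` from being closed together with the wrapper
    return data


line = "a,b\r\n"  # a CSV row that already uses \r\n as its line terminator

# newline="\r\n" translates each "\n" on write, so the existing "\r" is doubled.
doubled = encode_with_newline(line, newline="\r\n")

# newline="" disables translation, so the row is written exactly as given.
intact = encode_with_newline(line, newline="")

print(doubled)
print(intact)
```

Opening files with newline="" when the writer already emits its own line terminators is the usual way to avoid this, which is also what the Python csv module documentation recommends.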
Thanks

🚀 We thank the following contributors/users for their work on this release:

@jeanbaptistepriez, @mike-at-upside, @Thiago-Dantas, @igorborgest.

v1.9.5 Changes
September 26, 2020
✨ Enhancements
- General exception handling improvements #409
- General error message improvements #409
🐛 Bug Fix
- [Parquet Read] Fix index recovery combined with columns filter #408
📄 Docs
- Handling and documenting ctas_approach for custom data sources #392
Thanks

🚀 We thank the following contributors/users for their work on this release:

@tasq-inc, @igorborgest.

v1.9.4 Changes
September 19, 2020
✨ Enhancements
- ➕ Add s3_additional_kwargs for wr.s3.copy_objects() and wr.s3.merge_datasets() #388
- ➕ Add data_source argument for Athena queries #392
- Handling parquet tinyint columns on Redshift loads #400
🐛 Bug Fix
- 🛠 Fix issue with Hive partitions compatibility. #397
- Fix missing catalog_id arguments in partitioned wr.s3.to_parquet() calls #399
- ✂ Remove adaptive retry for boto3 resource. #403
📄 Docs
- ⚡️ A few updates.
Thanks

🚀 We thank the following contributors/users for their work on this release:

@timgates42, @bvsubhash, @DonghanYang, @sl-antoinelaborde, @Xiangyu-C, @tuannguyen0901, @JPFrancoia, @sapientderek, @igorborgest.

v1.9.3 Changes
September 08, 2020
🐛 Bug Fix
- 🛠 Fix bug for wr.s3.read_parquet() with timezone offset. #385
Thanks

🚀 We thank the following contributors/users for their work on this release:

@chrisrana, @igorborgest.

v1.9.2 Changes
September 07, 2020
🐛 Bug Fix
- 🛠 Fix issues in reading Parquet files with timestamp (timezone aware) columns. #382 #383
Thanks

🚀 We thank the following contributors/users for their work on this release:

@tasq-inc, @chrisrana, @igorborgest.

v1.9.1 Changes
September 05, 2020
✨ Enhancements
- Significant Amazon S3 I/O speed up for big files #377
- Create Parquet Datasets with columns with CamelCase names #380
🐛 Bug Fix
- Read Parquet error for some files created by DMS #376
📄 Docs
- ⚡️ A few updates.
Thanks

🚀 We thank the following contributors/users for their work on this release:

@jarretg, @chrisrana, @vikramshitole, @igorborgest.

v1.9.0 Changes
September 01, 2020
💥 Breaking changes
- Global configuration s3fs_block_size was replaced by s3_block_size #370
🆕 New Functionalities
- Automatic recovery of Pandas indexes from Parquet files. #366
- Automatic recovery of Pandas time zones from Parquet files. #366
- Optional schema evolution disabling through the new schema_evolution argument. #353
✨ Enhancements
- s3fs dependency was replaced by builtin code. #370
- 🚤 Significant Amazon S3 I/O speed up for high latency environments (e.g. local, on-premises). #370
🐛 Bug Fix
- 👌 Improve NaN handling. #362
- Sanitise table name for partition insertion #360
📄 Docs
- ⚡️ A few updates.
Thanks

🚀 We thank the following contributors/users for their work on this release:

@isrsal, @bppont, @weishao-aws, @alexifm, @Digma, @samcon, @TerrellV, @msantino, @alvaropc, @luigift, @igorborgest.


AWS Data Wrangler changelog

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
